yassouali

Professional software vendor delivering innovative solutions on the Softono platform. Specialized in both open-source and proprietary software development.

Software by yassouali

Open Source

awesome-semi-supervised-learning

# Awesome Semi-Supervised Learning

[![Awesome](https://awesome.re/badge.svg)](https://awesome.re) [![MIT License](https://img.shields.io/badge/license-MIT-green.svg)](https://opensource.org/licenses/MIT) [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com) [![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://github.com/yassouali/awesome-semi-supervised-learning/graphs/commit-activity)

<img width="300" src="https://i.imgur.com/Ky2jxnj.png" alt="Awesome!">

A curated list of awesome Semi-Supervised Learning resources. Inspired by [awesome-deep-vision](https://github.com/kjw0612/awesome-deep-vision), [awesome-deep-learning-papers](https://github.com/terryum/awesome-deep-learning-papers), and [awesome-self-supervised-learning](https://github.com/jason718/awesome-self-supervised-learning).

## Background

# [<img src="https://i.imgur.com/xXi9N40.png">](https://github.com/yassouali/awesome-semi-supervised-learning/)

#### What is Semi-Supervised Learning?

It is a special form of classification. Traditional classifiers use only labeled data (feature / label pairs) to train. Labeled instances, however, are often difficult, expensive, or time-consuming to obtain, as they require the efforts of experienced human annotators. Meanwhile, unlabeled data may be relatively easy to collect, but there have been few ways to use it. **Semi-supervised learning** addresses this problem by using large amounts of unlabeled data, together with the labeled data, to build better classifiers. Because semi-supervised learning requires less human effort and can give higher accuracy, it is of great interest both in theory and in practice.

#### How many semi-supervised learning methods are there?

Many. Some often-used methods include: EM with generative mixture models, self-training, consistency regularization, co-training, transductive support vector machines, and graph-based methods. With the advent of deep learning, the majority of these methods were adapted and integrated into existing deep learning frameworks to take advantage of unlabeled data.

#### How do semi-supervised learning methods use unlabeled data?

Semi-supervised learning methods use unlabeled data to either modify or reprioritize hypotheses obtained from labeled data alone. Although not all methods are probabilistic, it is easier to look at methods that represent hypotheses by *p(y|x)*, and unlabeled data by *p(x)*. Generative models have common parameters for the joint distribution *p(x,y)*. It is easy to see that *p(x)* influences *p(y|x)*. Mixture models with EM are in this category, and to some extent so is self-training. Many other methods are discriminative, including transductive SVM, Gaussian processes, information regularization, graph-based methods, and the majority of deep-learning-based methods. Original discriminative training cannot be used for semi-supervised learning, since *p(y|x)* is estimated ignoring *p(x)*. To solve the problem, *p(x)*-dependent terms are often brought into the objective function, which amounts to assuming that *p(y|x)* and *p(x)* share parameters (source: [SSL Literature Survey](http://pages.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf)).

<figure>
  <img src="https://i.imgur.com/PJ340SK.png" width="600">
  <figcaption>An example of the influence of unlabeled data in semi-supervised learning. (Image source: <a href="https://en.wikipedia.org/wiki/Semi-supervised_learning">Wikipedia</a>)</figcaption>
</figure>

## Contributing

If you find any errors, or you wish to add some papers, please feel free to contribute to this list by contacting [me](https://yassouali.github.io/) or by creating a [pull request](https://github.com/yassouali/awesome-semi-supervised-learning/pulls) using the following Markdown format:

```markdown
- Paper Name. [[pdf]](link) [[code]](link)
  - Author 1, Author 2, and Author 3. *Conference Year*
```

and adding them to the corresponding markdown file in `files/`.

## Books

- [Semi-Supervised Learning Book](http://www.acad.bg/ebook/ml/MITPress-%20SemiSupervised%20Learning.pdf). Olivier Chapelle, Bernhard Schölkopf, Alexander Zien. *IEEE Transactions on Neural Networks 2009*

## Codebase

- [Unified SSL Benchmark: A Unified Semi-supervised learning Benchmark for CV, NLP, and Audio](https://github.com/microsoft/Semi-supervised-learning).
- [TorchSSL: A PyTorch-based Toolbox for Semi-Supervised Learning](https://github.com/TorchSSL/TorchSSL).

## Surveys & Overview

- [Realistic Evaluation of Deep Semi-Supervised Learning Algorithms](https://arxiv.org/abs/1804.09170). Avital Oliver, Augustus Odena, Colin Raffel, Ekin D. Cubuk, Ian J. Goodfellow. *NeurIPS 2018*
- [Semi-Supervised Learning Literature Survey](http://pages.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf). Xiaojin Zhu. *2008*
- [An Overview of Deep Semi-Supervised Learning](https://arxiv.org/abs/2006.05278). Yassine Ouali, Céline Hudelot, Myriam Tami. *2020*
- [A survey on semi-supervised learning](https://link.springer.com/content/pdf/10.1007/s10994-019-05855-6.pdf). Jesper E Van Engelen, Holger H Hoos. *2020*
- [A Survey on Deep Semi-Supervised Learning](https://arxiv.org/pdf/2103.00550.pdf). Xiangli Yang, Zixing Song, Irwin King. *2021*

## Computer Vision

- Image Classification: [list of papers here](files/img_classification.md)
- Semantic and Instance Segmentation: [list of papers here](files/img_segmentation.md)
- Object Detection: [list of papers here](files/obj_detection.md)
- Other tasks: [list of papers here](files/cv_other_tasks.md)

Note that for image and object segmentation tasks, we also include weakly-supervised learning methods that use weak labels (e.g., image classes) for detection and segmentation.

## NLP

#### [List of papers here](files/nlp.md)

## Generative Models & Tasks

#### [List of papers here](files/generative_models.md)

## Graph Based SSL

#### [List of papers here](files/graph_ssl.md)

## Theory

#### [List of papers here](files/theory.md)

## Reinforcement Learning, Meta-Learning & Robotics

#### [List of papers here](files/reinforcement_learning.md)

## Regression

#### [List of papers here](files/regression.md)

## Other

#### [List of papers here](files/other_papers.md)

## Talks

- [Semi-Supervised Learning and Unsupervised Distribution Alignment](https://www.youtube.com/watch?v=PXOhi6m09bA). *CS294-158-SP20 UC Berkeley.*
- [Semi-supervised learning with GANs](https://www.youtube.com/watch?v=j_-JaMPnhr0). *Pydata, Andreas Merentitis, Carmine Paolino, Vaibhav Singh.*
- [Overview of Unsupervised & Semi-supervised learning](https://www.youtube.com/watch?v=tnpXLK_AS_U). *AISC, Shazia Akbar.*
- [Semi-Supervised Learning](https://www.youtube.com/watch?v=OMRlnKupsXM), [[slides]](https://www.cs.cmu.edu/%7Etom/10701_sp11/slides/LabUnlab-3-17-2011.pdf). *CMU Machine Learning 10-701, Tom M. Mitchell.*

## Thesis

- [Fundamental limitations of semi-supervised learning](https://uwspace.uwaterloo.ca/bitstream/handle/10012/4387/lumastersthesis_electronic.pdf). *Tyler Tian Lu*.
- [Semi-Supervised Learning with Graphs](http://pages.cs.wisc.edu/~jerryzhu/pub/thesis.pdf). *Xiaojin Zhu*.
- [Semi-Supervised Learning for Natural Language](https://www-cs.stanford.edu/~pliang/papers/meng-thesis.pdf). *Percy Liang*.

## Blogs

- [Learning with not Enough Data Part 1: Semi-Supervised Learning](https://lilianweng.github.io/posts/2021-12-05-semi-supervised/). *Lilian Weng*.
- [An overview of proxy-label approaches for semi-supervised learning](https://ruder.io/semi-supervised/index.html). *Sebastian Ruder*.
- [The Illustrated FixMatch for Semi-Supervised Learning](https://amitness.com/2020/03/fixmatch-semi-supervised/). *Amit Chaudhary*.
- [An Overview of Deep Semi-Supervised Learning](https://yassouali.github.io/ml-blog/deep-semi-supervised/). *Yassine Ouali*.
- [Semi-Supervised Learning in Computer Vision](https://amitness.com/2020/07/semi-supervised-learning/). *Amit Chaudhary*.
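
To make the Background discussion of self-training more concrete, here is a minimal sketch of the idea in plain Python. It is not taken from any paper or repository in this list: the 1-D nearest-centroid classifier, the function name `self_train`, and the `margin` threshold are all illustrative assumptions. The point is only to show unlabeled points being used to modify a hypothesis that was first fit on labeled data alone.

```python
from statistics import mean


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Toy self-training loop (illustrative sketch, not a reference method).

    labeled:   list of (x, y) pairs with y in {0, 1}; both classes must appear
    unlabeled: list of x values without labels
    margin:    hypothetical confidence threshold on the gap between the
               distances to the two class centroids
    Returns the final centroids and the points that never got a pseudo-label.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        # Fit the classifier on the current (labeled + pseudo-labeled) set.
        centroids = {c: mean(x for x, y in labeled if y == c) for c in (0, 1)}
        confident, remaining = [], []
        for x in pool:
            d0, d1 = abs(x - centroids[0]), abs(x - centroids[1])
            if abs(d0 - d1) >= margin:
                # Confident prediction: keep it as a pseudo-label.
                confident.append((x, 0 if d0 < d1 else 1))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from, stop early
        labeled.extend(confident)
        pool = remaining
    centroids = {c: mean(x for x, y in labeled if y == c) for c in (0, 1)}
    return centroids, pool
```

With two labeled points at 0.0 and 10.0, unlabeled points near each of them pull the class centroids toward the clusters they belong to, while a point close to the decision boundary stays unlabeled because it never passes the confidence threshold.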

AI & Machine Learning ML Frameworks

1.9K Github Stars

Open Source

pytorch-segmentation

# Semantic Segmentation in PyTorch

[![MIT License](https://img.shields.io/badge/license-MIT-green.svg)](https://opensource.org/licenses/MIT) [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yassouali/pytorch-segmentation/blob/master/tutorial.ipynb)

- [Semantic Segmentation in PyTorch](#semantic-segmentation-in-pytorch)
  - [Requirements](#requirements)
  - [Main Features](#main-features)
    - [Models](#models)
    - [Datasets](#datasets)
    - [Losses](#losses)
    - [Learning rate schedulers](#learning-rate-schedulers)
    - [Data augmentation](#data-augmentation)
  - [Training](#training)
  - [Inference](#inference)
  - [Try on colab](#colab)
  - [Code structure](#code-structure)
  - [Config file format](#config-file-format)
  - [Acknowledgement](#acknowledgement)

This repo contains a PyTorch implementation of different semantic segmentation models for different datasets.

## Requirements

PyTorch and Torchvision need to be installed before running the scripts, together with `PIL` and `opencv` for data preprocessing and `tqdm` for showing the training progress. PyTorch v1.1 is supported (using the newly supported TensorBoard); the code can work with earlier versions, but in that case use tensorboardX instead of TensorBoard.

```bash
pip install -r requirements.txt
```

or for a local installation

```bash
pip install --user -r requirements.txt
```

## Main Features

- A clear and easy-to-navigate structure,
- A `json` config file with a lot of possibilities for parameter tuning,
- Support for various models, losses, LR schedulers, data augmentations and datasets.

**So, what's available?**

### Models

- (**Deeplab V3+**) Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [[Paper]](https://arxiv.org/abs/1802.02611)
- (**GCN**) Large Kernel Matters, Improve Semantic Segmentation by Global Convolutional Network [[Paper]](https://arxiv.org/abs/1703.02719)
- (**UperNet**) Unified Perceptual Parsing for Scene Understanding [[Paper]](https://arxiv.org/abs/1807.10221)
- (**DUC, HDC**) Understanding Convolution for Semantic Segmentation [[Paper]](https://arxiv.org/abs/1702.08502)
- (**PSPNet**) Pyramid Scene Parsing Network [[Paper]](http://jiaya.me/papers/PSPNet_cvpr17.pdf)
- (**ENet**) A Deep Neural Network Architecture for Real-Time Semantic Segmentation [[Paper]](https://arxiv.org/abs/1606.02147)
- (**U-Net**) Convolutional Networks for Biomedical Image Segmentation (2015): [[Paper]](https://arxiv.org/abs/1505.04597)
- (**SegNet**) A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation (2016): [[Paper]](https://arxiv.org/pdf/1511.00561)
- (**FCN**) Fully Convolutional Networks for Semantic Segmentation (2015): [[Paper]](https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf)

### Datasets

- **Pascal VOC:** First download the [original dataset](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar). After extracting the files, we end up with `VOCtrainval_11-May-2012/VOCdevkit/VOC2012`, containing the image sets, the XML annotations for both object detection and segmentation, and the JPEG images.\
  The second step is to augment the dataset using the additional annotations provided by [Semantic Contours from Inverse Detectors](http://home.bharathh.info/pubs/pdfs/BharathICCV2011.pdf). First download the image sets (`train_aug`, `trainval_aug`, `val_aug` and `test_aug`) from this link: [Aug ImageSets](https://www.dropbox.com/sh/jicjri7hptkcu6i/AACHszvCyYQfINpRI1m5cNyta?dl=0&lst=), and add them to the rest of the segmentation sets in `/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/ImageSets/Segmentation`. Then download the new annotations [SegmentationClassAug](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0) and add them to the path `VOCtrainval_11-May-2012/VOCdevkit/VOC2012`. Now we're set; for training, use the path to `VOCtrainval_11-May-2012`.

- **CityScapes:** First download the images and the annotations from the official website [cityscapes-dataset.com](https://www.cityscapes-dataset.com/downloads/). There are two types of annotations, Fine (`gtFine_trainvaltest.zip`) and Coarse (`gtCoarse.zip`), and the images are in `leftImg8bit_trainvaltest.zip`. Extract all of them in the same folder, and use the location of this folder in `config.json` for training.

- **ADE20K:** For ADE20K, simply download the images and their annotations for training and validation from [sceneparsing.csail.mit.edu](http://data.csail.mit.edu/places/ADEchallenge/ADEChallengeData2016.zip), and for the rest visit the [website](http://sceneparsing.csail.mit.edu/).

- **COCO Stuff:** For COCO, there are two partitions. CocoStuff10k contains only 10k images, which are used for training and evaluation; note that this dataset is outdated, but it can be used for small-scale testing and training, and can be downloaded [here](https://github.com/nightrome/cocostuff10k). The official dataset, with all 164k training examples, can be downloaded from the official [website](http://cocodataset.org/#download).\
  Note that when using the COCO dataset, the 164k version is used by default. If the 10k version is preferred, this needs to be specified with an additional parameter `partition = 'CocoStuff10k'` in the config file, together with the corresponding path.

- **DeepScene:** For DeepScene's Freiburg forest dataset, simply download the images and their annotations for training and validation from [http://deepscene.cs.uni-freiburg.de](http://deepscene.cs.uni-freiburg.de/static/datasets/download_freiburg_forest_annotated.sh). Note that the link downloads a script (`.sh`) file which will manage the downloads for the rest of the dataset.

### Losses

In addition to the Cross-Entropy loss, there is also:

- **Dice Loss**, which measures the overlap between two samples and can be more reflective of the training objective (maximizing the mIoU), but is highly non-convex and can be hard to optimize.
- **CE Dice loss**, the sum of the Dice loss and CE. CE gives smooth optimization, while the Dice loss is a good indicator of the quality of the segmentation results.
- **Focal Loss**, an alternative version of the CE, used to mitigate class imbalance by scaling down the contribution of confident predictions.
- **Lovasz Softmax**, which lends itself as a good alternative to the Dice loss, where we can directly optimize for the mean intersection-over-union based on the convex Lovász extension of submodular losses (for more details, check the paper: [The Lovász-Softmax loss](https://arxiv.org/abs/1705.08790)).

### Learning rate schedulers

- **Poly learning rate**, where the learning rate is scaled down linearly from the starting value down to zero during training. Considered the go-to scheduler for semantic segmentation (see Figure below).
- **One Cycle learning rate**: for a learning rate LR, we start from LR / 10 and go up to LR during the first 30% of the training time, then scale down to LR / 25 for the remaining time. The scaling is done in a cosine annealing fashion (see Figure below). The momentum is also modified, but in the opposite manner: it starts from 0.95, goes down to 0.85, and then goes back up to 0.95. For more detail, see the paper: [Super-Convergence](https://arxiv.org/abs/1708.07120).

<img src="images/learning_rates.png" align="center" width="750">

### Data augmentation

All of the data augmentations are implemented using OpenCV in `base/base_dataset.py`. They are: rotation (between -10 and 10 degrees), random cropping between 0.5 and 2 times the selected `crop_size`, random h-flip, and blurring.

## Training

To train a model, first download the dataset to be used for training, then choose the desired architecture, add the correct path to the dataset and set the desired hyperparameters (the config file is detailed below), then simply run:

```bash
python train.py --config config.json
```

The training will automatically be run on the GPUs; if more than one is detected and multiple GPUs were selected in the config file, `torch.nn.DataParallel` is used for multi-GPU training. If no GPU is available, the CPU is used. The log files will be saved in `saved/runs` and the `.pth` checkpoints in `saved/`. To monitor the training using TensorBoard, please run:

```bash
tensorboard --logdir saved
```

<img src="images/tb1.png" align="center" width="900">

<img src="images/tb2.png" align="center" width="900">

## Inference

For inference, we need a trained PyTorch model, the images we'd like to segment, and the config used in training (to load the correct model and other parameters):

```bash
python inference.py --config config.json --model best_model.pth --images images_folder
```

The predictions will be saved as `.png` images, using the default palette, in the folder passed with `--output`; if no folder is passed, `outputs/` is used. For Pascal VOC the default palette is:

<img src="images/colour_scheme.png" align="center" width="550">

Here are the parameters available for inference:

```
--output       The folder where the results will be saved (default: outputs).
--extension    The extension of the images to segment (default: jpg).
--images       Folder containing the images to segment.
--model        Path to the trained model.
--mode         Mode to be used, choose either `multiscale` or `sliding` for inference (multiscale is the default behaviour).
--config       The config file used for training the model.
```

**Trained Model:**

| Model | Backbone | PascalVoc val mIoU | PascalVoc test mIoU | Pretrained Model |
| :-------- | :----------: |:-----------------: |:-------------------:|:----------------:|
| PSPNet | ResNet 50 | 82% | 79% | [Dropbox](https://www.dropbox.com/s/7rni1tzyjmdkiup/PSPnet.pth?dl=0) |

## Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yassouali/pytorch-segmentation/blob/master/tutorial.ipynb)

The repository provides a Colab notebook to download the Cityscapes dataset and run the library.
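
Returning to the two learning rate schedulers described under "Learning rate schedulers", the following is a rough sketch of both schedules as plain functions of the training step. It is not the repository's implementation: the function names, the `power` argument, and the step-based (rather than epoch-based) indexing are assumptions made for illustration. With `power=1.0` the Poly schedule decays linearly, as described above; values such as 0.9 are also common in the segmentation literature. The momentum cycling of One Cycle is left out.

```python
import math


def poly_lr(base_lr, step, total_steps, power=1.0):
    """Poly schedule: decay from base_lr at step 0 to zero at total_steps.

    `power` is a hypothetical knob; power=1.0 gives a linear decay.
    """
    return base_lr * (1.0 - step / total_steps) ** power


def one_cycle_lr(base_lr, step, total_steps, warmup_frac=0.3,
                 start_div=10.0, end_div=25.0):
    """One Cycle schedule with cosine interpolation.

    Rises from base_lr / start_div to base_lr over the first `warmup_frac`
    of training, then falls to base_lr / end_div over the rest.
    """
    warmup_steps = warmup_frac * total_steps
    if step < warmup_steps:
        start, end = base_lr / start_div, base_lr
        t = step / warmup_steps
    else:
        start, end = base_lr, base_lr / end_div
        t = (step - warmup_steps) / (total_steps - warmup_steps)
    # Cosine interpolation: equals `start` at t = 0 and `end` at t = 1.
    return end + (start - end) * 0.5 * (1.0 + math.cos(math.pi * t))
```

For example, with `base_lr = 0.01` and 100 steps, `one_cycle_lr` starts at 0.001, peaks at 0.01 around step 30, and ends at 0.0004, while `poly_lr` goes from 0.01 down to 0.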
## Code structure

The code structure is based on [pytorch-template](https://github.com/victoresque/pytorch-template/blob/master/README.md)

```
pytorch-template/
│
├── train.py - main script to start training
├── inference.py - inference using a trained model
├── trainer.py - the main trainer
├── config.json - holds configuration for training
│
├── base/ - abstract base classes
│   ├── base_data_loader.py
│   ├── base_model.py
│   ├── base_dataset.py - All the data augmentations are implemented here
│   └── base_trainer.py
│
├── dataloader/ - loading the data for different segmentation datasets
│
├── models/ - contains semantic segmentation models
│
├── saved/
│   ├── runs/ - trained models are saved here
│   └── log/ - default logdir for tensorboard and logging output
│
└── utils/ - small utility functions
    ├── losses.py - losses used in training the model
    ├── metrics.py - evaluation metrics used
    └── lr_scheduler - learning rate schedulers
```

## Config file format

Config files are in `.json` format:

```javascript
{
    "name": "PSPNet",         // training session name
    "n_gpu": 1,               // number of GPUs to use for training.
    "use_synch_bn": true,     // Using Synchronized batchnorm (for multi-GPU usage)

    "arch": {
        "type": "PSPNet", // name of model architecture to train
        "args": {
            "backbone": "resnet50",     // encoder type
            "freeze_bn": false,         // When fine tuning the model this can be used
            "freeze_backbone": false    // In this case only the decoder is trained
        }
    },

    "train_loader": {
        "type": "VOC",          // Selecting data loader
        "args":{
            "data_dir": "data/",  // dataset path
            "batch_size": 32,     // batch size
            "augment": true,      // Use data augmentation
            "crop_size": 380,     // Size of the random crop after rescaling
            "shuffle": true,
            "base_size": 400,     // The image is resized to base_size, then randomly cropped
            "scale": true,        // Random rescaling between 0.5 and 2 before cropping
            "flip": true,         // Random H-Flip
            "rotate": true,       // Random rotation between 10 and -10 degrees
            "blur": true,         // Adding a slight amount of blur to the image
            "split": "train_aug", // Split to use, depends on the dataset
            "num_workers": 8
        }
    },

    "val_loader": {     // Same for val, but no data augmentation, only a center crop
        "type": "VOC",
        "args":{
            "data_dir": "data/",
            "batch_size": 32,
            "crop_size": 480,
            "val": true,
            "split": "val",
            "num_workers": 4
        }
    },

    "optimizer": {
        "type": "SGD",
        "differential_lr": true,      // Using lr/10 for the backbone, and lr for the rest
        "args":{
            "lr": 0.01,               // Learning rate
            "weight_decay": 1e-4,     // Weight decay
            "momentum": 0.9
        }
    },

    "loss": "CrossEntropyLoss2d",     // Loss (see utils/losses.py)
    "ignore_index": 255,              // Class to ignore (must be set to -1 for the ADE20K dataset)
    "lr_scheduler": {
        "type": "Poly",               // Learning rate scheduler (Poly or OneCycle)
        "args": {}
    },

    "trainer": {
        "epochs": 80,                 // Number of training epochs
        "save_dir": "saved/",         // Checkpoints are saved in save_dir/models/
        "save_period": 10,            // Saving checkpoint every 10 epochs

        "monitor": "max Mean_IoU",    // Mode and metric for model performance
        "early_stop": 10,             // Number of epochs to wait before early stopping (0 to disable)

        "tensorboard": true,        // Enable tensorboard visualization
        "log_dir": "saved/runs",
        "log_per_iter": 20,

        "val": true,
        "val_per_epochs": 5         // Run validation every 5 epochs
    }
}
```

## Acknowledgement

- [PyTorch-Encoding](https://github.com/zhanghang1989/PyTorch-Encoding)
- [Pytorch-Template](https://github.com/victoresque/pytorch-template/blob/master/README.m)
- [Synchronized-BatchNorm-PyTorch](https://github.com/vacancy/Synchronized-BatchNorm-PyTorch)
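
As an illustration of how a few of the config fields above might be interpreted, here is a small sketch in plain Python. It is not the repository's code: the helper names (`parse_monitor`, `should_stop`, `learning_rates`) are hypothetical, and the early-stopping logic counts validation runs without improvement, which is an assumption about how `early_stop` is applied. The sketch only mirrors what the comments in the config describe: `monitor` holds a mode and a metric name, `early_stop` is a patience with 0 meaning disabled, and `differential_lr` gives the backbone a learning rate of lr/10.

```python
def parse_monitor(monitor):
    """Split a string such as "max Mean_IoU" into (mode, metric)."""
    mode, metric = monitor.split()
    if mode not in ("min", "max"):
        raise ValueError(f"unknown monitor mode: {mode!r}")
    return mode, metric


def should_stop(history, mode, early_stop):
    """Return True when the metric has not improved for `early_stop` entries.

    history:    metric values, one per validation run (assumed granularity)
    early_stop: patience; 0 disables early stopping
    """
    if early_stop == 0 or not history:
        return False
    best_index = 0
    for i, value in enumerate(history):
        best = history[best_index]
        if (value > best) if mode == "max" else (value < best):
            best_index = i
    return len(history) - 1 - best_index >= early_stop


def learning_rates(optimizer_cfg):
    """Per-group learning rates implied by the `differential_lr` flag."""
    lr = optimizer_cfg["args"]["lr"]
    backbone_lr = lr / 10 if optimizer_cfg.get("differential_lr") else lr
    return {"backbone": backbone_lr, "decoder": lr}
```

In an actual PyTorch training script, the two rates would typically be passed to the optimizer as separate parameter groups, one for the backbone parameters and one for the remaining layers.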

ML Frameworks

1.8K Github Stars