- Synthetic Data for Text Localisation in Natural Images
- Synthesizing Training Data for Object Detection in Indoor Scenes
- Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection (random crop-and-paste)
- InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting
- Modeling Visual Context is Key to Augmenting Object Detection Datasets
- Context-Aware Synthesis and Placement of Object Instances
- Data Augmentation for Object Detection via Progressive and Selective Instance-Switching
- Dataset Distillation
- Soft-Label Dataset Distillation and Text Dataset Distillation (Waterloo 2019)
  Uses probability distributions (soft labels) instead of hard labels
- Core Vector Machines: Fast SVM Training on Very Large Data Sets
- Smaller Coresets for k-Median and k-Means Clustering
- Using document summarization techniques for speech data subset selection
- Submodular subset selection for large-scale speech training data
- Learning mixtures of submodular functions for image collection summarization
- Unsupervised data selection and word-morph mixed language model for Tamil low-resource keyword search
- An empirical study of example forgetting during deep neural network learning
  Examples that are never forgotten may be the "core examples"
- Selection via Proxy: Efficient Data Selection for Deep Learning (Stanford 2019)
- Burr Settles. Active learning
- Dropout as a Bayesian approximation: Representing model uncertainty in deep learning
  Relates dropout in deep networks to approximate Bayesian inference
- Deep Bayesian active learning with image data
  Applies active learning to Bayesian deep learning
- BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning
- Active learning for convolutional neural networks: A core-set approach (Stanford 2018)
  Also introduces a core-set selection method for CNNs
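
To make the core-set idea in the last entry concrete, here is a minimal sketch of greedy k-center selection, the farthest-point heuristic usually associated with core-set selection. It is an illustration under assumptions, not the paper's implementation: the function name `k_center_greedy`, the `budget` and `seed_index` parameters, and the toy 2-D `features` are all hypothetical, and in practice the points would be feature vectors taken from a trained network, with the already-labeled examples used as the initial centers.

```python
import math


def k_center_greedy(points, budget, seed_index=0):
    """Pick `budget` indices from `points` with the greedy k-center rule."""
    if not 0 < budget <= len(points):
        raise ValueError("budget must be between 1 and len(points)")
    # Assumption: start from one arbitrary seed point.
    selected = [seed_index]
    # Distance from every point to its nearest selected center so far.
    min_dist = [math.dist(p, points[seed_index]) for p in points]
    while len(selected) < budget:
        # Take the point that is farthest from all current centers.
        next_index = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(next_index)
        # Refresh nearest-center distances with the new center included.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_index]))
    return selected


# Toy feature vectors: two tight clusters plus one outlier.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 9.0)]
chosen = k_center_greedy(features, budget=3)
print(chosen)  # → [0, 4, 3]
```

The selection covers each region of the toy data once instead of picking near-duplicates, which is the diversity property that motivates core-set methods for choosing a training or labeling subset.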