The rapid progress of deep learning models in recent years is inseparable from large data volumes and diverse data collection. Collecting large amounts of rich data is time-consuming and labor-intensive, and data augmentation offers researchers another way to increase data diversity: a richer variety of training data without collecting more of it. Researchers from Berkeley have proposed PBA (Population Based Augmentation), a method that finds more effective data augmentation strategies and achieves the same results with a 1000x speedup.

Data augmentation

Data augmentation strategies usually include cropping, padding, flipping, and rotation. However, these basic strategies are too simple for training deep networks, and data augmentation strategies and their types have received far less study than neural networks themselves.
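
The basic operations listed above can be sketched in a few lines of plain Python. This is only an illustrative sketch: it assumes a grayscale image stored as a list of rows, and the function names (pad, random_crop, horizontal_flip, rotate90) are hypothetical helpers rather than part of any library mentioned in this article.

```python
import random

def pad(image, width, fill=0):
    """Surround a 2D image (a list of rows) with `width` pixels of `fill`."""
    row_len = len(image[0]) + 2 * width
    top = [[fill] * row_len for _ in range(width)]
    middle = [[fill] * width + list(row) + [fill] * width for row in image]
    bottom = [[fill] * row_len for _ in range(width)]
    return top + middle + bottom

def random_crop(image, size, rng):
    """Cut a random size x size window out of the image."""
    top = rng.randint(0, len(image) - size)      # randint is inclusive on both ends
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]

def horizontal_flip(image):
    """Mirror every row left to right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

rng = random.Random(0)
image = [[rng.randint(0, 255) for _ in range(8)] for _ in range(8)]

# A typical simple recipe: pad, take a random crop of the original size, then flip.
augmented = horizontal_flip(random_crop(pad(image, 2), 8, rng))
```

Padding followed by a random crop of the original size shifts the content by a few pixels while keeping the image shape unchanged, which is why the two are usually paired.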

[Figure: Some common data augmentation methods]

Recently, Google carried out in-depth exploratory research in this area, proposing the AutoAugment method and achieving good results on the CIFAR-10 dataset.

The AutoAugment paper uses reinforcement learning to search for better data augmentation strategies. An RNN-based controller predicts an augmentation strategy from the search space, and a child network with a fixed architecture is trained to convergence on the augmented data, reaching an accuracy R. Finally, the accuracy R is used as the reward that pushes the controller toward better augmentation strategies.


AutoAugment introduces 16 geometric and color transformations and chooses two of them to augment each batch of data at a fixed magnitude, so high-performing augmentation methods can be learned directly from the data through reinforcement learning.
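
To make "choose two transformations and apply them at a fixed magnitude" concrete, here is a minimal sketch. It is not AutoAugment's actual code: the three operations stand in for the 16-operation search space, images are plain lists of rows, and the names OPERATIONS and apply_policy are hypothetical.

```python
import random

def brightness(image, magnitude):
    """Color-style op: add a fixed offset to every pixel, clamped to 0..255."""
    return [[min(255, max(0, p + magnitude)) for p in row] for row in image]

def translate_x(image, magnitude):
    """Geometric-style op: shift each row right by `magnitude` pixels, filling with 0."""
    shifted = []
    for row in image:
        shift = min(magnitude, len(row))
        shifted.append([0] * shift + row[:len(row) - shift])
    return shifted

def invert(image, magnitude):
    """Color-style op that ignores its magnitude: flip pixel intensities."""
    return [[255 - p for p in row] for row in image]

# Hypothetical, tiny stand-in for the 16-operation search space.
OPERATIONS = {"brightness": brightness, "translate_x": translate_x, "invert": invert}

def apply_policy(batch, policy):
    """Apply each (operation name, fixed magnitude) pair in order to every image."""
    for name, magnitude in policy:
        operation = OPERATIONS[name]
        batch = [operation(image, magnitude) for image in batch]
    return batch

rng = random.Random(0)
# Pick two distinct operations and pair each with a fixed magnitude.
policy = [(name, 2) for name in rng.sample(sorted(OPERATIONS), 2)]

batch = [[[10, 20, 30, 40], [10, 20, 30, 40]] for _ in range(3)]
augmented = apply_policy(batch, policy)
```

In the real search, the choice of operations and magnitudes is what the controller learns; here it is drawn at random purely for illustration.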

The downside of this approach is that it requires training 15,000 models to convergence in order to gather enough samples for the reinforcement learning model to learn an augmentation strategy. Computation cannot be shared between samples, so obtaining good results on ImageNet takes 15,000 P100 GPU hours, and even the smaller CIFAR-10 consumes 5,000 GPU hours (which means roughly $7,500 to $37,500 in training costs to obtain a better augmentation strategy). If previously trained strategies could be transferred to or shared with new training runs, searching for augmentation strategies would become much more efficient.

PBA algorithm

To improve efficiency, researchers from Berkeley proposed the PBA algorithm, which obtains the same test accuracy as the original algorithm with three orders of magnitude less computation.

Unlike AutoAugment, this method trains multiple copies of a small model on the CIFAR-10 dataset, and only 5 hours of training on a Titan XP are needed to obtain a better data augmentation strategy. The strategy can also be applied to CIFAR-100, and retraining a larger network with it gives very good results. Compared with training that previously took many days, this method takes less time and achieves better results.
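
As a quick sanity check on the "three orders of magnitude" claim, the arithmetic can be done with the CIFAR-10 figures quoted in this article. This is a back-of-the-envelope sketch that assumes the two numbers are directly comparable and ignores the difference between a P100 and a Titan XP.

```python
import math

autoaugment_gpu_hours = 5000  # CIFAR-10 search cost quoted for AutoAugment
pba_gpu_hours = 5             # CIFAR-10 search time quoted for PBA on one Titan XP

speedup = autoaugment_gpu_hours / pba_gpu_hours
orders_of_magnitude = math.log10(speedup)

print(f"speedup: {speedup:.0f}x, orders of magnitude: {orders_of_magnitude:.0f}")
```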

[Figure: Performance of the data augmentation strategies found by the new method compared with AutoAugment on different models]

The researchers borrowed ideas from DeepMind's Population Based Training algorithm and applied them to generating data augmentation strategies. Strategies are generated from the current results of training, so training results can be shared among the different child models and time-consuming repeated training is avoided.

This improvement makes it possible to run large-scale augmentation strategy search on an ordinary workstation. Unlike AutoAugment, this method generates a strategy schedule rather than a fixed strategy. This means that during training, the augmentation strategy generated by PBA is f(x, t), where x is the input image and t is the current training epoch. AutoAugment, by contrast, generates fixed strategies f_i(x) on different child models.
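
The difference between a fixed strategy f(x) and a schedule f(x, t) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the single brightness operation, the SCHEDULE table, and its epoch boundaries are hypothetical and are not values from the paper.

```python
def fixed_policy(image):
    """f(x): the same brightness shift is applied at every epoch."""
    return [[min(255, p + 20) for p in row] for row in image]

# Hypothetical schedule: (epoch at which the magnitude takes effect, magnitude).
SCHEDULE = [(0, 0), (10, 10), (20, 30)]

def magnitude_at(epoch):
    """Look up the magnitude that is active at the given epoch."""
    current = SCHEDULE[0][1]
    for start_epoch, magnitude in SCHEDULE:
        if epoch >= start_epoch:
            current = magnitude
    return current

def scheduled_policy(image, epoch):
    """f(x, t): the brightness shift depends on the current epoch t."""
    shift = magnitude_at(epoch)
    return [[min(255, p + shift) for p in row] for row in image]

image = [[100, 200, 250]]
for epoch in (0, 10, 25):
    print(epoch, scheduled_policy(image, epoch), fixed_policy(image))
```

In this toy schedule the augmentation is absent at the start and grows stronger later in training, something a single fixed strategy cannot express.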

The researchers used a population of 16 small WideResNet models, each of which learns a different hyperparameter schedule. The best-performing schedule is then used to train large models and obtain the final test error rate.

[Figure: The Population Based Training method first uses a population of small models to discover hyperparameters, then combines copying the weights of the best-performing models (exploit) with random search (explore)]

These small models are first trained from scratch on the target dataset. The weights and hyperparameters of high-performing models are then copied onto underperforming ones so that the training already done is reused, and the hyperparameters are perturbed to explore randomly for better performance.

In this way, computation is shared between different models, and different hyperparameters can be targeted at different phases of training. The PBA algorithm thereby avoids the lengthy process of training thousands of models to obtain a high-performing data augmentation strategy.
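
The exploit-and-explore loop described above can be sketched with a toy population. This is a simplified illustration, not the authors' implementation: the "training" step is a made-up score function, and the choices of replacing the bottom 25% with clones of the top 25% and perturbing by a factor of 0.8 or 1.2 are assumptions made for this sketch. Each worker records the hyperparameter it used in every interval, so the history of the best worker plays the role of the learned schedule.

```python
import random

def train_one_interval(worker, rng):
    """Toy stand-in for training: progress is fastest when magnitude is near 0.6."""
    gain = 1.0 - abs(worker["hparams"]["magnitude"] - 0.6)
    worker["weights"] += gain
    worker["score"] = worker["weights"] + rng.uniform(-0.01, 0.01)

def exploit_and_explore(population, rng, fraction=0.25):
    """Replace the worst workers with perturbed clones of the best ones."""
    ranked = sorted(population, key=lambda w: w["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for worker in bottom:
        donor = rng.choice(top)
        # Exploit: reuse the donor's training progress, hyperparameters, and history.
        worker["weights"] = donor["weights"]
        worker["hparams"] = dict(donor["hparams"])
        worker["history"] = list(donor["history"])
        # Explore: randomly perturb the copied hyperparameter, capped at 1.0.
        factor = rng.choice([0.8, 1.2])
        worker["hparams"]["magnitude"] = min(1.0, worker["hparams"]["magnitude"] * factor)

def run_pbt(num_workers=16, intervals=20, seed=0):
    rng = random.Random(seed)
    population = [
        {"weights": 0.0, "score": 0.0,
         "hparams": {"magnitude": rng.random()}, "history": []}
        for _ in range(num_workers)
    ]
    for _ in range(intervals):
        for worker in population:
            worker["history"].append(worker["hparams"]["magnitude"])
            train_one_interval(worker, rng)
        exploit_and_explore(population, rng)
    return max(population, key=lambda w: w["score"])

best = run_pbt()
schedule = best["history"]  # one magnitude per training interval
```

Because clones inherit the donor's weights, no worker restarts from scratch, which is where the savings over training thousands of independent models come from.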

The following figure shows the data augmentation strategy the researchers obtained:


The researchers also provide source code and usage examples. If you want to learn a data augmentation strategy suited to your own dataset, you can do so under the Tune framework: you only need to define a new data loader and it is ready to use. Please refer to the code for details:

https://github.com/arcelien/pba

If you want to know more detailed information, please refer to the paper:
https://arxiv.org/pdf/1905.05393.pdf

This article is reprinted from the WeChat public account "to the door venture".