An article to understand ensemble learning

In machine learning, we have discussed many different algorithms. Each of those algorithms is a hero who fights alone. Ensemble learning forms these heroes into a team, achieving the effect of "three cobblers with their wits combined equal Zhuge Liang".

This article introduces the 2 main ideas behind ensemble learning: Bagging and Boosting.

 

What is ensemble learning?

Ensemble learning is part of machine learning. It is a "training idea", not a specific method or algorithm.

In real life, everyone knows that "many hands make light work" and "three cobblers with their wits combined equal Zhuge Liang". The core idea of ensemble learning is exactly this kind of strength in numbers: it does not create new algorithms, but combines existing algorithms to obtain better results.

Ensemble learning takes a number of simple base models and assembles them. There are 2 main methods for assembling these base models:

  1. Bagging (short for bootstrap aggregating, also known as "bagging method")
  2. Boosting

 

Bagging

Bagging core ideas

The core idea of Bagging is democracy.

The idea of Bagging is to treat all base models equally: each base model gets exactly one vote, and a democratic vote then determines the final result.

Most of the time, the variance of the result obtained by Bagging is smaller.

Bagging specific process

Specific process (a minimal code sketch follows the list):

  1. Extract training sets from the original sample set. In each round, n training samples are drawn from the original sample set by bootstrapping, i.e. sampling with replacement (some samples may be drawn multiple times, and some may not be drawn at all). A total of k rounds of extraction are performed, yielding k training sets that are independent of each other.
  2. Each training set is used to train one model, so k training sets yield k models. (Note: no specific classification or regression algorithm is prescribed here; we can choose different methods depending on the problem, such as decision trees or perceptrons.)
  3. For classification problems, the k models from the previous step vote to produce the final class; for regression problems, the mean of the k models' predictions is taken as the final result. (All models are equally important.)
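
Below is a minimal code sketch of this process, assuming scikit-learn decision trees as the base models; the toy dataset, the value of k, and the 0/1 majority-vote shortcut are illustrative assumptions rather than part of the original article.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy binary classification data, made up purely for illustration
X, y = make_classification(n_samples=500, random_state=0)

k = 10                      # number of bootstrap rounds / base models
n = len(X)                  # samples drawn per round
rng = np.random.default_rng(0)
models = []

# Steps 1 and 2: draw k bootstrap training sets (sampling with replacement)
# and fit one independent base model on each of them.
for _ in range(k):
    idx = rng.integers(0, n, size=n)                 # bootstrap sample indices
    models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Step 3: classification by majority vote, every model gets exactly one vote.
# (For regression you would average the k predictions instead.)
all_preds = np.array([m.predict(X) for m in models])     # shape (k, n_samples)
final_pred = (all_preds.mean(axis=0) > 0.5).astype(int)  # majority vote on 0/1 labels
print("training accuracy:", np.mean(final_pred == y))
```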

Example:

Among Bagging methods, the most widely known is the random forest: Bagging + decision trees = random forest.

"An article to understand decision trees (3 steps + 3 typical algorithms + 10 advantages and disadvantages)"

"An article to understand random forests (4 steps + 4 evaluation methods + 10 advantages and disadvantages)"
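
As a rough sketch of "Bagging + decision trees" in practice, scikit-learn provides BaggingClassifier and RandomForestClassifier; the dataset and parameter values below are illustrative assumptions. (Strictly speaking, a random forest also adds random feature selection at each split on top of Bagging.)

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Bagging with decision trees as the base model
bagged_trees = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0)

# Random forest: Bagging + decision trees + random feature subsets at each split
forest = RandomForestClassifier(n_estimators=100, random_state=0)

for name, model in [("bagged trees", bagged_trees), ("random forest", forest)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```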

 

Boosting

Boosting core ideas

The core idea of Boosting is to pick the elite.

The most essential difference between Boosting and Bagging is that Boosting does not treat all base models equally. Instead, through repeated testing and screening it selects the "elite" models, gives those elites more voting rights, gives the poorly performing models fewer voting rights, and then combines everyone's votes to obtain the final result.

Most of the time, the bias of the result obtained by Boosting is smaller.

Boosting specific process

Specific process (a minimal code sketch follows the list):

  1. The base models are combined linearly through an additive model.
  2. Each training round increases the weight of base models with a low error rate and decreases the weight of base models with a high error rate.
  3. In each round, the weights (or probability distribution) of the training data are changed: the weights of samples misclassified by the weak classifier in the previous round are increased, and the weights of correctly classified samples are decreased, so that the classifier focuses on the data it previously got wrong.
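
A minimal sketch of these three steps, assuming an AdaBoost-style reweighting scheme with decision stumps as the weak models; the data, labels in {-1, +1}, and number of rounds are made-up illustrations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy data with labels recoded to -1/+1 for the weighted-vote formula below
X, y = make_classification(n_samples=500, random_state=0)
y = np.where(y == 0, -1, 1)

n_rounds = 10
sample_weights = np.full(len(X), 1 / len(X))   # start with uniform sample weights
models, model_weights = [], []

for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1)      # weak base model
    stump.fit(X, y, sample_weight=sample_weights)
    pred = stump.predict(X)

    err = np.sum(sample_weights[pred != y])          # weighted error rate
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))  # "elite" models get larger voting weight

    # Step 3: increase weights of misclassified samples, decrease correct ones
    sample_weights *= np.exp(-alpha * y * pred)
    sample_weights /= sample_weights.sum()

    models.append(stump)
    model_weights.append(alpha)

# Steps 1 and 2: the final prediction is a weighted (additive) vote of all rounds
scores = sum(a * m.predict(X) for a, m in zip(model_weights, models))
final_pred = np.sign(scores)
print("training accuracy:", np.mean(final_pred == y))
```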

Example:

Among Boosting methods, the most mainstream ones are AdaBoost and Gradient Boosting.

"An article to understand AdaBoost and its advantages and disadvantages"
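
Both are available out of the box in scikit-learn; a rough usage sketch follows, with the dataset and parameter values chosen purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# AdaBoost: reweights samples each round; the weak models are decision stumps by default
ada = AdaBoostClassifier(n_estimators=100, random_state=0)

# Gradient boosting: each new tree is fit to the errors of the current ensemble
gbdt = GradientBoostingClassifier(n_estimators=100, random_state=0)

for name, model in [("AdaBoost", ada), ("Gradient boosting", gbdt)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```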

 

4 differences between Bagging and Boosting

Sample selection:

Bagging: the training sets are drawn from the original set with replacement, and the training sets selected in different rounds are independent of each other.

Boosting: the training set stays the same in every round; only the weight of each training sample changes, adjusted according to the classification results of the previous round.

Sample weights:

Bagging: uses uniform sampling; each sample has equal weight.

Boosting: continually adjusts the sample weights according to the error rate; samples that are misclassified more often receive larger weights.

Prediction function:

Bagging: The weights of all prediction functions are equal.

Boosting: each weak classifier has its own weight; classifiers with smaller classification error receive larger weights.

Parallel Computing:

Bagging: each prediction function can be generated in parallel.

Boosting: the prediction functions can only be generated sequentially, because the parameters of each model depend on the results of the previous round.

The differences above are adapted from "Bagging and Boosting concepts and differences".
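
To make the parallel-vs-sequential point concrete: scikit-learn's Bagging-style estimators expose an n_jobs parameter because their base models can be trained independently, while GradientBoostingClassifier offers no such option because each round depends on the previous one. A rough sketch, with made-up data and parameter values:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

X, y = make_classification(n_samples=2000, random_state=0)

# Bagging-style: the 200 trees are independent, so they can be fit on all CPU cores
forest = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0).fit(X, y)

# Boosting-style: each tree needs the output of the previous ones,
# so training the trees themselves is inherently sequential (no n_jobs here)
gbdt = GradientBoostingClassifier(n_estimators=200, random_state=0).fit(X, y)
```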

 

Baidu Encyclopedia and Wikipedia

Baidu Encyclopedia version

Ensemble learning is a machine learning method that trains a series of learners and uses certain rules to combine their individual results, obtaining a better learning effect than any single learner. In general, the multiple learners in an ensemble are homogeneous "weak learners".


Wikipedia version

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Unlike a statistical ensemble in statistical mechanics, which is usually infinite, a machine learning ensemble consists of only a concrete finite set of alternative models, but typically allows for a much more flexible structure to exist among those alternatives.


[Practice] The 5 most common ensemble learning methods