In machine learning, we have talked about many different algorithms. Each of those algorithms is a hero who fights alone. Ensemble learning forms these heroes into a team, achieving the effect of the proverb "three cobblers with their wits combined equal Zhuge Liang".

This article introduces the two main ideas of ensemble learning: bagging and boosting.

## What is ensemble learning?

Ensemble learning is part of machine learning. It is a "training idea" rather than a specific method or algorithm.

In everyday life, everyone knows that "there is strength in numbers" and that "three cobblers with their wits combined equal Zhuge Liang". The core idea of ensemble learning is exactly this strength in numbers: instead of creating new algorithms, it combines existing algorithms to achieve better results.

Ensemble learning picks a number of simple base models and assembles them. There are two main ideas for assembling these base models:

1. Bagging (short for bootstrap aggregating)
2. Boosting

## Bagging

The core idea of Bagging is democracy.

In Bagging, all base models are treated equally: each base model has exactly one vote, and the final result is decided by a democratic vote.

In most cases, the result obtained by bagging has a smaller variance.
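One way to see where the variance reduction comes from is the standard formula for the variance of an average. The notation is introduced here for illustration and is not from the original text: assume $k$ base models $f_1, \dots, f_k$, each with variance $\sigma^2$ and pairwise correlation $\rho$.

```latex
\operatorname{Var}\!\left(\frac{1}{k}\sum_{i=1}^{k} f_i(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{k}\,\sigma^2
```

Under these assumptions the second term shrinks as $k$ grows, so averaging helps most when the base models are only weakly correlated. The first term does not shrink, which is one reason the benefit levels off.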

Specific process:

1. Draw the training sets from the original sample set. In each round, n training samples are drawn from the original sample set by bootstrapping, that is, sampling with replacement (so some samples may appear several times in a training set, while others may never be drawn). A total of k rounds are performed to obtain k training sets, which are drawn independently of each other.
2. Train one model on each training set, so the k training sets yield k models. (Note: no specific classification or regression algorithm is prescribed here. Depending on the problem, different methods can be used, such as decision trees or perceptrons.)
3. For a classification problem, the k models from the previous step vote on the class; for a regression problem, the mean of the models' outputs is taken as the final result. (All models are of equal importance.)
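The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the one-feature threshold classifier (`train_stump`), the toy data, and all function names are assumptions made for this example, and any other base model could take its place.

```python
import random
from collections import Counter

def predict_stump(stump, x):
    """Threshold rule on one feature; polarity decides which side is class 1."""
    threshold, polarity = stump
    above = x > threshold
    return int(above) if polarity == 1 else int(not above)

def train_stump(sample):
    """Fit the threshold rule with the fewest errors on `sample` (list of (x, y))."""
    xs = sorted({x for x, _ in sample})
    # Candidate thresholds: one below all points, plus midpoints between neighbours.
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best, best_errors = None, None
    for threshold in candidates:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            errors = sum(1 for x, y in sample if predict_stump(stump, x) != y)
            if best_errors is None or errors < best_errors:
                best, best_errors = stump, errors
    return best

def bootstrap_sample(data, rng):
    """Step 1: draw len(data) samples with replacement."""
    return [rng.choice(data) for _ in data]

def bagging_fit(data, k, seed=0):
    """Step 2: train one model per bootstrap training set."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(k)]

def bagging_predict(models, x):
    """Step 3 (classification): every model has one vote, majority wins."""
    votes = Counter(predict_stump(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Toy data: class 0 for small x, class 1 for large x.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
models = bagging_fit(data, k=25)
print(bagging_predict(models, 1.0), bagging_predict(models, 4.0))
```

For a regression problem, step 3 would average the models' numeric outputs instead of counting votes.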

Example:

Among bagging methods, the best known is the random forest: bagging + decision tree = random forest

## Boosting

The core idea of Boosting is to pick the elite.

The most essential difference between Boosting and Bagging is that Boosting does not treat the base models equally. Instead, it selects the "elite" through repeated testing and screening, gives the elite more voting power and the poorly performing base models less, and then combines everyone's votes to get the final result.

In most cases, the result obtained by boosting has a smaller bias.

Specific process:

1. The base models are combined linearly through an additive model.
2. Each round of training increases the weight of base models with a low error rate and decreases the weight of base models with a high error rate.
3. Each round changes the weights (or probability distribution) of the training data: the weights of samples misclassified by the weak classifier in the previous round are increased, and the weights of samples classified correctly are decreased, so that the next classifier does better on the misclassified data.
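One well-known way to carry out these steps is AdaBoost. The sketch below assumes labels in {-1, +1}, a single numeric feature, and threshold rules ("stumps") as weak classifiers. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math

def stump_predict(stump, x):
    """Threshold rule on one feature; returns -1 or +1."""
    threshold, polarity = stump
    return polarity if x > threshold else -polarity

def best_stump(xs, ys, weights):
    """Pick the stump with the lowest weighted error on the training data."""
    points = sorted(set(xs))
    thresholds = [points[0] - 1.0] + [(a + b) / 2 for a, b in zip(points, points[1:])]
    best, best_err = None, float("inf")
    for threshold in thresholds:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best, best_err = stump, err
    return best, best_err

def adaboost_fit(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n          # start with uniform sample weights
    ensemble = []                    # list of (alpha, stump) pairs
    for _ in range(rounds):
        stump, err = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        # Step 2: a lower weighted error gives the model a larger vote.
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Step 3: raise weights of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Step 1: additive model, a weighted sum of the base models' votes."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single threshold rule can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost_fit(xs, ys, rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

On this toy data no single stump fits all ten points, while the weighted vote of three stumps does, which is the "elite gets more voting power" idea in miniature.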

Example:

Among boosting methods, the more mainstream ones are AdaBoost and Gradient Boosting.

## Four differences between Bagging and Boosting

Sample selection:

Bagging: The training sets are drawn from the original set with replacement, and the training sets drawn in different rounds are independent of each other.

Boosting: The training set stays the same in every round; only the weight of each sample in the training set changes. The weights are adjusted according to the classification results of the previous round.

Sample weights:

Bagging: Uses uniform sampling; every sample has the same weight.

Boosting: Continually adjusts the sample weights according to the error rate: the more a sample is misclassified, the greater its weight.

Prediction function:

Bagging: The weights of all prediction functions are equal.

Boosting: Each weak classifier has its own weight, and classifiers with a smaller classification error receive a greater weight.

Parallel computing:

Bagging: The prediction functions can be generated in parallel.

Boosting: The prediction functions can only be generated sequentially, because the parameters of each model depend on the results of the previous round's model.

The differences above are adapted from "Bagging and Boosting concepts and differences".
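The parallel computing difference can be made concrete with a small sketch. Everything here is an assumption for illustration: the toy "model" is just the mean of its training targets, the boosting loop is a gradient-boosting-style residual fit for squared error, and threads are used only to show that the bagging fits are independent of one another, not to claim a speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def fit_mean(sample):
    """Toy 'model': the mean of the targets it was trained on."""
    return sum(sample) / len(sample)

def bagging_parallel(targets, k, seed=0):
    rng = random.Random(seed)
    # All k bootstrap samples can be drawn up front: no model depends on another.
    samples = [[rng.choice(targets) for _ in targets] for _ in range(k)]
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(fit_mean, samples))   # fits may run concurrently
    return sum(models) / k                           # equal-weight average

def boosting_sequential(targets, rounds, learning_rate=0.5):
    prediction = 0.0
    for _ in range(rounds):
        # Each round needs the residuals left by all earlier rounds,
        # so it cannot start before the previous round has finished.
        residuals = [t - prediction for t in targets]
        prediction += learning_rate * fit_mean(residuals)
    return prediction

targets = [2.0, 4.0, 6.0, 8.0]
print(bagging_parallel(targets, k=20))
print(boosting_sequential(targets, rounds=10))
```

In the bagging function the loop body has no data dependency between models, so it maps directly onto a worker pool. In the boosting function the dependency on `prediction` forces the rounds to run one after another.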

## Baidu Encyclopedia and Wikipedia

Baidu Encyclopedia version

Ensemble learning is a machine learning method that trains a series of learners and uses certain rules to combine their results, obtaining a better learning effect than a single learner. In general, the multiple learners in ensemble learning are homogeneous "weak learners".