What is a support vector machine?

Support vector machines (SVMs) are probably among the most popular and most widely discussed machine learning algorithms.

A hyperplane is a boundary that divides the input variable space; in two dimensions it is a line. In an SVM, the hyperplane is chosen to best separate the points in the input variable space by their class (class 0 or class 1). In 2D, you can picture it as a line and assume that all of the input points can be completely separated by this line. The SVM learning algorithm finds the coefficients that make the hyperplane separate the classes as well as possible.
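As a minimal sketch of what "coefficients of a hyperplane" means in 2D, the snippet below classifies points by which side of a line they fall on. The coefficient values `W` and `B` and the function names are assumptions made up for illustration; they are not learned from data.

```python
# Hypothetical coefficients for a 2D hyperplane (a line): w1*x1 + w2*x2 + b = 0.
# These values are chosen for illustration only, not learned from data.
W = (1.0, -1.0)
B = 0.0

def decision_value(point, w=W, b=B):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b

def predict(point, w=W, b=B):
    """Map the side of the hyperplane to a class label (1 or 0)."""
    return 1 if decision_value(point, w, b) >= 0 else 0

points = [(3.0, 1.0), (1.0, 3.0), (2.0, 0.5)]
labels = [predict(p) for p in points]
print(labels)
```

Training an SVM amounts to choosing `W` and `B` so that this rule separates the training classes with the widest possible gap.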

Support Vector Machines

The distance between the hyperplane and the nearest data point is called the margin. The best, or optimal, hyperplane that separates the two classes is the one with the largest margin. Only the points closest to the hyperplane matter for defining it and for constructing the classifier. These points are called support vectors: they support, or define, the hyperplane. In practice, an optimization algorithm is used to find the coefficient values that maximize the margin.
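The definitions above can be sketched in a few lines: for a given hyperplane, compute each point's distance to it, take the smallest distance as the margin, and treat the points that attain it as the support vectors. The toy points, the hyperplane coefficients, and the function names below are assumptions for illustration; the hyperplane is supplied by hand rather than found by an optimizer.

```python
import math

def distance_to_hyperplane(point, w, b):
    """Perpendicular distance from a point to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm

def margin_and_support_vectors(points, w, b, tol=1e-9):
    """Return the margin (smallest distance) and the points that attain it."""
    distances = [distance_to_hyperplane(p, w, b) for p in points]
    margin = min(distances)
    support = [p for p, d in zip(points, distances) if d - margin <= tol]
    return margin, support

# Hypothetical toy data and hyperplane x1 + x2 - 4 = 0 (assumed, not learned here).
points = [(1.0, 1.0), (0.0, 1.0), (3.0, 3.0), (4.0, 5.0)]
w, b = (1.0, 1.0), -4.0
margin, support = margin_and_support_vectors(points, w, b)
print(margin, support)
```

Moving or removing any point other than the two returned in `support` (without bringing it closer than the margin) would leave this margin unchanged, which is why only the support vectors define the classifier.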

The SVM is probably one of the most powerful out-of-the-box classifiers, and it is worth trying on your data set.


The basic concepts of support vector machines can be explained with a simple example. Let's imagine two categories: red and blue. Our data has two features: x and y. We want a classifier that, given a pair of (x, y) coordinates, outputs either red or blue. The labeled training data is plotted in the following figure:

The support vector machine takes these data points and outputs the hyperplane (in a two-dimensional graph, a line) that separates the two categories. This line is the decision boundary: everything on one side is classified as red, and everything on the other side as blue.
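To make "takes data points and outputs a hyperplane" concrete, here is a rough sketch of one possible way to fit a linear SVM: full-batch subgradient descent on the regularized hinge loss. This is only one of several training methods and is not necessarily the one the text has in mind. The red and blue points, the encoding red = -1 and blue = +1, and the hyperparameters `lam`, `lr`, and `epochs` are all assumptions chosen for illustration.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=10000):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by subgradient descent.

    Labels must be -1 or +1. Returns the hyperplane coefficients (w, b).
    """
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularization term
        grad_b = 0.0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:  # point is misclassified or inside the margin
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def classify(point, w, b):
    """Return the category for the side of the hyperplane the point falls on."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    return "blue" if score >= 0 else "red"

# Hypothetical, linearly separable training data.
red = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
blue = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
points = red + blue
labels = [-1] * len(red) + [1] * len(blue)

w, b = train_linear_svm(points, labels)
predictions = [classify(p, w, b) for p in points]
print(w, b, predictions)
```

In practice a library solver would normally be used instead of a hand-written loop; the sketch is only meant to show that the output of training is a set of coefficients defining the decision boundary.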

But what is the best hyperplane? For the SVM, it is the one that maximizes the margin to both categories. In other words, it is the hyperplane (in this case, a line) whose distance to the nearest element of each category is the largest.
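This idea is commonly written as an optimization problem. The formulation below is the standard hard-margin form, assuming the data is linearly separable and the two categories are encoded as labels $y_i \in \{-1, +1\}$ (for example red = -1 and blue = +1, an encoding not given in the text), with $w$ and $b$ the hyperplane coefficients and $x_i$ the training points:

```latex
\[
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i\,(w \cdot x_i + b) \ge 1, \qquad i = 1, \dots, n
\]
```

Under this scaling the nearest points satisfy $y_i\,(w \cdot x_i + b) = 1$, so their distance to the hyperplane is $1 / \lVert w \rVert$. Minimizing $\lVert w \rVert$ is therefore the same as maximizing the margin.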

There is a video here (video address) that explains how the best hyperplane is found.


Advantages and disadvantages of SVM


Advantages

  • Can handle high-dimensional problems, that is, large feature spaces;
  • Works well for machine learning with small sample sizes;
  • Can handle interactions between nonlinear features;
  • Has no local minimum problem (unlike algorithms such as neural networks);
  • Does not need to rely on the entire data set, only on the support vectors;
  • Has relatively strong generalization ability.


Disadvantages

  • Training is not very efficient when there are many observation samples;
  • There is no universal solution for nonlinear problems, and it is sometimes difficult to find a suitable kernel function;
  • The high-dimensional mappings produced by kernel functions are hard to interpret, especially for radial basis functions;
  • The standard SVM only supports binary classification;
  • Sensitive to missing data.


Baidu Encyclopedia version

The support vector machine (Support Vector Machine, SVM) was first proposed by Corinna Cortes and Vapnik in 1995. It shows many unique advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems, and can be extended to other machine learning problems such as function fitting.

In machine learning, support vector machines (SVMs, also called support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, and are used for classification and regression analysis.



Wikipedia version

In machine learning, a support vector machine (SVM) is a supervised learning model with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training examples, each labeled as belonging to one of two categories, the SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. An SVM model represents the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall on.

In addition to performing linear classification, SVMs can efficiently perform nonlinear classification using the so-called kernel trick, which implicitly maps their inputs into high-dimensional feature spaces.
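A small sketch can show what "implicitly mapping" means. For the degree-2 polynomial kernel K(x, z) = (x · z)^2 on 2D inputs, the kernel value equals an ordinary dot product in a 3D feature space, but it is computed without ever building the mapped vectors. The function names below are assumptions for illustration. A radial basis function (RBF) kernel, whose `gamma` parameter is likewise an assumed name, is included for comparison; its corresponding feature space is infinite-dimensional, so only the implicit form is practical.

```python
import math

def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def poly2_kernel(x, z):
    """Degree-2 polynomial kernel: computed directly in the input space."""
    return dot(x, z) ** 2

def poly2_feature_map(x):
    """Explicit 3D feature map whose dot product equals poly2_kernel (2D inputs only)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = (1.0, 2.0), (3.0, 4.0)
implicit = poly2_kernel(x, z)
explicit = dot(poly2_feature_map(x), poly2_feature_map(z))
print(implicit, explicit, rbf_kernel(x, z))
```

Both `implicit` and `explicit` give the same similarity value up to floating-point rounding, which is why a kernel can stand in for the dot product in the SVM without ever constructing the high-dimensional points.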
