## What is the gradient descent method?

The formula for the gradient descent algorithm is very simple: move "in the opposite direction of the gradient (the steepest slope)". But what does that correspond to in our everyday experience? Why is the negative gradient direction the direction of fastest local decrease? Many readers may still be unclear about this. That is fine: below I explain the mathematical derivation of the gradient descent formula in detail, in plain language.

Let's take climbing to the top of a mountain as an example.

Suppose we are standing partway up a mountain, without a map, and do not know how to reach the top. So we decide to go one step at a time: take a small step in the direction that climbs most steeply from the current position, then take another small step in the steepest direction from the new position. Step by step, we keep going until we believe we have reached the top. Here, the steepest uphill direction at each position is the gradient.
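The climbing procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hill `height`, its peak at (1, 2), the starting point, and the parameters `step_size` and `num_steps` are all made up for the example and do not come from the text above.

```python
def height(x, y):
    """Hypothetical hill with a single peak at (1, 2)."""
    return -(x - 1.0) ** 2 - (y - 2.0) ** 2


def height_gradient(x, y):
    """Analytic gradient of `height`; it points in the steepest uphill direction."""
    return (-2.0 * (x - 1.0), -2.0 * (y - 2.0))


def climb(x, y, step_size=0.1, num_steps=200):
    """Repeatedly take a small step along the gradient (uphill)."""
    for _ in range(num_steps):
        gx, gy = height_gradient(x, y)
        x += step_size * gx
        y += step_size * gy
    return x, y


# Start somewhere on the mountainside and walk uphill.
summit = climb(-3.0, 5.0)
print(summit)
```

Walking downhill instead only requires flipping the sign of the step, which is the descent case the rest of this article is about.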

## Baidu Encyclopedia

Gradient descent is a first-order optimization algorithm, also commonly called the steepest descent method. To find a local minimum of a function with gradient descent, one searches iteratively, moving from the current point by a specified step length in the direction opposite to the gradient (or approximate gradient) at that point. If the search instead moves in the positive direction of the gradient, it approaches a local maximum of the function; this process is called gradient ascent.
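As a rough illustration of this definition, the sketch below runs gradient descent with an approximate gradient obtained from central finite differences. The test function `f`, the helper names, and the values of `eps`, `step_size`, and `num_steps` are assumptions chosen for the example, not part of the quoted definition.

```python
def approximate_gradient(f, x, eps=1e-6):
    """Central finite-difference estimate of the derivative of f at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def gradient_descent(f, x0, step_size=0.1, num_steps=500):
    """Step against the (approximate) gradient a fixed number of times."""
    x = x0
    for _ in range(num_steps):
        x -= step_size * approximate_gradient(f, x)
    return x


def f(x):
    """Hypothetical objective with its minimum at x = 3."""
    return (x - 3.0) ** 2 + 1.0


x_min = gradient_descent(f, x0=0.0)
print(x_min, f(x_min))
```

A fixed step size is the simplest choice; if it is too large the iterates can overshoot and diverge, and if it is too small convergence becomes slow.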

## Wikipedia

Gradient descent is a first-order iterative optimization algorithm for finding the minimum of a function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or approximate gradient) of the function at the current point. If one instead takes steps proportional to the positive of the gradient, one approaches a local maximum of the function; that procedure is known as gradient ascent.
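Written as a formula, "steps proportional to the negative of the gradient" is usually expressed as the update rule below. The symbols are notation assumed here for illustration: $x_k$ is the current point, $f$ the function being minimized, and $\gamma > 0$ the step size (often called the learning rate).

$$
x_{k+1} = x_k - \gamma \, \nabla f(x_k)
$$

Gradient ascent uses the same rule with the sign flipped:

$$
x_{k+1} = x_k + \gamma \, \nabla f(x_k)
$$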

Gradient descent is also known as steepest descent. However, gradient descent should not be confused with the method of steepest descent for approximating integrals.