### Gradient Descent

If you are familiar with calculus, you might know that a minimum or maximum of a function is found by setting its derivative to zero. In other words, a point where the slope (m) is zero is a candidate for the minimum or maximum of that function.
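As a quick worked example (the function is chosen purely for illustration and is not from the original post), take $f(x) = (x - 3)^2$:

$$f'(x) = 2(x - 3) = 0 \implies x = 3$$

Since $f''(x) = 2 > 0$, the point $x = 3$ is a minimum rather than a maximum.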

Gradient descent is one approach to finding the minimum value of the cost function. The algorithm updates the θ values step by step, moving in the direction of the negative slope, and stops when the slope reaches zero.

The above image from hackernoon explains this concept well. (Note that the parameter is written as w there, not θ.)

The gradient descent update rule is given by:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$$

α - Learning rate (Length of each step)

J(θ₀,θ₁) - Cost function

j - Index of the parameter being updated (here j = 0, 1); both parameters are updated simultaneously
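To make the update rule concrete, here is a minimal sketch in plain Python. It assumes a straight-line hypothesis h(x) = θ₀ + θ₁x and a squared-error cost J(θ₀, θ₁) = (1 / 2m) Σ (h(xᵢ) − yᵢ)². The function name `gradient_descent`, the toy data, and the default values of `alpha` and `iterations` are all illustrative choices, not something the text above prescribes.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=1000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent.

    Assumed cost: J(theta0, theta1) = (1 / (2m)) * sum((h(x_i) - y_i) ** 2)
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # arbitrary starting point
    for _ in range(iterations):
        # Prediction error for every training example
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients use the old theta values
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (hypothetical, for illustration only)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)  # should approach 1 and 2
```

Computing both gradients before assigning either parameter is what keeps the update simultaneous; updating θ₀ first and then reusing it for θ₁ would be a subtly different algorithm.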

The value of α plays a key role in how gradient descent behaves. If α is too small, convergence is slow because each step is tiny and many iterations are needed. On the other hand, if α is too large, a step may jump past the minimum (a situation called overshooting), and the algorithm may even fail to converge.
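The effect of α can be seen with a small experiment. The sketch below assumes a toy cost J(θ) = θ², whose derivative is 2θ and whose minimum is at θ = 0; the helper `run` and the three α values are hypothetical and picked only to show the three regimes.

```python
def run(alpha, steps=20, theta=1.0):
    """Apply `steps` gradient descent updates to J(theta) = theta ** 2."""
    for _ in range(steps):
        gradient = 2 * theta              # dJ/dtheta
        theta = theta - alpha * gradient  # update rule
    return theta


small = run(alpha=0.01)  # tiny steps: still far from 0 after 20 updates
good = run(alpha=0.3)    # reasonable steps: very close to 0
large = run(alpha=1.1)   # overshoots on every step and moves away from 0

print(small, good, large)
```

With this cost, each update multiplies θ by (1 − 2α), so any α above 1 makes the magnitude of θ grow instead of shrink.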

The screenshot from Andrew Ng's course illustrates both issues.

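In the equation

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$$

the derivative term (the slope) shrinks as θ approaches the minimum and reaches zero there, i.e.

$$\theta_j := \theta_j - \alpha \cdot 0 \implies \theta_j := \theta_j$$

so θj no longer changes, and the cost function is at its minimum value.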
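
In floating-point arithmetic the slope rarely becomes exactly zero, so implementations usually stop once it falls below a small tolerance. The sketch below assumes a one-parameter cost J(θ) = (θ − 3)²; the names `minimise`, `tol`, and `max_iters` are hypothetical and used only for illustration.

```python
def minimise(grad, theta, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Run gradient descent until the slope is (almost) zero."""
    for i in range(max_iters):
        g = grad(theta)
        if abs(g) < tol:
            # Slope is effectively zero: further updates would not move theta
            return theta, i
        theta = theta - alpha * g
    return theta, max_iters


# J(theta) = (theta - 3) ** 2 has derivative 2 * (theta - 3)
theta_min, iters = minimise(lambda t: 2 * (t - 3), theta=0.0)
print(theta_min, iters)  # theta_min should be very close to 3
```

The `max_iters` cap is a safety net for cases where α is too large and the slope never gets small.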