Gradient descent, with a 2-point line fit to find where the gradient crosses 0
The gradient descent method iterates
x_new = x_old - rate(t) * grad(x_old)
where rate(t) is a step size or "learning rate" (aka η, the Greek letter eta). GD is a workhorse in machine learning because it is so simple, uses gradients only (not function values), and can handle very big problems.
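As a concrete sketch (not from the source; the function names and the quadratic test problem are illustrative), here is a minimal Python version of the update loop, together with one possible reading of the title's "2-point line fit": fit a line through the last two (x, grad(x)) points and step to where that line crosses 0, which is the secant method applied to the gradient in one dimension.

    def gradient_descent(grad, x0, rate=0.1, steps=100):
        # Plain gradient descent with a constant rate(t) = rate:
        # x_new = x_old - rate * grad(x_old).
        x = x0
        for _ in range(steps):
            x = x - rate * grad(x)
        return x

    def secant_gradient_descent(grad, x_prev, x_old, steps=20):
        # Assumed interpretation of the "2-point line fit": fit a line
        # through (x_prev, grad(x_prev)) and (x_old, grad(x_old)) and
        # jump to where that line crosses 0 (secant method on grad).
        g_prev, g_old = grad(x_prev), grad(x_old)
        for _ in range(steps):
            if g_old == g_prev:   # flat line: no unique zero crossing
                break
            x_new = x_old - g_old * (x_old - x_prev) / (g_old - g_prev)
            x_prev, g_prev = x_old, g_old
            x_old, g_old = x_new, grad(x_new)
        return x_old

    # Test problem: f(x) = (x - 3)^2, so grad(x) = 2 * (x - 3),
    # with the minimum at x = 3.
    grad = lambda x: 2.0 * (x - 3.0)
    print(gradient_descent(grad, x0=0.0))           # ~3.0 after 100 steps
    print(secant_gradient_descent(grad, 0.0, 1.0))  # 3.0, exact here

On a quadratic the fitted line is exact, so the secant step lands on the minimizer in one move, while plain GD with a fixed rate only approaches it geometrically.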