Machine learning gives computers the ability to learn without being explicitly programmed.
Linear regression is a simple model to find best fit for some data.
Example: You have a data set with commute time, sleep time, salary and happiness (1-10). For the three features - commute time, sleep time and salary - you want to predict happiness level.
How well does this line predict the data? Cost function (essentially sum of deltas between a line at a point and the real point) which we want to minimize. Once function converges we know we have a good fit. How do we find it? Gradient descent.