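Gradient descent
Basically: finding the optimal weight.
How do we find the weight that minimizes the loss?
--> Look for the lowest point on the loss graph
--> At that point the gradient is zero
--> Get there by stepping: move the weight a small amount against the gradient, then repeat
Batch vs. epoch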
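A rough sketch of the stepping idea, and of where batches and epochs show up in the loop. Everything here is an assumption for illustration and not from the notes: the model y = w * x, the mean squared error loss, the made-up data, and the names `gradient`, `train`, `learning_rate`, and `batch_size`. It uses the usual meanings: a batch is the slice of data used for one step, and an epoch is one full pass over all the data.

```python
def gradient(w, xs, ys):
    """Gradient of mean squared error for the model y = w * x, with respect to w."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def train(xs, ys, learning_rate=0.01, epochs=200, batch_size=2):
    w = 0.0    # arbitrary starting weight
    steps = 0  # counts weight updates
    for _ in range(epochs):  # one epoch = one full pass over the data
        for start in range(0, len(xs), batch_size):
            # one batch = the slice of data used for a single step
            batch_x = xs[start:start + batch_size]
            batch_y = ys[start:start + batch_size]
            # step against the gradient, scaled by the learning rate
            w -= learning_rate * gradient(w, batch_x, batch_y)
            steps += 1
    return w, steps


# Made-up data generated with a true weight of 3 (hypothetical example values).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w, steps = train(xs, ys)
print(w, steps)
```

With 4 data points and a batch size of 2, each epoch takes 2 steps, so 200 epochs means 400 steps. The weight should end up very close to 3, which is where the gradient is zero for this data.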