Created August 25, 2018 04:40
Hi, this is fantastic material; thanks so much.
I think there is a typo in equation (8): shouldn't the subscript on X be j? That is, X_j instead of X_0.
Regards,
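Assuming equation (8) is the usual partial derivative of the linear-regression MSE cost (I don't have the gist's exact notation in front of me, so take the symbols as illustrative), the corrected form would be:

$$\frac{\partial}{\partial \theta_j} J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)}$$

i.e. each residual is multiplied by $x_j^{(i)}$, the feature matching the parameter being differentiated, not by $x_0^{(i)}$ (which would only be correct for the $j = 0$ bias term).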
I was just about to make the same observation as sivi299 regarding the cost function.
In this case, since m is fixed from iteration to iteration during gradient descent, I don't think it affects the optimization of theta. As written, the cost is proportional to the mean squared error, so it is minimized at the same theta either way.
The relative magnitudes of the cost-history curves differ between gradient_descent and minibatch_gradient_descent because cal_cost is called with different batch sizes, but since each algorithm uses the same number of points from iteration to iteration internally, it should be OK.
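To sketch why a constant factor on the cost doesn't change the minimizer: scaling the cost by c > 0 scales its gradient by c, which is equivalent to rescaling the learning rate, so gradient descent walks to the same theta. The setup below is illustrative (the names X, y, grad, and run are mine, not from the gist), assuming a cost of the form scale * sum((X @ theta - y)**2):

```python
import numpy as np

# Hypothetical small linear-regression problem: bias column plus one feature.
rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 1))]
y = X @ np.array([4.0, 3.0]) + 0.1 * rng.normal(size=100)
m = len(y)

def grad(theta, scale):
    # Gradient of scale * sum((X @ theta - y)**2).
    # scale = 1/(2m) gives the textbook cost; scale = 1/m is proportional to the MSE.
    return 2 * scale * X.T @ (X @ theta - y)

def run(scale, lr, iters=2000):
    theta = np.zeros(2)
    for _ in range(iters):
        theta -= lr * grad(theta, scale)
    return theta

# Doubling the cost's scale while halving the learning rate reproduces the
# exact same iterates, so both runs converge to the same theta.
t1 = run(scale=1 / (2 * m), lr=0.1)
t2 = run(scale=1 / m, lr=0.05)
print(np.allclose(t1, t2))
```

Both runs land on the least-squares solution, which is why the cost's overall scale only matters for how the cost-history curves look, not for where theta ends up.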