function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
theta_len = length(theta);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    theta -= (alpha/m) * (X' * (X*theta - y)); % "-=" is Octave-only; in MATLAB write theta = theta - ...

    % Equivalent explicit-loop version:
    % temp_theta = theta;
    % for j = 1:theta_len
    %     value = 0;
    %     for i = 1:m
    %         value += (X(i,:) * theta - y(i,:)) * X(i,j);
    %     end
    %     temp_theta(j,:) = temp_theta(j,:) - ((alpha/m)*value);
    % end
    % theta = temp_theta;
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end

end
I would like to know the same thing: why is there a multiplication by X', and why is there no explicit summation over the training examples, in this one-line gradient descent update?
It's a really elegant solution. The reason X' (X transpose) is used is that it effectively creates the implicit summation through matrix multiplication. Then "-=" lets you subtract from the 2x1 theta matrix in one step.
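To see the implicit summation concretely, here is a quick NumPy check (sizes and values are illustrative, not from the exercise data) that X' * (X*theta - y) equals the explicit double loop from the commented-out code in the gist:

```python
import numpy as np

# Tiny random problem: 5 training examples, 2 parameters.
rng = np.random.default_rng(0)
m, n = 5, 2
X = rng.normal(size=(m, n))
y = rng.normal(size=(m, 1))
theta = rng.normal(size=(n, 1))

resid = X @ theta - y           # m x 1 residual vector (X*theta - y)

# Vectorized form: X transpose times the residuals sums over examples implicitly.
grad_vec = X.T @ resid

# Explicit loops, mirroring the commented-out code in the gist.
grad_loop = np.zeros((n, 1))
for j in range(n):
    for i in range(m):
        grad_loop[j, 0] += resid[i, 0] * X[i, j]

# Both forms compute the same gradient vector.
assert np.allclose(grad_vec, grad_loop)
```

Each row of X' pairs one parameter's feature values with all m residuals, so the dot product performs the sum over i that the inner loop writes out explicitly.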
This is better.
why is alpha/m in line 29?
(alpha/m)*value
Does this work for SSE too?
why is alpha/m in line 29?
The lecture "Gradient Descent For Linear Regression" explains it fairly well.
Nice trick with X(i,j): it seems to multiply by 1 for theta_0 and by X(i) for the rest of them. I will still need to scratch my head over this derivative for a while.
Re ': it seems to open a quotation in the syntax highlighting, probably because ' also marks strings, or something along those lines.
According to the MATLAB documentation, the plain transpose is written with a dot before ', i.e. A.' rather than A' (https://uk.mathworks.com/help/matlab/ref/transpose.html). A' is actually the complex conjugate transpose (ctranspose); for real-valued matrices like X here the two are equivalent, which is why A' works in this exercise.
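A quick NumPy analogue of the two MATLAB operators (the matrices below are made up for illustration): `A.T` behaves like MATLAB's `A.'`, and `A.conj().T` behaves like `A'`.

```python
import numpy as np

# A complex matrix, where the two transposes differ.
A = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 4 + 0j]])

plain = A.T            # like MATLAB A.' (plain transpose)
conj = A.conj().T      # like MATLAB A'  (complex conjugate transpose)

# They differ whenever the matrix has nonzero imaginary parts...
assert not np.array_equal(plain, conj)

# ...but agree for real-valued matrices such as the design matrix X here.
B = np.array([[1.0, 2.0], [3.0, 4.0]])
assert np.array_equal(B.T, B.conj().T)
```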
dJ = (X' * (X*theta-y))/m;
theta = theta - alpha * dJ;
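The two-line dJ/theta update above can be sanity-checked in NumPy on a tiny synthetic problem (the data, alpha, and iteration count below are illustrative choices, not from the exercise): run the loop and compare against the closed-form least-squares solution.

```python
import numpy as np

# Synthetic linear data with an intercept column of ones, no noise.
rng = np.random.default_rng(1)
m = 50
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, 1))])
true_theta = np.array([[2.0], [3.0]])
y = X @ true_theta

theta = np.zeros((2, 1))
alpha, num_iters = 0.1, 500
for _ in range(num_iters):
    dJ = X.T @ (X @ theta - y) / m   # gradient of the cost, as in the comment above
    theta = theta - alpha * dJ       # one gradient step

# Gradient descent should land very close to the least-squares solution.
theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(theta, theta_exact, atol=1e-3)
```

Splitting the update into a named gradient `dJ` and a step also makes it easy to print the gradient norm while debugging, as the exercise hint suggests.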
Isn't it better to remove num_iter from the function arguments and compute it inside the function as:
num_iter = size(X,2)
Or do I misunderstand something?
@adriansaati, num_iter is given for the problem.
Good resource for answering this question: https://www.coursera.org/learn/machine-learning/resources/QQx8l.
You may also use the following code:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    theta = theta - ((((X*theta)-y)'*X)'*(alpha/m));
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end

end
theta = theta - alpha*(1/m)*(X'*(X*theta - y));
I am new to gradient descent. In the calculation of theta, why is there a multiplication by X'? That is, instead of using the sum function.
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    theta = theta - ((((X'.*theta)-y)'*X)'*(alpha/m));
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end

end
It's giving me an error: Unable to perform assignment because the left and right sides have a different number of elements.
Error in gradientDescent (C:\Users\yashk\Downloads\ML\ex1-ex8-matlab\ex1-ex8-matlab\ex1\gradientDescent.m, line 26)
J_history(iter) = computeCost(X, y, theta);
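The problem is most likely the `X'.*theta` term: it takes the elementwise product of an (n x m) matrix with an (n x 1) vector, which is not the (m x 1) prediction vector that `X*theta` produces. A NumPy sketch of the shape mismatch (the sizes below are illustrative, matching the m = 97 examples and n = 2 parameters of ex1):

```python
import numpy as np

m, n = 97, 2
X = np.ones((m, n))        # design matrix
theta = np.zeros((n, 1))   # parameter vector
y = np.ones((m, 1))        # targets

# Like MATLAB X'.*theta: elementwise product broadcasts to shape (n, m)...
wrong = X.T * theta
assert wrong.shape == (n, m)   # not (m, 1), so it cannot line up with y

# ...whereas the intended prediction is the matrix product X*theta.
right = X @ theta
assert right.shape == (m, 1)   # same shape as y, as the update needs
```

Replacing `X'.*theta` with `X*theta` gives the same working update as the other solutions in this thread.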
When I try to implement my version, my theta ends up as 0; has anyone else had that issue?