@denzilc
Created November 1, 2011 21:57
Gradient Descent for the Machine Learning course at Stanford
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
theta_len = length(theta);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    % Vectorized update of all parameters at once.
    % (Octave also accepts "theta -= ...", but MATLAB does not.)
    theta = theta - (alpha/m) * (X' * (X*theta - y));

    % Equivalent loop version, kept for reference:
    % temp_theta = theta;
    % for j = 1:theta_len
    %     value = 0;
    %
    %     for i = 1:m
    %         value = value + (X(i,:) * theta - y(i,:)) * X(i,j);
    %     end
    %
    %     temp_theta(j,:) = temp_theta(j,:) - ((alpha/m)*value);
    % end
    %
    % theta = temp_theta;
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
@arefeh1997

(alpha/m)*value
Does this also work for SSE?

@KrisITB

KrisITB commented Mar 2, 2020

Why is alpha/m in line 29?

The lecture "Gradient Descent For Linear Regression" explains it fairly well.

Nice trick with X(i,j): it seems to me that it multiplies by 1 for theta 0 and by the feature value X(i,j) for the rest of them. I will still need to scratch my head over this derivative for a while.

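Re ': at first it looked to me like it opens a quotation, maybe for marking strings or something along those lines.
According to the MATLAB documentation, the plain transpose is written with a dot before the ', i.e. A.', while A' is the complex conjugate transpose. For a real-valued matrix such as X here, the two give the same result.
(https://uk.mathworks.com/help/matlab/ref/transpose.html)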
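
A small sketch of how the two transpose operators behave, in case it helps. The matrices A and Z are made-up toy values, not data from the exercise:

```matlab
% Real-valued matrix: A' and A.' give the same result.
A = [1 2; 3 4; 5 6];
same_for_real = isequal(A', A.');

% Complex matrix: Z' transposes and conjugates, Z.' only transposes.
Z = [1+2i, 3-1i];
conj_transpose  = Z';    % column vector with conjugated entries
plain_transpose = Z.';   % column vector with the entries unchanged
```

Since the training data in this exercise is real-valued, X' in the update rule should behave exactly like X.'.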

@Dhruv261098

dJ = (X' * (X*theta-y))/m;
theta = theta - alpha * dJ;

@AdrianSKazi

Wouldn't it be better to remove num_iters from the function arguments and compute it inside the function instead:

num_iters = size(X,2)

Do I misunderstand something?

@barrylee111

@adriansaati, num_iter is given for the problem.
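
To make the distinction concrete, here is a rough sketch with a made-up design matrix (the values of X and num_iters below are hypothetical). size(X,2) counts the columns of X, which matches the number of entries in theta, and says nothing about how many gradient steps to take:

```matlab
% Hypothetical design matrix: 5 training examples, intercept column plus one feature.
X = [ones(5, 1), (1:5)'];

n_columns = size(X, 2);   % number of columns of X, i.e. the length of theta
m = size(X, 1);           % number of training examples

% The iteration count is a separate tuning choice made by the caller.
num_iters = 1500;         % hypothetical value, independent of the shape of X
```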

@barrylee111

Good resource for answering this question: https://www.coursera.org/learn/machine-learning/resources/QQx8l.

@vineetsoni24

You may also use the following code:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
%               theta.
%
% Hint: While debugging, it can be useful to print out the values
%       of the cost function (computeCost) and gradient here.
%
theta = theta - ((((X*theta)-y)'*X)'*(alpha/m));

% ============================================================

% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);

end

end

@mahmudtolba

theta= theta - alpha*(1/m)*(X'*(X*theta - y))

@mahmudtolba

I am new to gradient descent. In the calculation for theta, why is there a multiplication by X' instead of using the sum function?
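
One way to see it, sketched under the assumption that the rows of X are the training examples $x^{(i)}$ (as in the loop version of the gist, which uses X(i,:) and X(i,j)): the per-parameter update with an explicit sum is

```latex
\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( x^{(i)} \theta - y^{(i)} \right) X_{ij}
```

The sum over $i$ is exactly the $j$-th entry of a matrix-vector product, because multiplying by $X^{T}$ sums over the rows of $X$:

```latex
\sum_{i=1}^{m} X_{ij} \left( X\theta - y \right)_i = \left( X^{T} (X\theta - y) \right)_j
\quad\Longrightarrow\quad
\theta := \theta - \frac{\alpha}{m} X^{T} (X\theta - y)
```

So X' does the summing for every parameter at once, which is why no explicit call to sum is needed.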

@yashkirti1996

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
%               theta. 
%
% Hint: While debugging, it can be useful to print out the values
%       of the cost function (computeCost) and gradient here.
%

theta = theta - ((((X'.*theta)-y)'*X)'*(alpha/m));


% ============================================================

% Save the cost J in every iteration    
J_history(iter) = computeCost(X, y, theta);

end

end

It's giving me this error: Unable to perform assignment because the left and right sides have a different number of elements.

Error in gradientDescent (line 26)
J_history(iter) = computeCost(X, y, theta);
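
A possible lead, offered as a guess rather than a diagnosis: the update above uses the element-wise product X'.*theta where the other versions in this thread use the matrix product X*theta, and the error message suggests computeCost returned something other than a scalar. A quick shape check on made-up toy data (all values below are hypothetical) shows what the sizes should look like with the matrix product:

```matlab
% Hypothetical toy data: 3 examples, intercept column plus one feature.
X = [1 1; 1 2; 1 3];
y = [1; 2; 3];
theta = zeros(2, 1);
alpha = 0.1;
m = length(y);

h = X * theta;                                   % matrix product: (3x2)*(2x1) gives 3x1
assert(isequal(size(h), size(y)));               % predictions line up with y

theta_new = theta - (alpha/m) * (X' * (h - y));
assert(isequal(size(theta_new), size(theta)));   % theta keeps its 2x1 shape
```

If either assert fails on the real data, or computeCost itself uses element-wise operators in place of matrix products, the cost would no longer be a single number and the assignment into J_history(iter) could fail in the way shown above.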

@AnkitaBh

When I try to implement my version, my theta ends up as 0; has anyone else had that issue?
