@sagarmainkar
Created August 25, 2018 04:40
@JayRenData

I've output 'theta_history', 'cost_history', and 'x', 'y', but I cannot get the same 'cost' as the output of the program above.

@betidav

betidav commented Dec 7, 2018

@JayRenData
I think the reason you can't get the same values is the integer division in the code.
If you replace m = len(y) with m = float(len(y)) on line 8,
and do the same on line 9, then everything should work fine.
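A minimal sketch of that fix (cal_cost is the function name used in this gist; the body below is my reconstruction, and the cast only matters under Python 2, where integer division would truncate 1/(2*m) to zero):

import numpy as np

def cal_cost(theta, X, y):
    # Cast to float so 1/(2*m) is not truncated to zero by Python 2
    # integer division (under Python 3 the cast is harmless).
    m = float(len(y))
    predictions = X.dot(theta)
    return 1/(2*m) * np.sum(np.square(predictions - y))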

@Rouxkein

I need your help: f(x) = x*x + 2x + 5.
Generate some data with theta0=5, theta1=2, theta3=1.
I don't know how to do it!

@sagarmainkar
Author

I need your help: f(x) = x*x + 2x + 5.
Generate some data with theta0=5, theta1=2, theta3=1.
I don't know how to do it!

It is quite simple; refer to this in the code above:
X = 2 * np.random.rand(100,1)
y = 4 + 3 * X + np.random.randn(100,1)

So you can do:
X = any randomly generated numbers (np.random.rand(100,1))
y = 5 + X*X + 2*X

There is no need for theta3.
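A minimal sketch of that data generation (variable names below are mine; the noise term is optional and mirrors the gist's example):

import numpy as np

# 100 random inputs in [0, 2), as in the gist's example
X = 2 * np.random.rand(100, 1)

# f(x) = x*x + 2x + 5, with a little Gaussian noise added
y = 5 + 2 * X + X**2 + np.random.randn(100, 1)

# To reuse the linear gradient descent above, treat x and x^2 as two
# separate feature columns (X_features is a hypothetical name):
X_features = np.c_[X, X**2]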

@bhargava1996

The result does not match the initial values in the mini-batch gradient descent.
You used 4, 3, but the values that come out are 3.6, 3 for you, so I think that is the best local minimum your gradient descent arrived at.

However, I arrived at 4.0 and 2.9 with your code (pasted and tested) and with my own code as well.

Despite these things, I learned many things here.
Thank you very much.

@IvonneEO

IvonneEO commented Mar 2, 2021

I need your help: f(x,y) = x*x + y*y

@sivi299

sivi299 commented Aug 3, 2021

Hi,
There seems to be a flaw in the cost function

cost = (1/2*m) * np.sum(np.square(predictions-y))

Shouldn't it be

cost = 1/(2*m) * np.sum(np.square(predictions-y))

Nice walkthrough
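A quick numerical check of the difference (the values below are illustrative, not taken from the gist):

import numpy as np

m = 100
predictions = np.ones((m, 1))
y = np.zeros((m, 1))
sse = np.sum(np.square(predictions - y))   # 100.0

print((1/2*m) * sse)   # 5000.0 -- multiplies by m/2
print(1/(2*m) * sse)   # 0.5    -- divides by 2m, as intended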

@kenwyee

kenwyee commented Aug 4, 2021

I was just about to make the same observation as sivi299 regarding the cost function.
In this case, since m is fixed from iteration to iteration when doing the gradient descent, I don't think it matters when it comes to optimizing the theta variable. As written, it's proportional to the mean-squared error, but it should optimize towards the same theta all the same.

The relative magnitudes of the cost function history curves differ between the gradient_descent and minibatch_gradient_descent due to different batch sizes when the cal_cost function is called, but since each algorithm uses the same number of points from iteration to iteration internally it should be OK.
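A small sketch of that point with a toy reproduction (this is not the gist's exact gradient_descent function): the update below uses the gradient directly and never calls cal_cost, and the two cost scalings differ only by the constant factor m**2, so the fitted theta is the same either way.

import numpy as np

# Toy data in the spirit of the gist: y = 4 + 3x + noise
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]   # prepend a bias column
m = len(y)

theta = np.random.randn(2, 1)
lr = 0.05
for _ in range(1000):
    gradient = (1/m) * X_b.T.dot(X_b.dot(theta) - y)
    theta -= lr * gradient

sse = np.sum(np.square(X_b.dot(theta) - y))
print(theta.ravel())                         # roughly [4, 3]
print(((1/2*m) * sse) / ((1/(2*m)) * sse))   # m**2 = 10000.0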

@paocarvajal1912

Hi, this is fantastic material; thanks so much.
I think there is a typo in equation (8). Shouldn't the subscript on X be j, i.e., X_j instead of X_0?
Regards,
