from numpy import e, log, reshape, transpose, zeros

def sigmoid(X):
    '''Compute the sigmoid function'''
    den = 1.0 + e ** (-1.0 * X)
    d = 1.0 / den
    return d

def compute_cost(theta, X, y):
    '''Compute the cost given predicted and actual values'''
    # y is expected to be a column vector of shape (m, 1)
    m = X.shape[0]  # number of training examples
    theta = reshape(theta, (len(theta), 1))
    #y = reshape(y,(len(y),1))
    h = sigmoid(X.dot(theta))
    J = (1. / m) * (-transpose(y).dot(log(h)) - transpose(1 - y).dot(log(1 - h)))
    grad = transpose((1. / m) * transpose(h - y).dot(X))
    # optimize.fmin expects a single value, so cannot return grad
    return J[0][0]

def compute_grad(theta, X, y):
    '''Compute the gradient of the cost with respect to theta'''
    m = X.shape[0]  # number of training examples
    theta.shape = (1, 3)
    grad = zeros(3)
    h = sigmoid(X.dot(theta.T))
    delta = h - y
    l = grad.size
    for i in range(l):
        sumdelta = delta.T.dot(X[:, i])
        # partial derivative of J with respect to theta[i]
        grad[i] = (1.0 / m) * sumdelta[0]
    theta.shape = (3,)
    return grad
Dec 13, 2013
Hi @waylonflinn, I solved the problem by updating the cost function.
Oct 29, 2016
why does the gradient have to be scaled by
Nov 7, 2016
@cipri-tom scaled by y.size to calculate the average, since it is summed over all the training data, I think
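To make the averaging point concrete, here is a minimal sketch. It assumes a design matrix X of shape (m, n) and a 1-D label vector y; the names summed_grad and mean_grad and the random data are made up for illustration and are not from the gist. Because the cost is a mean over the m examples, its gradient is the summed gradient divided by y.size, which a finite-difference check on the cost should confirm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # mean negative log-likelihood over the m training examples
    h = sigmoid(X.dot(theta))
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def summed_grad(theta, X, y):
    # gradient of the summed loss: grows with the number of examples
    return X.T.dot(sigmoid(X.dot(theta)) - y)

def mean_grad(theta, X, y):
    # gradient of the mean loss: summed gradient divided by y.size
    return summed_grad(theta, X, y) / y.size

# Illustrative random data (an assumption, not the gist's dataset)
rng = np.random.default_rng(0)
X = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 2))])
y = (rng.random(50) < 0.5).astype(float)
theta = np.array([0.1, -0.2, 0.3])

# Central finite differences of the mean cost, one coordinate at a time
eps = 1e-6
basis = np.eye(theta.size)
numeric = np.array([
    (cost(theta + eps * basis[i], X, y) - cost(theta - eps * basis[i], X, y)) / (2 * eps)
    for i in range(theta.size)
])

# The numeric gradient should match the averaged gradient, not the summed one
print(np.allclose(numeric, mean_grad(theta, X, y), atol=1e-6))
```

Without the division, the gradient's magnitude would scale with the dataset size, so the same step size or tolerance would behave differently on 50 examples than on 5000.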
Feb 16, 2017
I think your code is not converging to the minimum. When I tested it, it did not converge at all. Check the tutorial code that I'll post below and I think it will make sense. Your full code (not this one) converges to the initial parameter vector, which is (0,0,0). Try using fmin_tnc instead of fmin_bfgs.
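For reference, a minimal sketch of what switching to fmin_tnc could look like. This is not the tutorial code mentioned above; the synthetic data, the true_theta used to generate it, and the function names cost and grad are assumptions made for illustration. scipy.optimize.fmin_tnc takes the cost function, the starting point, the gradient via fprime, and extra arguments via args, and returns the solution along with the number of function evaluations and a return code.

```python
import numpy as np
from scipy.optimize import fmin_tnc

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # mean negative log-likelihood; the small constant guards against log(0)
    h = sigmoid(X.dot(theta))
    tiny = 1e-12
    return -np.mean(y * np.log(h + tiny) + (1 - y) * np.log(1 - h + tiny))

def grad(theta, X, y):
    # gradient of the mean cost with respect to theta
    return X.T.dot(sigmoid(X.dot(theta)) - y) / y.size

# Synthetic data with noisy labels (an assumption, not the gist's dataset)
rng = np.random.default_rng(0)
m = 200
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, 2))])
true_theta = np.array([0.5, 2.0, -1.0])
y = (rng.random(m) < sigmoid(X.dot(true_theta))).astype(float)

# Start from (0, 0, 0), the initial vector mentioned in the comment
initial_theta = np.zeros(3)
theta_opt, n_evals, rc = fmin_tnc(cost, initial_theta, fprime=grad,
                                  args=(X, y), disp=0)

print("theta:", theta_opt)
print("cost before:", cost(initial_theta, X, y))
print("cost after: ", cost(theta_opt, X, y))
```

If the optimizer returns the starting vector unchanged, a sign error in the gradient is a likely cause, since the first step then points uphill.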
Btw, excellent work.
Here are some slightly simplified versions. I modified grad to be slightly more vectorized. I also took out the negatives in the cost function and gradient.
Here's how I ran them:
Some initial values of theta cause it to fail to converge. Just run it again.
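As an illustration of the vectorization mentioned above, here is a rough sketch, not the commenter's actual code. It assumes X has shape (m, n) and y is a 1-D array of 0/1 labels; grad_loop and grad_vectorized are hypothetical names. The per-parameter loop and the single matrix product should give the same gradient.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_loop(theta, X, y):
    # one partial derivative per loop iteration
    m = y.size
    delta = sigmoid(X.dot(theta)) - y
    out = np.zeros(theta.size)
    for i in range(theta.size):
        out[i] = delta.dot(X[:, i]) / m
    return out

def grad_vectorized(theta, X, y):
    # all partial derivatives at once: X^T (h - y) / m
    m = y.size
    return X.T.dot(sigmoid(X.dot(theta)) - y) / m

# Illustrative random data (an assumption, not the gist's dataset)
rng = np.random.default_rng(1)
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])
y = (rng.random(20) < 0.5).astype(float)
theta = rng.normal(size=3)

print(np.allclose(grad_loop(theta, X, y), grad_vectorized(theta, X, y)))
```

The vectorized form also drops the hard-coded parameter count, so it works for any number of features.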