
# marcelcaraciolo/log_regression.py

Last active August 7, 2019 18:51
Logistic prediction
```
from numpy import e, log, reshape, transpose, zeros

def sigmoid(X):
    '''Compute the sigmoid function.'''
    den = 1.0 + e ** (-1.0 * X)
    return 1.0 / den


def compute_cost(theta, X, y):
    '''Compute the cost given predicted and actual values.'''
    m = X.shape[0]  # number of training examples
    theta = reshape(theta, (len(theta), 1))

    J = (1. / m) * (-transpose(y).dot(log(sigmoid(X.dot(theta))))
                    - transpose(1 - y).dot(log(1 - sigmoid(X.dot(theta)))))

    # optimize.fmin expects a single value, so the gradient cannot be returned here
    return J[0][0]


def compute_grad(theta, X, y):
    '''Compute the gradient of the cost with respect to theta.'''
    m = X.shape[0]  # number of training examples (was undefined in the original)
    theta.shape = (1, 3)
    grad = zeros(3)

    h = sigmoid(X.dot(theta.T))
    delta = h - y

    for i in range(grad.size):
        sumdelta = delta.T.dot(X[:, i])
        # no extra -1 factor: the cost is already the *negative* log-likelihood
        grad[i] = (1.0 / m) * sumdelta

    theta.shape = (3,)
    return grad
```
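As a quick sanity check (a standalone sketch, not part of the gist): at `theta = 0` the sigmoid outputs 0.5 for every example, so the cost must equal ln 2 ≈ 0.6931 regardless of the data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data: 4 examples, 3 features (intercept column included)
X = np.array([[1., 0., 1.],
              [1., 2., 0.],
              [1., 1., 1.],
              [1., 3., 2.]])
y = np.array([0., 0., 1., 1.])

theta = np.zeros(3)
p = sigmoid(X.dot(theta))  # all entries are 0.5 at theta = 0
J = np.mean(-y * np.log(p) - (1 - y) * np.log(1 - p))
print(J)  # ~0.6931, i.e. ln(2)
```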

### waylonflinn commented Nov 27, 2013

Here are some slightly simplified versions. I made `grad` more vectorized, and I took the negatives out of the cost function and gradient.

```
import numpy

def sigmoid(X):
    return 1 / (1 + numpy.exp(-X))

def cost(theta, X, y):
    p_1 = sigmoid(numpy.dot(X, theta))  # predicted probability of label 1
    log_l = (-y) * numpy.log(p_1) - (1 - y) * numpy.log(1 - p_1)  # log-likelihood vector

    return log_l.mean()

def grad(theta, X, y):
    p_1 = sigmoid(numpy.dot(X, theta))
    error = p_1 - y  # difference between label and prediction

    # gradient: error-weighted average of the features
    return numpy.dot(error, X) / y.size
```
Here's how I ran them:

```
import scipy.optimize as opt

# prefix an extra column of ones to the feature matrix (for the intercept term)
X_1 = numpy.append(numpy.ones((X.shape[0], 1)), X, axis=1)
theta = 0.1 * numpy.random.randn(3)

theta_1 = opt.fmin_bfgs(cost, theta, fprime=grad, args=(X_1, y))
```
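A cheap way to confirm a cost/gradient pair is consistent before handing it to `fmin_bfgs` is a finite-difference check. This is a standalone sketch with made-up toy data; the `cost` and `grad` bodies mirror the vectorized forms above.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    p_1 = sigmoid(X.dot(theta))
    return np.mean(-y * np.log(p_1) - (1 - y) * np.log(1 - p_1))

def grad(theta, X, y):
    error = sigmoid(X.dot(theta)) - y
    return X.T.dot(error) / y.size

rng = np.random.RandomState(0)
X = np.append(np.ones((20, 1)), rng.randn(20, 2), axis=1)
y = (rng.rand(20) > 0.5).astype(float)
theta = 0.1 * rng.randn(3)

# central finite differences vs the analytic gradient
eps = 1e-6
num = np.array([(cost(theta + eps * np.eye(3)[i], X, y)
                 - cost(theta - eps * np.eye(3)[i], X, y)) / (2 * eps)
                for i in range(3)])
diff = np.max(np.abs(num - grad(theta, X, y)))
print(diff)  # should be tiny (just finite-difference error)
```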

Some initial values of `theta` cause the optimization to fail to converge; just run it again with a new random `theta`.

### marcelcaraciolo commented Dec 13, 2013

Hi @waylonflinn, I solved the problem by updating the cost function.

### cipri-tom commented Oct 29, 2016

hi all,

why does the gradient have to be scaled by `y.size` ?
Thank you!

### graffaner commented Nov 7, 2016 • edited

@cipri-tom It's scaled by `y.size` to turn the sum over all training examples into an average, I think.
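One way to see the effect of the `y.size` scaling (a standalone sketch with toy numbers): with the division, the gradient is a per-example average, so its magnitude does not grow with the dataset — duplicating every example leaves it unchanged.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def grad(theta, X, y):
    # mean gradient, not the raw sum over examples
    return X.T.dot(sigmoid(X.dot(theta)) - y) / y.size

X = np.array([[1., 0.5], [1., -1.0], [1., 2.0]])
y = np.array([1., 0., 1.])
theta = np.array([0.1, -0.2])

g1 = grad(theta, X, y)
# duplicating every example doubles the sum AND y.size, so the average is the same
g2 = grad(theta, np.vstack([X, X]), np.concatenate([y, y]))
print(np.allclose(g1, g2))  # True
```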

### vinipachecov commented Feb 16, 2017 • edited

I think this code is not converging to the minimum; in my tests it does not converge at all. The full version (not this one) converges to the initial parameter vector, which is (0, 0, 0). Try using `fmin_tnc` instead of `fmin_bfgs`, and check the tutorial code linked below.
Btw, excellent work.

http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-3/
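For reference, a minimal standalone sketch of the `fmin_tnc` suggestion (the toy data and variable names are mine, not from the linked tutorial; the probabilities are clipped to keep the log finite on near-separable data):

```python
import numpy as np
import scipy.optimize as opt

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    p = np.clip(sigmoid(X.dot(theta)), 1e-10, 1 - 1e-10)  # avoid log(0)
    return np.mean(-y * np.log(p) - (1 - y) * np.log(1 - p))

def grad(theta, X, y):
    return X.T.dot(sigmoid(X.dot(theta)) - y) / y.size

rng = np.random.RandomState(1)
X = np.append(np.ones((50, 1)), rng.randn(50, 2), axis=1)
y = (X.dot(np.array([-0.5, 1.0, 2.0])) > 0).astype(float)
y[::7] = 1 - y[::7]  # flip a few labels so the classes overlap

theta0 = np.zeros(3)
theta_opt, n_evals, rc = opt.fmin_tnc(func=cost, x0=theta0, fprime=grad,
                                      args=(X, y), messages=0)
print(cost(theta_opt, X, y) < cost(theta0, X, y))  # True: the cost decreased
```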