@fannix
Created December 10, 2012 06:09
Unbalanced dataset classification and visualization
import numpy as np
import pylab as pl
from sklearn import datasets, linear_model, svm
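# Two-feature toy dataset. Note: make_classification produces roughly
# balanced classes unless its `weights` argument is set.
X, y = datasets.make_classification(n_samples=100, n_features=2, n_redundant=0)
pl.scatter(X[:, 0], X[:, 1], c=y)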
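# Baseline: logistic regression without class weights.
clr0 = linear_model.LogisticRegression()
clr0.fit(X, y)
print(clr0.predict(X).sum())  # number of samples predicted as class 1
# Decision boundary w[0]*x + w[1]*y + b = 0, rewritten as y = a*x - b/w[1].
w = clr0.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - clr0.intercept_[0] / w[1]
pl.plot(xx, yy, 'k--', label='no weights')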
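# Logistic regression with class 0 weighted more heavily than class 1.
clr1 = linear_model.LogisticRegression(class_weight={0: 0.9, 1: 0.1})
clr1.fit(X, y)
print(clr1.predict(X).sum())  # number of samples predicted as class 1
w1 = clr1.coef_[0]
a1 = -w1[0] / w1[1]
xx1 = np.linspace(-5, 5)
# Use this model's own slope a1 (the original reused the baseline slope a).
yy1 = a1 * xx1 - clr1.intercept_[0] / w1[1]
pl.plot(xx1, yy1, 'k-', label='with weights')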
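# Linear SVM with the same class weights.
clr2 = svm.SVC(kernel="linear", class_weight={0: 0.9, 1: 0.1})
clr2.fit(X, y)
print(clr2.predict(X).sum())  # number of samples predicted as class 1
w2 = clr2.coef_[0]
a2 = -w2[0] / w2[1]
xx2 = np.linspace(-5, 5)
# Use this model's own slope a2 (the original reused the baseline slope a).
yy2 = a2 * xx2 - clr2.intercept_[0] / w2[1]
pl.plot(xx2, yy2, 'k-.', label='SVM with weights')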
pl.legend()
pl.show()
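
The script repeats the same boundary calculation for each of its three models: a linear classifier separates the classes along w[0]*x + w[1]*y + b = 0, which rearranges to y = -(w[0]/w[1])*x - b/w[1]. A minimal sketch of that step as a helper follows. The name `boundary_line` and the sample coefficients are hypothetical, introduced only for illustration; they are not part of the original gist.

```python
def boundary_line(w, b):
    """Return (slope, intercept) of the line w[0]*x + w[1]*y + b = 0.

    `w` is a 2-element weight vector (e.g. clf.coef_[0]) and `b` the bias
    (e.g. clf.intercept_[0]). Hypothetical helper, not from the gist.
    """
    if w[1] == 0:
        # A vertical boundary cannot be written as y = slope * x + intercept.
        raise ValueError("boundary is vertical; cannot express y as a function of x")
    slope = -w[0] / w[1]
    intercept = -b / w[1]
    return slope, intercept


# Made-up coefficients, chosen only to make the arithmetic easy to follow.
slope, intercept = boundary_line([2.0, -4.0], 1.0)
print(slope, intercept)
```

With a fitted model this could be called as `boundary_line(clr1.coef_[0], clr1.intercept_[0])`, so that each model's slope and intercept are always computed from the same coefficients.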