@hdemers
Created March 20, 2014 00:23
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Experiment in assigning feature weights\n",
"=======================================\n",
"\n",
"Here we explore how weights assigned to features of a feature vector affect the outcome of a classification."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"\n",
"import random\n",
"import pandas as pd\n",
"import numpy as np"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's build a fake data set. We'll build 50 feature vectors each having two features."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"l = [float(random.randint(0,1)) for i in range(100)]\n",
"X = np.ndarray((50,2), buffer=np.array(l))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"y = []\n",
"for i, j in X:\n",
" if i == 1 and j == 0:\n",
" r = 1 if random.gauss(0.7,0.3) > 0.5 else 0\n",
" y.append(r)\n",
" elif i == 1 and j == 1:\n",
" y.append(1)\n",
" elif i == 0 and j == 1:\n",
" r = 1 if random.gauss(0.3,0.3) < 0.5 else 0\n",
" y.append(r)\n",
" else:\n",
" y.append(0)\n",
" \n",
"y = np.array(y)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Don't pay too much attention to the above. It's not very elegant and serve only to build different distributions of 0 and 1. The important part is the result below. We have a list of feature vectors, $X$ and a vector of target labels, $y$, the ground truth. Thus, a feature vector [1, 1] having label 1, means that for this vector, both features were observed and the resulting class was observed to be 1."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for (x1, x2), y_ in zip(X, y):\n",
" print \"({}, {}) -> {}\".format(x1, x2, y_)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"(1.0, 0.0) -> 1\n",
"(1.0, 0.0) -> 0\n",
"(0.0, 1.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(1.0, 1.0) -> 1\n",
"(0.0, 1.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 1.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(1.0, 1.0) -> 1\n",
"(1.0, 0.0) -> 1\n",
"(1.0, 0.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 1.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n",
"(1.0, 1.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(1.0, 0.0) -> 0\n",
"(1.0, 0.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(1.0, 0.0) -> 1\n",
"(1.0, 0.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n",
"(1.0, 0.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 1.0) -> 1\n",
"(0.0, 1.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(0.0, 1.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(1.0, 0.0) -> 1\n",
"(0.0, 1.0) -> 1\n",
"(0.0, 1.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(1.0, 0.0) -> 1\n",
"(1.0, 1.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(1.0, 0.0) -> 1\n",
"(0.0, 0.0) -> 0\n",
"(0.0, 0.0) -> 0\n"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's plot what we have here. We'll assign the color blue to 0 and red to 1."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pylab as pl\n",
"pl.scatter(X[:, 0], X[:, 1], c=y)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"<matplotlib.collections.PathCollection at 0x463c510>"
]
},
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAECCAYAAAD3vwBsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGZRJREFUeJzt3X9wVOW9x/FPbnIFAoS4a2OE3AINMDLMWBZCYQheKDTe\nufVqw51pR6EzUmTkx4hJrFhCTIVi6rQFyigwID8mjtW5bewkQ2fqtXsLViLe6hbWCYKEVMpIIcSw\nhJDwow177h+9zZAmQDbPkmcf9v36i5N9sucDyTlfz/e755jieZ4nAEDS+SfbAQAAdlAAACBJUQAA\nIElRAAAgSVEAACBJUQAAIEmlmb7Bli1bdPDgQWVkZGj9+vXdXt+3b592794tz/M0aNAgLVq0SCNH\njjTdLQDAUIrpfQBHjhzRwIEDtWnTph4LQH19vXJycpSenq5wOKyqqipVVFSY7BIAEAfGLaDx48dr\n8ODB13193LhxSk9PlySNGTNGZ8+eNd0lACAO+nUGsGfPHgUCgf7cJQDgOvqtABw6dEh79+7V/Pnz\n+2uXAIAbMB4C98aJEye0bds2lZWVaciQIddd95vf/Eapqan9EQkAbhuZmZmaPHlyzN93ywtAc3Oz\n1q1bp+XLlys7O/uGa1NTUzVp0qRbHQkAbisHDhzo0/cZt4A2btyo8vJynTp1SkuXLtWePXsUDAYV\nDAYlSW+++aba29u1Y8cOPfvssyotLTXdZcKqra21HcEI+e0ivz0uZzdhfAVQXFx8w9eXLFmiJUuW\nmO4GABBnxvcBxNNvf/tbWkAAEKMDBw5ozpw5MX8fj4IAgCRFAYgj1/uI5LeL/Pa4nN0EBQAAkhQz\nAABwHDMAAEBMKABx5Hofkfx2kd8el7OboAAAQJJiBgAAjmMGAACICQUgjlzvI5LfLvLb43J2ExQA\nAEhSzAAAwHHMAAAAMaEAxJHrfUTy20V+e1zOboICAABJihkAADiOGQAAICYUgDhyvY9IfrvIb4/L\n2U1QAAAgSTEDAADHMQMAAMSEAhBHrvcRyW8X+e1xObuJNJNv3rJliw4ePKiMjAytX7++xzW7du1S\nOBzWgAEDtGzZMo0ePdpklwCAODG6AvjqV7+qVatWXff1AwcO6MyZM3rppZf0xBNPaMeOHSa7S3gz\nZsywHcEI+e0ivz0uZzdhdAUwfvx4NTU1Xff1UCikmTNnSpLGjh2r9vZ2tbS0KDMz02S3Caf5yBG1\nNDToYlOT7powQcOnTbMdCUAvnP/sM0U++UQXPvtMd44Zo3/513+1Half3dIZQCQSkd/v79z2+/2K\nRCK3cpdWNH74oX792GN6Z8UKvbVwoU5/8IHtSH3ieh+U/Ha5mP/ckSP61SOP6J1nntGv5s3TZ+++\naztSv7rlQ+BYP2V67S9RbW1twm+3tbV1+aVpb2xU+/9fFSVCvli26+rqEioP+RMr3+2W/6OPPlLz\n4cPS/5+jOi5e1IWTJxMmX6zbfWF8H0BTU5N+9KMf9TgEfuWVVzRhwgTl5+dLkoqLi7V69errtoBc\nvQ/g8Btv6H+efFKSNGTECP37rl26Z8oUy6kA3MyfgkH96tFH5UWj+uchQ/Qfb7yhf3FwHtDX+wCM\nZgA3k5eXp7ffflv5+fmqr6/X4MGDb7v+vyTdnZenB197TZeam+W7915O/oAj7powQQ//13+p9eRJ\n3ZmbqxwHT/4mjFpAGzduVHl5uU6dOqWlS5dqz549CgaDCgaDkqRJkyYpKytLy5cv1/bt2/X444/H\nJXSi8Y8bp9wHH1RLbq6GT51qO06fmV5O2kZ+u1zMP2T4cI382td0fswY5dx/v+04/c7oCqC4uPim\na27Xkz4AuI5nAQGA43gWEAAgJhSAOHKxB3ot8ttFfntczm6CAgAASYoZAAA4jhkAACAmFIA4cr2P\nSH67yG+Py9lNUAAAIEkxAwAAxzEDAADEhAIQR673Ec
lvF/ntcTm7CQoAACQpZgAA4DhmAACAmFAA\n4sj1PiL57SK/PS5nN0EBAIAkxQwAABzHDAAAEBMKQBy53kckv13kt8fl7CYoAACQpJgBAIDjmAEA\nAGJCAYgj1/uI5LeL/Pa4nN1EmukbhMNhVVZWKhqNavbs2SosLOzyemtrq15++WW1tLQoGo3qoYce\n0qxZs0x3CwAwZDQDiEajKioqUnl5uXw+n0pLS1VUVKScnJzONb/4xS/U0dGhefPmqbW1VcXFxdq+\nfbtSU1O7vR8zAACInZUZQENDg7Kzs5WVlaW0tDTl5+crFAp1WXPnnXfq4sWLkqRLly5p6NChPZ78\nAQD9y6gARCIR+f3+zm2fz6dIJNJlzZw5c3Ty5EktXrxYK1as0IIFC0x2mdBc7yOS3y7y2+NydhO3\nfAhcXV2tUaNGadu2bfrxj3+snTt36tKlS9ddf+0Pora21qnturq6hMpD/sTKR362b+V2XxjNAOrr\n61VVVaWysjJJfzvZp6SkdBkEv/jii5o7d67uvfdeSdIPfvADzZ8/X7m5ud3ejxkAAMTOygwgNzdX\njY2NampqUkdHh/bv36+8vLwua4YPH666ujpJUktLi06dOqW7777bZLcAgDgwKgCpqalauHChKioq\nVFJSounTpysnJ0fBYFDBYFCSNHfuXP3xj3/UihUrtHbtWn3729/WkCFD4hI+0ZhejtlGfrvIb4/L\n2U0Y3wcQCAQUCAS6fK2goKDzzxkZGVq5cqXpbgAAccazgADAcTwLCAAQEwpAHLneRyS/XeS3x+Xs\nJigAAJCkmAEAgOOYAQAAYkIBiCPX+4jkt4v89ric3QQFAACSFDMAAHAcMwAAQEwoAHHkeh+R/HaR\n3x6Xs5ugAABAkmIGAACOYwYAAIgJBSCOXO8jkt8u8tvjcnYTFAAASFLMAADAccwAAAAxoQDEket9\nRPLbRX57XM5uggIAAEmKGQAAOI4ZAAAgJhSAOHK9j0h+u8hvj8vZTaSZvkE4HFZlZaWi0ahmz56t\nwsLCbms+/vhjvfrqq7p69aqGDh2q1atXm+4WAGDIaAYQjUZVVFSk8vJy+Xw+lZaWqqioSDk5OZ1r\n2tvbVV5errKyMvn9frW2tiojI6PH92MGAACxszIDaGhoUHZ2trKyspSWlqb8/HyFQqEua2prazV1\n6lT5/X5Juu7JHwDQv4wKQCQS6TyxS5LP51MkEumy5vTp02pra9OaNWu0cuVKvfvuuya7TGiu9xHJ\nbxf57XE5u4lbPgS+evWqjh8/rtLSUpWVlemXv/ylTp8+fd311/4gamtrndquq6tLqDzkT6x85Gf7\nVm73hdEMoL6+XlVVVSorK5MkVVdXKyUlpcsguKamRn/5y1/0rW99S5K0detWTZw4UdOmTev2fswA\nACB2VmYAubm5amxsVFNTkzo6OrR//37l5eV1WTNlyhQdPXpU0WhUV65c0bFjx7oMiQEAdhgVgNTU\nVC1cuFAVFRUqKSnR9OnTlZOTo2AwqGAwKEkaMWKEvvzlL+uZZ57RqlWrNGfOnNu2AJhejtlGfrvI\nb4/L2U0Y3wcQCAQUCAS6fK2goKDL9sMPP6yHH37YdFcAgDjiWUAA4DieBQQAiAkFII5c7yOS3y7y\n2+NydhMUAABIUswAAMBxzAAAADGhAMSR631E8ttFfntczm6CAgAASYoZAAA4jhkAACAmFIA4cr2P\nSH67yG+Py9lNUAAAIEkxAwAAxzEDAADEhAIQR673EclvF/ntcTm7CQoAACQpZgAA4DhmAACAmFAA\n4sj1PiL57SK/PS5nN0EBAIAkxQwAABzHDAAAEBMKQBy53kckv13kt8fl7CaMC0A4HFZxcbGeeuop\n1dTUXHddQ0ODHnnkEf3+97833SUAIA6MCkA0GtXOnTu1atUqbdiwQe+9955OnjzZ47rXX39dEydO\nVAKNHOJuxowZti
MYIb9d5LfH5ewmjApAQ0ODsrOzlZWVpbS0NOXn5ysUCnVb99Zbb2natGnKyMgw\n2R0AII6MCkAkEpHf7+/c9vl8ikQi3daEQiE98MADkqSUlBSTXSY01/uI5LeL/Pa4nN3ELR8CV1ZW\nat68eUpJSZHneTdtAV37g6itrXVqu66uLqHykD+x8pGf7Vu53RdG9wHU19erqqpKZWVlkqTq6mql\npKSosLCwc82TTz7ZedK/cOGCBgwYoMWLFysvL6/b+3EfAADErq/3AaSZ7DQ3N1eNjY1qamqSz+fT\n/v37VVRU1GXNpk2bOv+8ZcsWTZ48uceTPwCgfxm1gFJTU7Vw4UJVVFSopKRE06dPV05OjoLBoILB\nYLwyOsP0csw28ttFfntczm7C6ApAkgKBgAKBQJevFRQU9Lh22bJlprsDAMQJzwICAMfxLCAAQEwo\nAHHkeh+R/HaR3x6Xs5ugAABAkmIGAACOYwYAAIgJBSCOXO8jkt8u8tvjcnYTFAAASFLMAADAccwA\nAAAxoQDEket9RPLbRX57XM5uggIAAEmKGQAAOI4ZAAAgJhSAOHK9j0h+u8hvj8vZTVAAACBJMQMA\nAMcxAwAAxIQCEEeu9xHJbxf57XE5uwkKAAAkKWYAAOA4ZgAAgJhQAOLI9T4i+e0ivz0uZzeRZvoG\n4XBYlZWVikajmj17tgoLC7u8vm/fPu3evVue52nQoEFatGiRRo4cabpbAIAhoxlANBpVUVGRysvL\n5fP5VFpaqqKiIuXk5HSuqa+vV05OjtLT0xUOh1VVVaWKiooe348ZAADEzsoMoKGhQdnZ2crKylJa\nWpry8/MVCoW6rBk3bpzS09MlSWPGjNHZs2dNdgkAiBOjAhCJROT3+zu3fT6fIpHIddfv2bNHgUDA\nZJcJzfU+IvntIr89Lmc30W9D4EOHDmnv3r2aP3/+Dddd+4Oora11aruuri6h8pA/sfKRn+1bud0X\nRjOA+vp6VVVVqaysTJJUXV2tlJSUboPgEydOaN26dSorK1N2dvZ1348ZAADEzsoMIDc3V42NjWpq\nalJHR4f279+vvLy8Lmuam5u1bt06LV++/IYnfwBA/zIqAKmpqVq4cKEqKipUUlKi6dOnKycnR8Fg\nUMFgUJL05ptvqr29XTt27NCzzz6r0tLSuARPRKaXY7aR3y7y2+NydhPG9wEEAoFug92CgoLOPy9Z\nskRLliwx3Q0AIM54FhAAOI5nAQEAYkIBiCPX+4jkt4v89ric3QQFAACSFDMAAHAcMwAAQEwoAHHk\neh+R/HaR3x6Xs5ugAABAkmIGAACOYwYAAIgJBSCOXO8jkt8u8tvjcnYTFAAASFLMAADAccwAAAAx\noQDEket9RPLbRX57XM5uggIAAEmKGQAAOI4ZAAAgJhSAOHK9j0h+u8hvj8vZTVAAACBJMQMAAMcx\nAwAAxIQCEEeu9xHJbxf57XE5u4k00zcIh8OqrKxUNBrV7NmzVVhY2G3Nrl27FA6HNWDAAC1btkyj\nR4823S0AwJDRFUA0GtXOnTu1atUqbdiwQe+9955OnjzZZc2BAwd05swZvfTSS3riiSe0Y8cOo8CJ\n6tChzzVwYK6OHTtrO0qfzZgxw3YEI+S3y9X8nx8+rC8NGKCmI0dsR+l3RgWgoaFB2dnZysrKUlpa\nmvLz8xUKhbqsCYVCmjlzpiRp7Nixam9vV0tLi8luE8777/9Zjz76Kz3wQJV27Dikw4c/tx0JQC+c\nDoUUXLpUb/7bv+n3a9fqzEcf2Y7Ur4wKQCQSkd/v79z2+XyKRCI3XOP3+7utcd0bbxzRn//cJkna\nvv0jnThxwXKivnG9D0p+u1zM3/jhh2quq5MkHf/v/1bkk08sJ+pf/TIEjuWTptf+EtXW1ib89uXL\nlzV06D93fi0lRRowIDVh8sWyXVdXl1B5yJ9Y+W7H/Gnp6bpW2oABCZUvlu2+MLoP
oL6+XlVVVSor\nK5MkVVdXKyUlpcsg+JVXXtGECROUn58vSSouLtbq1auVmZnZ7f1cvQ8gFGrU+vUf6Nixc1q6NKA5\nc76oUaO6//0AJJbGgwdVt2OHTv3v/2rsf/6ncgsLdfeECbZjxayv9wEYfQooNzdXjY2Nampqks/n\n0/79+1VUVNRlTV5ent5++23l5+ervr5egwcP7vHk77K8vGxt2DBHra1XNHJkugYOHGg7EoBeyA4E\nNOj55/WXc+c02H+X0u/y3/ybbiNGLaDU1FQtXLhQFRUVKikp0fTp05WTk6NgMKhgMChJmjRpkrKy\nsrR8+XJt375djz/+eFyCJ5p77hmszz8/7PTJ3/Ry0jby2+Vq/mFZWTr6+edJd/KX4nAfQCAQUCAQ\n6PK1goKCLtu360kfAFzGs4AAwHE8CwgAEBMKQBy52gP9O/LbRX57XM5uggIAAEmKGQAAOI4ZAAAg\nJhSAOHK9j0h+u8hvj8vZTVAAACBJMQMAAMcxAwAAxIQCEEeu9xHJbxf57XE5uwkKAAAkKWYAAOA4\nZgAAgJhQAOLI9T4i+e0ivz0uZzdBAQCAJMUMAAAcxwwAABATCkAcud5HJL9d5LfH5ewmKAAAkKSY\nAQCA45gBAABi0ucC0NbWprVr16qoqEgvvPCC2tvbu61pbm7WmjVr9PTTT+u73/2ufv3rXxuFTXSu\n9xHJbxf57XE5u4m0vn5jTU2N7rvvPn3jG99QTU2NampqNH/+/K5vnpamxx57TKNGjdLly5f1ve99\nT/fdd59ycnKMgwMAzPT5CiAUCmnmzJmSpFmzZunDDz/stiYzM1OjRo2SJA0cOFAjRozQuXPn+rrL\nhDdjxgzbEYyQ3y7y2+NydhN9LgDnz59XZmamJGnYsGE6f/78Ddc3NTXpT3/6k8aOHdvXXQIA4uiG\nLaC1a9eqpaWl29cfffTRLtspKSk33Mnly5e1YcMGLViwQAMHDuxDTDfU1tY6/V8S5LeL/Pa4nN3E\nDQtAeXn5dV8bNmyYWlpalJmZqXPnzmnYsGE9ruvo6ND69et1//336ytf+coNw2RmZurAgQO9iJ2Y\n0tPTyW8R+e1yOb/L2SV1dmNi1echcF5ent555x0VFhbqd7/7naZMmdJtjed52rp1q0aMGKEHH3zw\npu85efLkvsYBAMSozzeCtbW16ac//amam5v1hS98QSUlJRo8eLAikYi2bdum0tJSffLJJ3r++ef1\nxS9+sbNNNG/ePE2cODGufwkAQOwS6k5gAED/4U5gAEhSFAAASFJ9HgLHw/XmCNdqbm7W5s2bdf78\neaWkpGjOnDn6+te/binx34TDYVVWVioajWr27NkqLCzstmbXrl0Kh8MaMGCAli1bptGjR1tI2rOb\n5d+3b592794tz/M0aNAgLVq0SCNHjrSUtrve/PtLUkNDg5577jmVlJRo6tSp/ZyyZ73J/vHHH+vV\nV1/V1atXNXToUK1evbr/g17HzfK3trbq5ZdfVktLi6LRqB566CHNmjXLTth/sGXLFh08eFAZGRla\nv359j2sS+bi9Wf4+HbeeRa+99ppXU1PjeZ7nVVdXez/72c+6rTl37px3/Phxz/M879KlS95TTz3l\nffbZZ/0Zs4urV696Tz75pHfmzBnvr3/9q/fMM890y/OHP/zB++EPf+h5nufV19d7q1atshG1R73J\nf/ToUa+9vd3zPM87ePCgc/n/vm716tXeiy++6L3//vsWknbXm+xtbW1eSUmJ19zc7Hme550/f95G\n1B71Jv/Pf/5z7/XXX/c872/Zv/Od73gdHR024nZz+PBh79NPP/WefvrpHl9P5OPW826evy/HrdUW\nkIuPk2hoaFB2draysrKUlpam/Px8hUKhLmuu/XuNHTtW7e3tPd5QZ0Nv8o8bN07p6emSpDFjxujs\n2bM2ovaoN/kl6a233tK0adOUkZFhIWXPepO9
trZWU6dOld/vlyTn8t955526ePGiJOnSpUsaOnSo\nUlNTbcTtZvz48d06DNdK5ONWunn+vhy3VguAi4+TiEQinQenJPl8PkUikRuu8fv93dbY0pv819qz\nZ48CgUB/ROuV3v77h0IhPfDAA5Jufqd6f+lN9tOnT6utrU1r1qzRypUr9e677/Z3zOvqTf45c+bo\n5MmTWrx4sVasWKEFCxb0c8q+S+TjNla9PW5v+QwgWR8n4d0Gn649dOiQ9u7dq7Vr19qOEpPKykrN\nmzdPKSkp8jzPqZ/F1atXdfz4cX3/+9/XlStX9Nxzz2ns2LG65557bEfrlerqao0aNUqrV69WY2Oj\nXnjhBf3kJz/RoEGDbEfrFZd+V64nluP2lheA/n6cxK3m8/m6XFqdPXtWPp8v5jW29DbbiRMntG3b\nNpWVlWnIkCH9GfGGepP/008/1caNGyVJFy5cUDgcVlpamvLy8vo16z/qTXa/36+hQ4fqjjvu0B13\n3KHx48frxIkTCVEAepO/vr5ec+fOlaTOdtGpU6eUm5vbr1n7IpGP296K9bi12gL6++MkJMXtcRK3\nWm5urhobG9XU1KSOjg7t37+/24klLy+v89K9vr5egwcP7vOzOuKtN/mbm5u1bt06LV++XNnZ2ZaS\n9qw3+Tdt2qTNmzdr8+bNmjZtmhYtWmT95C/1LvuUKVN09OhRRaNRXblyRceOHUuY/39Gb/IPHz5c\ndXV1kqSWlhadOnVKd999t424MUvk47Y3+nLcWr0T2NXHSRw8eLDLR+Hmzp2rYDAoSSooKJAk7dy5\nU+FwWAMHDtTSpUv1pS99yVref3Sz/Fu3btUHH3ygu+66S5KUmpqqF1980WbkLnrz7/93W7Zs0eTJ\nkxPmY6C9yb5792698847CfOx52vdLH9ra6u2bNmis2fPKhqNau7cuQnzlM2NGzfqyJEjam1tVWZm\npr75zW/q6tWrktw4bm+Wvy/HLY+CAIAkxZ3AAJCkKAAAkKQoAACQpCgAAJCkKAAAkKQoAACQpCgA\nAJCkKAAAkKT+D0uP3fCPfpF8AAAAAElFTkSuQmCC\n",
"text": [
"<matplotlib.figure.Figure at 0x4618110>"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Obviously, we don't see very well the points, they're all stacked. Let's spread them, just to see the distribution (that affects only the visualization)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x1 = [x + random.gauss(0, 0.1) for x in X[:, 0]]\n",
"x2 = [x + random.gauss(0, 0.1) for x in X[:, 1]]\n",
"pl.scatter(x1, x2, c=y)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": [
"<matplotlib.collections.PathCollection at 0x450aa50>"
]
},
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAECCAYAAAD3vwBsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X1AFHX+B/D3CvL85K48I4IoSeYDiqeJnYqhaT7gQ+VT\nV6k92Glgl6WRnaae9TvtvFLT1M7u6npAxSy1pNCUNBOFQjMRH/CBJ2Hl+Znd3x/qBgfo7M7Ozqy8\nX38x7Hdn3u6O82W+n5nvqPR6vR5ERNTmtJM7ABERyYMdABFRG8UOgIiojWIHQETURrEDICJqo9gB\nEBG1UbZiV7B+/XqkpaXBzc0Nq1evbvb6oUOHsGvXLuj1ejg6OmL27Nno3Lmz2M0SEZFIKrH3AZw+\nfRoODg5Yu3Ztix1AZmYmAgIC4OTkhPT0dCQkJGDFihViNklERGYgeggoLCwMzs7Orb4eGhoKJycn\nAEDXrl1RVFQkdpNERGQGFq0BJCcnIzw83JKbJCKiVlisAzh58iT279+P6dOnW2qTRER0G6KLwEJk\nZ2dj48aNiI+Ph4uLS6vt9u3bBxsbG0tEIiK6a3h4eKBfv35Gv0/yDqCwsBCrVq3CvHnz4OPjc9u2\nNjY26Nu3r9SRiIjuKidOnDDpfaKHgNasWYPFixcjJycHc+bMQXJyMpKSkpCUlAQA2LZtGyoqKrB5\n82a8/PLLWLRokdhNWlxKSorcEZphJmGYSTgl5mImaYk+A4iLi7vt68899xyee+45sZshIiIzE30f\ngDl99913HAIiIjLSiRMnMHz4cKPfx6kgiIjaKHYAAihxzI+ZhGEm4ZSYi5mkxQ6AiKiNYg2AiMjK\nsQZARERGYQcggBLH/JhJGGYSTom5jMlUfO4czn7xBS4fOoTa8nJFZFI6i0wFQUQkpdLLl7Fr6lQU\nZ2UBAEZs2IDujz4qcyrlYw2AiKxebmoqEkaMMCx3GT0aYz76SMZElsUaABmtoa4OVVotGurq5I5C\nJIqzlxdcAgIMy8GjRsmYxnqwAxBAiWN+YjNVFhTg8Btv4LOoKPy4ciUqr12TPZMUmEk4JeYSmskt\nMBAxCQkYsXEjxickIGTMGNkzWQPWAO4CuoYG6BsaYGNnJ/g9uampSFu3DgBwfM0a+A0YgOCRI6WK\nSG1c4cmTKM/Lg1tgINShoZJsQ33PPVDfc48k675bsQMQYPDgwXJHaOZWpqLffsPhN95ATUkJBi9d\nCp+ICEHvb6itve2ymExKwkzCSZUrPz0d28eMQX1lJRw1Gkz88ktouneXNZMYSsxkKg4BWbG6ykrs\nf+klXPj6a+QcOYJdU6agPCenxbZlV68iMzERZ7ZvR+nly/Dp2xedhg0DAARFR8OLj+okiRSePIn6\nykoAQFVRkeFKHZIfOwABlDjml5KSgobaWlTm5xt+V1Nc3OJf8rUVFTi8fDm+njUL3zz9NL5/5RXY\nu7vjoU2b8KfUVIzYsAFujQpoYjIpDTMJJ1Uut86dDT+rbGzg7Osr+L1K/KyUmMlUHAKyYg4eHvjj\nypXYPWMGGmprMfTvf2/xP1dtWRmybz6gBwAuJSejtrQUrp06wVGttmRkaoN8+vVDzI4dKPj5Z/j2\n7w/PXr3kjkQ38T4AK6fX6VB8/jx09fVwDwqCrYNDszb1VVX44Y038PPGjQCAe6dPxx9XroTdbZ7P\nTGRJ2sxM1JSUwC0wEM7e3nLHsTqm3gfAMwArp2rXDh26dr1tG1tHR0S8+CIChw2DvqEB3uHhPPiT\nrCry89FQVwcXHx8UpKcjceJE1JWXIzAqCg+++y5cjBgmItOxBiCAEsf8jM3k7OmJ4BEj0GXUKDj7\n+CgikyUwk3CWypV34gQ+jYrCRwMGIGvXLlw6cAB1N+fuuZScjOLz582eqfTyZRSePo3q69dFr0up\n358p2AEQkUn0Oh1yfvwRR//+d5zfuxc1paV3fE9dVR
W+f+UVVOTmor6qCt88+2yTmkC79u1h7+Zm\n1pzXTp7E59HR+G9kJA699hoqCwvNun5rxhoAEZmk4Oef8fmIEdDdnEpk7KefIrjRfDy3FP76K7KT\nk+Hi44OABx7A3pkzkXPkCADAxs4Oj333HbKTk1GQno4ejz+OgAceQDsbG7PlPPT660hbu9awPPGr\nrxAwaJDZ1q8ErAEQkUVVFhQYDv4AoD1zplkHUJKdjZ2TJhkuVx70179iyJtvYt+cOagpLsaw1auh\nCQtDxx49WtxGlVaL2rIyOHToYPKZQZMhT5UK7Z2cfl9/URFyjhxBeV4e/AcNQsd77zVpG9aKQ0AC\nWGrMr6a0FPknTqDw1Kk7TtDWWqbr587hwMsvY/eTTyI/PV2KmEZnkhMzCWdsLo8uXQwTsNk4OMC/\nhb+qq69fb3KvyqX9+9GxRw9M+uorTNm/H0EjRkDVruXDUMnFi9j9pz/hw759sf+ll1Cem2tUvlu6\njh2LnrNnwys8HKO2bIGm0UE+MzERu//0J3z/8svYOXEiii9evOP6lPr9mULUGcD69euRlpYGNzc3\nrF69usU2H3zwAdLT02Fvb4/nn38ewcHBYjZ516otL8eJd9/FsdWroWrXDqP+9S90HTvWqHU01NXh\nyPLlyPriCwDA1ZQUTDlwwCw3eRH9L4+QEExITETpxYtw8vJCx/vua9bGxc8P/oMH4+rNg+Z9TzwB\nVbt2cPDwuOP6c44eRc7hwwCAzG3bEDpxoklXB7l16oShb72F+upqtHd0bPLapeRkw8+VBQWoLioC\ngoKM3oa1EnUGMGzYMLz66qutvn7ixAnk5+fjnXfewTPPPIPNmzeL2ZxsLDH3R0VeHo7d7ET1Oh1+\nfPPN2xbVWsrUUFOD62fPGpartVrUV1WZP6wRmeTGTMKZkqtDSAg6Dx8Oz549oVKpmr3u7OWFEe+9\nh/EJCXjk66/RxYhpmm3at7/tsjFUKlWzgz8AdH/sMcPPnr17C7pLWanfnylEnQGEhYWhoKCg1ddT\nU1MxZMgQAEC3bt1QUVGB4uJieAjo/dsaW0dHOHp6ourmtMzq0NAWb+q6HTsXF9wfH489TzwBXX09\n+i9YABc/PyniEgnm6u8PV39/o9/nO3Ag7nviCVz89luETZ0Krz59zJ4teMQITN6zB9XFxdCEhcG1\njf1/kbQGoNVqodFoDMsajQZarVbKTUrCEmN+rv7+GJ+QgNDJk9FnzhwMWrz4ttM7N85UfP48fvv8\nc5zbswfe4eGY8v33eCw5GX3nzoWds7Pk2VvKpBTMJJzScrn6+UE9Ywam7N+PPyxYAMdGxxJzsXV0\nhN/Agejy0ENwbzRn0e0o7XMSQ/KrgIy9yjQlJcVwinXrg5Z7uXE2Kbd3tqwM3nPmIPzmzJxC3q9x\ncEDaSy+h8JdfAAB9X3wRkfHxUKlUFv+8MjIyLLo9IcsZGRmKytOYUvIo/fvr06+fYvI0ppQ8Yoak\nRN8HUFBQgLfeeqvFIvD777+PHj16IDIyEgAQFxeHJUuWtDoExPsAjFd0+jQ+vvn5AkDH++7D5L17\nLfqXP5HcynJykPfTT9DrdPDp3x9unTrJHcmiFPlM4IiICBw8eBAAkJmZCWdnZ47/m5ljx47wa9QB\n3PPoozz4U5tSW1GBH//2N+ydORNfz56Ng4sWCbormUR2AGvWrMHixYuRk5ODOXPmIDk5GUlJSUi6\nOfVw37594eXlhXnz5mHTpk2YNWuWWUJbmhLH/G5lcvL0xIh16/Dwv/+NmO3b0WPGDNkzKQkzCafE\nXEIy1ZWV4cLXXxuWs7/9FrUlJbJmshaiagBxcXF3bGOtB31r4hYYCLfAQLljEMnC3t0d9zzyiGG6\n89BJk2B/c6Sh5NIlFJ48CTtXV86C2wLOBUREVq+ioAAFaWnQ63Tw6tMHLr6+KM/Lw+7HH0f+8eMA\ngOHvvCPrGbKUOB
cQEbVZzl5eCB45ssnvKgsKDAd/ADj9yScImzrVrBPNWTvOBSSAEsf8mEkYZhJO\nibnEZHJUq+HeaFqH4FGjzHLwV+LnZCqeARDRXck1IABjP/0UuceOwVGthk///nJHUhzWAIiIrJwi\n7wMgIiLlYgcggBLH/JhJGGYSTom5mEla7ACIiNoo1gCIiKwcawBERGQUdgACKHHMr6VMVUVFuPDN\nNzizfTuKL1xQRCa5MZNwSszFTNLifQB3kZMffogjy5cDuPF4u3Gffgpnb2+ZUxGRUrEGcJeoq6zE\nttGjce3mg2EAYPoPP0ATFiZjKiKyBNYA2rj2Tk4InTjRsOzdv78kj9AjorsHOwABlDjm11Kme2fM\nwLjPP8eoDz7AQ++/DycvL9kzyY2ZhFNiLmaSFmsAClZ29Sp09fVw8fODTfv2d2zvqFYj6MEHLZCM\niO4GrAEoVO5PP+HLqVNRW1aG4e+8g9BJkwR1AkTU9rAGcBepq6zE9wsXovr6dejq6/Ht3Lkozc6W\nOxYR3WXYAQhg6TE/Vbt2aN/o0XXt7Oyg+p95zJU4DslMwigxE6DMXMwkLXYACmTr4IAhK1fCJyIC\n7l26YMzHHzd5sAURkTmwBqBgteXlaKithaNaLfg9xRcuoPzKFTh5e0MdGiphOiJSCj4T+C5k12gY\nSIjrZ88iceJElF+9CnsPD0z84gt49uwpUToisnYcAhJAiWN+LWUqOnMG5VevAgBqiotR8PPPsmeS\nGzMJp8RczCQt0WcA6enp2Lp1K3Q6HaKiohATE9Pk9dLSUrz77rsoLi6GTqfD2LFjMXToULGbpRY4\neXoCKhVwc1TPxc9P5kREpGSiagA6nQ6xsbFYvHgx1Go1Fi1ahNjYWAQEBBjafP7556ivr8e0adNQ\nWlqKuLg4bNq0CTb/c1ULwBqAWPXV1bh6+DCyv/0Wfvffj8ChQ2Hn6ip3LCKSmCw1gKysLPj4+MDr\n5pQDkZGRSE1NbdIBdOjQAdk3r2GvqqqCq6triwd/Es/WwQGdo6LQOSpK7ihEZAVE1QC0Wi00jSYc\nU6vV0Gq1TdoMHz4cV65cwbPPPosFCxbgySefFLNJWShxzI+ZhGEm4ZSYi5mkJXkRODExEUFBQdi4\ncSP+7//+D1u2bEFVVVWr7Rt/uCkpKVy2ouWMjAxF5UlJSUFGRoai8ih5md+fdS+bQlQNIDMzEwkJ\nCYiPjwdw42CvUqmaFIJXrlyJCRMmoHv37gCAN954A9OnT0dISEiz9bEGQERkPFnmAgoJCUFeXh4K\nCgpQX1+Pw4cPIyIiokkbPz8/Qy9eXFyMnJwcePMpVUREshPVAdjY2GDmzJlYsWIF5s+fj0GDBiEg\nIABJSUlISkoCAEyYMAHnzp3DggULsGzZMsyYMQMuRt7gJDexp1lSYCZhmEk4JeZiJmmJvg8gPDwc\n4eHhTX4XHR1t+NnNzQ0LFy4UuxkiIjIzzgVERGTl+DwAIiIyCjsAAZQ45sdMwjCTcErMxUzSYgdA\nRNRGsQZARGTlWAMgIiKjsAMQQIljfswkDDMJp8RczCQtdgBERG0UawBERFaONQAiIjIKOwABlDjm\nx0zCMJNwSszFTNJiB0BE1EaxBkBEZOVYAyAiIqOwAxBAiWN+zCQMMwmnxFzMJC12AEREbRRrAERE\nVo41ACIiMgo7AAGUOObHTMIwk3BKzMVM0mIHQETURrEGQERk5VgDICIio7ADEECJY37MJAwzCafE\nXMwkLVuxK0hPT8fWrVuh0+kQFRWFmJiYZm1OnTqFDz/8EA0NDXB1dcWSJUvEbpaIiEQSVQPQ6XSI\njY3F4sWLoVarsWjRIsTGxiIgIMDQpqKiAosXL0Z8fDw0Gg1KS0vh5ubW4vpYAyAiMp4sNYCsrCz4\n+PjAy8sLtra2iIyMRGpqapM2KSkpGDBgADQaDQC0evAnIiLLEtUBaLVaw4EdANRq
NbRabZM2ubm5\nKC8vx9KlS7Fw4UIcPHhQzCZlocQxP2YShpmEU2IuZpKW6BrAnTQ0NODChQt4/fXXUVNTg9deew3d\nunWDr69vi+1TUlIwePBgw88AZF9unE0JeZS6nJGRoag8KSkpyMjIUFSexpSSh9/f3fH9mUJUDSAz\nMxMJCQmIj48HACQmJkKlUjUpBO/cuRO1tbV49NFHAQAbNmxAnz59MHDgwGbrYw2AiMh4stQAQkJC\nkJeXh4KCAtTX1+Pw4cOIiIho0qZ///44c+YMdDodampqcPbs2SZFYiIikoeoDsDGxgYzZ87EihUr\nMH/+fAwaNAgBAQFISkpCUlISAMDf3x+9e/fGSy+9hFdffRXDhw+3ug5AiWN+zCQMMwmnxFzMJC3R\nNYDw8HCEh4c3+V10dHST5XHjxmHcuHFiN0VERGbEuYCIiKwc5wIiIiKjsAMQQIljfswkDDMJp8Rc\nzCQtdgBERG0UawBERFaONQAiIjIKOwABlDjmx0zCMJNwSszFTNJiB0BE1EaxBkBEZOVYAyAiIqOw\nAxBAiWN+zCQMMwmnxFzMJC12AEREbRRrAEREVo41ACIiMgo7AAGUOObHTMIwk3BKzMVM0mIHQETU\nRrEGQERk5VgDICIio7ADEECJY37MJAwzCafEXMwkLXYARERtFGsARERWjjUAIiIyCjsAAZQ45sdM\nwjCTcErMxUzSEt0BpKenIy4uDi+88AJ27tzZarusrCxMmTIFR48eFbtJIiIyA1E1AJ1Oh9jYWCxe\nvBhqtRqLFi1CbGwsAgICmrVbtmwZ7O3tMXToUAwcOLDF9bEGQERkPFlqAFlZWfDx8YGXlxdsbW0R\nGRmJ1NTUZu327t2LgQMHws3NTczmiIjIjER1AFqtFhqNxrCsVquh1WqbtUlNTcWIESMAACqVSswm\nZaHEMT9mEoaZhFNiLmaSluRF4K1bt2LatGlQqVTQ6/W404hT4w83JSWFy1a0nJGRoag8KSkpyMjI\nUFQeJS/z+7PuZVOIqgFkZmYiISEB8fHxAIDExESoVCrExMQY2sydO9dw0C8rK4O9vT2effZZRERE\nNFsfawBERMYztQZgK2ajISEhyMvLQ0FBAdRqNQ4fPozY2NgmbdauXWv4ef369ejXr1+LB38iIrIs\nUUNANjY2mDlzJlasWIH58+dj0KBBCAgIQFJSEpKSksyVUXZiT7OkwEzCMJNwSszFTNISdQYAAOHh\n4QgPD2/yu+jo6BbbPv/882I3R0REZsK5gIiIrBznAiIiIqOwAxBAiWN+zCQMMwmnxFzMJC12AERE\nbRRrAEREVo41ACIiMgo7AAGUOObHTMIwk3BKzMVM0mIHQETURrEGQERk5VgDICIio7ADEECJY37M\nJAwzCafEXMwkLXYARERtFGsARERWjjUAIiIyCjsAAZQ45sdMwjCTcErMxUzSYgdARNRGsQZARGTl\nWAMgIiKjsAMQQIljfswkDDMJp8RczCQtdgBERG0UawBERFaONQAiIjIKOwABlDjmx0zCMJNwSszF\nTNKyFbuC9PR0bN26FTqdDlFRUYiJiWny+qFDh7Br1y7o9Xo4Ojpi9uzZ6Ny5s9jNEhGRSKJqADqd\nDrGxsVi8eDHUajUWLVqE2NhYBAQEGNpkZmYiICAATk5OSE9PR0JCAlasWNHi+lgDICIyniw1gKys\nLPj4+MDLywu2traIjIxEampqkzahoaFwcnICAHTt2hVFRUViNklERGYiqgPQarXQaDSGZbVaDa1W\n22r75ORkhIeHi9mkLJQ45sdMwjCTcErMxUzSslgR+OTJk9i/fz+mT59+23aNP9yUlBQuW9FyRkaG\novKkpKQgIyNDUXmUvKz07+/YsWPISE9XVD4lLZtCVA0gMzMTCQkJiI+PBwAkJiZCpVI1KwRnZ2dj\n1apViI+Ph4+PT6vrYw2AiFpSXVyM0598grOJ
ieg6bhzCpk2Do1otdyzFkKUGEBISgry8PBQUFKC+\nvh6HDx9GREREkzaFhYVYtWoV5s2bd9uDPxFRa/KOH8eh+HjkpaYi5fXXkXf8uNyR7gqiOgAbGxvM\nnDkTK1aswPz58zFo0CAEBAQgKSkJSUlJAIBt27ahoqICmzdvxssvv4xFixaZJbgliT3NMpVer0da\nWj4SEn7D0aM5qK1tEJ2poUGHwsJKVFbWmSumgVyf0+0wk3BKzHUrU11ZWZPf15aWyhEHgDI/J1OJ\nvg8gPDy8WWE3Ojra8PNzzz2H5557Tuxm2qSMjGsYM2Y7qqrqoVIBX301Cfff72/y+ioqavH552ew\nZk0q+vTxwtKlgxEU5G7GxETS8OzdG569euHaL79A06MHvK3wYhIl4lxACrZ79zk8/vhuw/I//hGF\nJ564z+T1HTuWi5EjEwzLr78+CHFxEbd5B5FylOfloaqwEI4aDVx8feWOoyim1gBEnwGQdDp3doOz\nc3tUVNTBxkaF7t3FFb3q6nRNllsaBsrNLcf+/ZeQl1eBhx4Kxr33dhS1TSJzcfHxgQvriGbFuYAE\nkGvM7777PLF792Rs2fIQ9uyZjL59vUVlCgtT489/DodKBYSGdsDkyfc0a/Ovf2Vg7txvsXz5EUya\ntBOXLgkfa1Xi2CgzCafEXMwkLZ4BKFyvXp7o1cvTLOvq0MERCxcOwKxZveDkZAsvL+cmr9fXN+DQ\noSuG5fz8SpSU1Jhl20SkPKwBUBM7dmRi9uyvAQAPPtgZ69ZFw9PTSeZURHQ7rAGQWYwe3QX79j2C\nsrJadO+u4cGfLKbgl19QkJ4O14AA+PbvDztXV7kj3fVYAxBAiWN+UmVycLBFRIQvhg3rDF9fF0Vk\nEoOZhJMzV9GZM9gxdiyS4+LwxeTJuHzwoOyZWqPETKZiB0BEsqvMz0dto5u9co4elTFN28EagMzy\n8ipw7FguGhr0iIjwRkCAm9yRiCyu+Px57Bg/HuVXrwIqFWK2bUPgsGFyx7IarAFYoerqerz99k/Y\nvPnGjIcjRwbhvfdGwMPDQeZkROZRW14OvV4P+zuM53t06YKYHTtw/exZ2Ks7orC9P37bdwFhYRp0\n6sQ/iqTCISABpBrzKy2twVdfnTcsJyVlC77sUonjkMwkjBIzAebPlXf8OLaNHo0dY8Yg/8SJO7ZX\nd+uGkNGjceSyG6Kit2PKlC/x9NN7kZdXYdZcYin1+zMFOwAZubnZIyamm2H54Ye78K9/uitUFhRg\n71NPofDkSVzLyMDXs2ejsrDwju9raNDhP/85ZVj+6ad8xXUAdxMOAQkwePBgSdbr4GCLuLgIPPBA\nAOrrdejb1xvu7vayZhKDmYRRYibAvLl09fWoLS83LNeWl0NfX3/H99nYtMPw4Z2RknIVAODv7wK1\nWll/FCn1+zMFOwCZeXk5YdSoLq2+XlJSjaKiari4tG925y6RUjn7+iJ6/XrsfeopAMCDa9fCydv7\nDu+6Ydq0exEc7I6CgkoMHhyAwEDWAKTCISABhIz5XbpUil27zmLfvgsoKqoyy3YLCioRH38IERH/\nRkxMIs6e/f15y0och2QmYZSYCTBvLpVKheARIzDjxx8x4+hRBEVHQ6VSCXqvp6cTxo3rhtmze6Ow\n8LTZMpmLUr8/U/AMwAyKiqrwwgvf4uDBG/PoLFw4AH/5S3/Y2IjrX0+dKsR//3vjP8Bvv2mRlJSN\nbt34GDyyDqp27eDeubPcMeg2eAYgwJ3G/K5frzYc/AFg+/ZMlJfXit6uvb1Nk2VX1/aCM8mBmYRR\nYiZAmbmYSVrsAMzAw8MeAwf6GZbHjQuBi4ud6PX27OmJN9/8I7p0cceMGfdi6NBA0eskshS9Xo/i\n4mrU1t65+EvyYAcgwJ3G/Dp2dMJ770Vjy5aH8N//jsEzz/QRPfwDAK6udpg9uzf27XsUq1YNbXJD\njBLHIZlJ
GCVmAprmqiwqQvH586gpLjZpXZWVdfjoo18xcmQCYmOTkZ1t2jN8lfhZKTGTqVgDMJPO\nnd3RuXPz5+vq9XocO5aLffsu4t57NRg6NBBqtaPg9bZrpzKqPZFYxefP45tnnkH+iRPoNnEiHli+\n3OgncZ06VYjY2O8AAGfPXkevXp6YM4fP8VUadgACiBnzO3WqEOPHJ6KmpgEAsGnTSEya1PxJXJbM\nJBVmEkaJmYDfc1394QfDnbtnd+xA90cfNboDuLW/32Lqg4WU+FkpMZOpOAQksaKi6ib/GU6evPPd\nkOZSVVWPS5dKUFDAOylJOBuHpjde2dgZX88KC1Pjqad6AgCCg92b3PFOysEOQAAxY37Bwe6Gh7nb\n29tg5Mhgk9dVW9uAY8dy8eWXWTh8+Oxt25aV1WLduhPo1+/fePjh7Th9usjk7QqlxLFRZhLuVi7/\nyEj0nDkT7kFBuH/xYnj36WP0ujQaJ/z1r4Nw7Njj2L17Mrp314jKpCRKzGQq0UNA6enp2Lp1K3Q6\nHaKiohATE9OszQcffID09HTY29vj+eefR3Cw6QdBaxMY6IZPPx2HCxdKoFY7oEePjiav69ixXIwf\nnwidTg+NxhG7d2sQGtryfQFnzmjxt7/9CAA4d64YW7b8glWrOL0u3Zmrnx8eWLECdeXlsHd3Rztb\n0w4Tbm72cHMTNrUJyUPUGYBOp8OWLVvw6quv4u2338YPP/yAK1euNGlz4sQJ5Ofn45133sEzzzyD\nzZs3iwosB7FjfoGBbhgypBN69vREu3bC7oZsyY8/5kCnu/H4hqKiKly61PqVFba2Tbfj6Ch9uUeJ\nY6PMJFzjXLb29nDUaEw++JuLEj8rJWYylagOICsrCz4+PvDy8oKtrS0iIyORmprapE1qaiqGDBkC\nAOjWrRsqKipQbOKlZeZy7Volrl2rlDWDKcLDf59LxcWlPfz8Wn9kY/fuGrzzznD4+7sgKioQTzxx\nnyUiEpEVEdUBaLVaaDS/j+2p1WpotdrbttFoNM3aWNIPP1xBVNSnePDBz/DjjzmC3qOUMb+BA/2w\nc+cErF37ID766EHce2/rw0kODraYOjUMyclTsHXraHTt2kHyfEr5nBpjJuHkyHX93Dmc/Pe/kZmY\niPLcXOh1OtRXV8ua6U6UmMlUFjm/M+apkykpKYZTrFsftLmWU1N/w6xZKSgouPHX/9NPf42tW+9H\nv35ht31/42ymbD8yMhJHjuTg449PIiysAyZPvhc+Pi5Gr+/EiaNo1w6YNm0wUlJSkJKS32r70tIa\nnDlzEUBJQvBkAAAPG0lEQVQl+vcPl+Tz/N/ljIwMSddvynJGRoai8jSmlDxyfX+nU1OR+sILuP7b\nbwCAEZs34+I336Dk/HkMfOUV5Lu58fszYtkUop4JnJmZiYSEBMTHxwMAEhMToVKpmhSC33//ffTo\n0QORkZEAgLi4OCxZsgQeHh7N1if1M4Fzc8vxxz9+Ypit09fXGfv3T4WXl5Nk2wSAX38txPDhnxku\nB33rrSF4+unekm3v4sUSLFiwH0eP5mLWrF6YO7cvNBreTEbKoj1zBh/dfz8AQB0WBu/wcJz+738B\nAO3at8fUAwegCQuTM6LVMPWZwKKGgEJCQpCXl4eCggLU19fj8OHDiIiIaNImIiICBw8eBHCjw3B2\ndm7x4G8Jvr4u2LRpJNRqB2g0jtiwYYTkB3/gxk0wje8F+PVXae8FSE6+hO++u4Ty8jr885/HcfLk\nNUm3R8pWX1uL3J9+wrmvvoI2M1PuOAaOHTsiaORIAICdszOqi36/VFlXV4f6GtNuHmtNXWUlrqSk\n4PRnn6Hgl1/Mum5rJWoIyMbGBjNnzsSKFSsMl4EGBAQgKSkJABAdHY2+ffsiLS0N8+bNg4ODA+bM\nmWOW4KYaOjQQhw5Ng0oF+Pi0XkRtrPGwlCmCgtwxZEgnfP/9ZTg42ODRR7
ubvC4hmQROu252Yj8n\nKTATkPPDD9g5eTKg18PZxwcTv/wSHUJCZM/lqNEgavVqXHvySdg6O8OmfXtcPXIEtaWl6PfCC3AP\nDjZrppyjR/HFpEkAADtXVzzy9dcmnWEocZ8ylegaQHh4OMLDm87xER0d3WR51qxZYjdjVr6+wg78\n5tze+vXRyM4ugbu7vck3xQg1bFggoqM748cfczFrVk/07Okp6fZI2S59/z1wc6S3Ii8P5VeutNgB\nyMHFzw8ufr/PpDv1wAHUV1XBxd8f9m7mfRJY3rFjhp9ry8pQnpvb5oeYRNUAzE2KGkBhYSXy8irg\n5mZvlY+Wu3y5FGlp+XBwsEXfvt7o2FHYkFVpaQ3Ky2vRoYOjRe4BIOU6t2cPds+YAQCwc3PDI998\nA8094uejsjaXDx1CYkwMoNfDQa3G5D17oA4NlTuWWZhaA7irjwy5ueV48cVkfPPNRXh7O2H79pjb\nXjqpNFptFebPT0Zy8iUAwF/+0h8vv/wHtG9vc4d38i5M+l2nP/4RMdu3oywnB149e1rs4F9VVITC\nU6egsrGB5333wd69+Wy5luQ7YAAm79mDirw8qEND75qDvxh39VxAZ85o8c03FwEA+fmV2Lv3vEnr\nkeu63+LiGsPBHwB27MhEWVmtSZlqaupx8uQ1ZGRcQ1WVNA/oUOL10cwE2Lm4IHDYMPSYPh2evXq1\n2s6cuWorKpD6j38gMSYGO8aOxS+bN6O+1vin5Jkzk62dHfwGDEC38eNFDf0ocZ8y1V3dAbi4tG+y\nLLToqxQeHvYYPNjfsDx2bAhcXY2fmbGhQYcvvsjCkCGfYMiQT/DJJ7/yKU0kqWqtFj+//75hOf39\n91Et4w2g1LK7ugZQXV2PPXvOY9OmnzF4cACeeqrnbadPUKJLl0pw/PiNGkC/ft7w8nJute3588W4\ndq0Svr4uTeod+fkVGDLkE8MNcPb2Njh27HEEBFhfTYSsQ5VWi6+mT0fu0aMAgJAxYzDivffQ3rn1\n/ZdMxxpACxwcbDFxYijGjg0RNG6uFDqdHpculUKvBwIDXREYeOex09OnCxETk4hr16rQtasHPvlk\nHEJCbtxv4eBgi06d3AwdQKdOrnBwuKu/epKZo1qN6HXrcGHfPti0b4/ODz5otoO/rqEB7Wys5/+z\nkt3VQ0C3iD34W3LMT6/X49tvL+L++z/CwIH/wd695w0zgN4u04kT+bh27cYdzllZxTh9+vebzdzd\n7bF27XA89lh3TJjQDf/612jBVxMZQ4ljo8wknLlzeXTpgvDnnkOvWbPg3rmz6EzVxcX4ZcsW7Bg3\nDr988AGqS0rMFdXkTNaOfwYqTEFBJeLikg13Dv/5z9/i8GEf+Pvffujqf4eG/nfqh3vu0eC990aY\nNyyRBeWlpuLAggUAgJwjR+AeFITOUVEyp7Ju7AAEsORdf+3bt4Orqx3y8m48xtHV1a7Z3P4tZfrD\nH3ywfn009u/PxpgxXdG7t5dF8t4ukxIwk3BKzNU4U21p0+df1Mh0BqDEz8lU7AAURq12xPvvj8RL\nL+1HXZ0Oq1cPg7f3ncdO3d0dMGVKGKZMadt3NtLdy6t3b6hDQ6HNzIS6e3d49ZZuQsW2ok3UAMQy\ndczv3Lnr2LUrCz/8cAUVFXWC39e7txe++GIidu+ehH79fMyaSUrMJIwSMwHKzNU4k0dICMZv344p\nycmI2bYNHl26yJ7J2vEMQCKXLpXiscd24fz5G6epW7eOwrhx3QS/38mp/Z0bEbUxrv7+cPX3v3ND\nEuSuvg9ATkeP5mDUqG2G5UmTQrFp00MyJiKiu5UszwOg1nl5OcPb+/dLLYcPN+0yOCIiqbADEMCU\nMb/gYHfs2DEB69dHIyFhPEaNCpY9k9SYSRglZgKUmYuZpMUagITCwjQIC5N27n8iIlOxBkBEZOVY\nAyAiIqOwAxBAiWN+zCQMMwmnxFzMJC
12AEREbRRrAEREVo41ACIiMorJHUB5eTmWLVuG2NhYLF++\nHBUVFc3aFBYWYunSpXjxxRfxl7/8BXv27BEVVi5KHPNjJmGYSTgl5mImaZl8H8DOnTvRq1cvjB8/\nHjt37sTOnTsxffr0piu3tcUTTzyBoKAgVFdX45VXXkGvXr0QEBAgOjgREYlj8hlAamoqhgwZAgAY\nOnQojh071qyNh4cHgoKCAAAODg7w9/fH9evXTd2kbJQ4/zczCcNMwikxFzNJy+QOoKSkBB4eN545\n6+7ujpI7PJyhoKAAFy9eRLduwmfEJCIi6dx2CGjZsmUoLi5u9vupU6c2WVapmj+xqrHq6mq8/fbb\nePLJJ+Hg4GBCTHmlpKQortdnJmGYSTgl5mImaZl8GWhcXByWLFkCDw8PXL9+HUuXLsWaNWuatauv\nr8dbb72FPn364OGHH77tOo8fP95ih0NERK3z8PBAv379jH6fyUXgiIgIHDhwADExMfj+++/Rv3//\nZm30ej02bNgAf3//Ox78AZj0DyAiItOYfAZQXl6Of/zjHygsLISnpyfmz58PZ2dnaLVabNy4EYsW\nLcJvv/2Gv/71rwgMDDQME02bNg19+vQx6z+CiIiMp6g7gYmIyHJ4JzARURvFDoCIqI2S9YlgrdUR\nWqLT6bBw4UKo1WosXLhQ9lyFhYVYt24dSkpKoFKpMHz4cIwePdrsWdLT07F161bodDpERUUhJiam\nWZsPPvgA6enpsLe3x/PPP4/gYPM+ftLYTIcOHcKuXbug1+vh6OiI2bNno3NnaZ+JLORzAoCsrCy8\n9tprmD9/PgYMGCB7plOnTuHDDz9EQ0MDXF1dsWTJElkzlZaW4t1330VxcTF0Oh3Gjh2LoUOHSppp\n/fr1SEtLg5ubG1avXt1iG0vv43fKJMc+LuRzAozcx/Uy+s9//qPfuXOnXq/X6xMTE/UfffRRq22/\n/PJL/T//+U/9m2++qYhc169f11+4cEGv1+v1VVVV+hdeeEF/+fJls+ZoaGjQz507V5+fn6+vq6vT\nv/TSS822cfz4cf3f/vY3vV6v12dmZupfffVVs2YwJdOZM2f0FRUVer1er09LS1NEplvtlixZol+5\ncqX+yJEjsmcqLy/Xz58/X19YWKjX6/X6kpIS2TN99tln+o8//tiQ56mnntLX19dLmuvXX3/Vnz9/\nXv/iiy+2+Lql93EhmSy9jwvJpNcbv4/LOgQkZDoJACgqKkJaWhqioqKgt0DNWinTXGRlZcHHxwde\nXl6wtbVFZGQkUlNTW83arVs3VFRUSHovhZBMoaGhcHJyAgB07doVRUVFkuURmgkA9u7di4EDB8LN\nzU3SPEIzpaSkYMCAAdBobjw3WupcQjJ16NABlZWVAICqqiq4urrCxsZG0lxhYWGtnvkDlt/HhWSy\n9D4uJBNg/D4uawcgdDqJDz/8EDNmzEC7dpaJq5RpLrRareHgAABqtRparfa2bTQaTbM2ls7UWHJy\nMsLDwyXLIzSTVqtFamoqRowYAeDOd69bIlNubi7Ky8uxdOlSLFy4EAcPHpQ90/Dhw3HlyhU8++yz\nWLBgAZ588klJMwlh6X3cWJbYx4UwZR+XvAYgdjqJ48ePw83NDcHBwTh16pRict2ihGkuLHFWZIqT\nJ09i//79WLZsmdxRsHXrVkybNg0qlQp6vV4Rn1lDQwMuXLiA119/HTU1NXjttdfQrVs3+Pr6ypYp\nMTERQUFBWLJkCfLy8rB8+XL8/e9/h6Ojo2yZAO7jQpiyj0veASxevLjV19zd3VFcXGyYTsLd3b1Z\nmzNnzuD48eNIS0tDXV0dqqqqsHbtWsydO1fWXMCNaS5Wr16NBx54AH/4wx9E5WmJWq1ucmpZVFQE\ntVptdBtLZwKA7OxsbNy4EfHx8XBxcZEsj9BM58+fN0xVUlZWhvT0dNja2iIiIkK2TBqNBq6urrCz\ns4
OdnR3CwsKQnZ0tWQcgJFNmZiYmTJgAAIbhopycHISEhEiSSQhL7+NCWXIfF8KUfVzWIaBb00kA\naHU6iWnTpuG9997DunXrEBcXhx49eog++Jsjl97IaS5MERISgry8PBQUFKC+vh6HDx9u9mVGREQY\nhg4yMzPh7OxsGL6SK1NhYSFWrVqFefPmwcfHR7IsxmRau3Yt1q1bh3Xr1mHgwIGYPXu2ZAd/oZn6\n9++PM2fOQKfToaamBmfPnpX0WRlCMvn5+SEjIwMAUFxcjJycHHh7e0uWSQhL7+NCWHofF8KUfVzW\nO4GFTCfR2K+//oovv/wSr7zyiuy5LDXNRVpaWpPL9iZMmICkpCQAQHR0NABgy5YtSE9Ph4ODA+bM\nmYMuXbqYNYOxmTZs2ICffvoJHTt2BADY2Nhg5cqVsmZqbP369ejXr5/kl4EKybRr1y4cOHBA0kuJ\njclUWlqK9evXo6ioCDqdDhMmTJB85ss1a9bg9OnTKC0thYeHBx555BE0NDQYMgGW38fvlEmOfVzI\n53SL0H2cU0EQEbVRvBOYiKiNYgdARNRGsQMgImqj2AEQEbVR7ACIiNoodgBERG0UOwAiojaKHQAR\nURv1/48opZQj633pAAAAAElFTkSuQmCC\n",
"text": [
"<matplotlib.figure.Figure at 0x4642910>"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, all points having both attributes to 0 ([0, 0]) are blue. There are a couple blue points having attributes [0, 1] and [1, 0], but most are red. \n",
"\n",
"Let's train a classifier on the above dataset. Remember, $X$ is a list of feature vectors and $y$ the target label for each vector, i.e. the ground-truth."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"logit = LogisticRegression()\n",
"logit.fit(X, y)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 7,
"text": [
"LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
" intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have a trained classifier, let's predict new values:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print logit.predict(np.array([0,0]))\n",
"print logit.predict(np.array([1,0]))\n",
"print logit.predict(np.array([0,1]))\n",
"print logit.predict(np.array([1,1]))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[0]\n",
"[1]\n",
"[1]\n",
"[1]\n"
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pretty straighforward, but let's see what happen when we assign weights."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"weights = [0.3, 1.0]\n",
"logit.fit(X * weights, y)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 9,
"text": [
"LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
" intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print logit.predict(np.array([0, 0]) * weights)\n",
"print logit.predict(np.array([1, 0]) * weights)\n",
"print logit.predict(np.array([0, 1]) * weights)\n",
"print logit.predict(np.array([1, 1]) * weights)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[0]\n",
"[0]\n",
"[1]\n",
"[1]\n"
]
}
],
"prompt_number": 10
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first feature is now much less important as observed by the fact that all feature vectors having the first attribute set, but not the second i.e. [1, 0] are now classified as 0. Previously, these were classified as 1.\n",
"\n",
"By assigning such a small weight, I've essentially nullified the effect of the first feature. This might seem kind of contrived, but in a more realistic problem, where the feature space would have many dimensions, the effect of the weights would be more subtle."
]
}
],
"metadata": {}
}
]
}
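The weighting experiment in the notebook can be re-run without randomness to check why rescaling a feature changes the predictions. The sketch below is not part of the original notebook. It assumes Python 3 and a recent scikit-learn, replaces the random data with a fixed stand-in that has the same group sizes and labels as the run printed above, and uses an illustrative weight of 0.1 and an illustrative `C=100`; the helper name `corner_predictions` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical, fixed stand-in for the notebook's random data set: same group
# sizes and labels as the printed run, so the experiment is reproducible.
groups = [
    # (feature vector, number of 0 labels, number of 1 labels)
    ((0, 0), 18, 0),
    ((1, 0), 2, 10),
    ((0, 1), 0, 9),
    ((1, 1), 0, 11),
]
X = np.array([vec for vec, n0, n1 in groups for _ in range(n0 + n1)], dtype=float)
y = np.array([label for _, n0, n1 in groups for label in [0] * n0 + [1] * n1])

# The four possible feature vectors, in the order used in the notebook.
corners = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)


def corner_predictions(weights, C):
    """Fit on the weighted features, then classify the four weighted corners."""
    w = np.asarray(weights, dtype=float)
    model = LogisticRegression(C=C, max_iter=1000)  # L2 penalty by default
    model.fit(X * w, y)
    return model.predict(corners * w).tolist()


plain = corner_predictions([1.0, 1.0], C=1.0)       # no feature weighting
weighted = corner_predictions([0.1, 1.0], C=1.0)    # first feature scaled down
weak_reg = corner_predictions([0.1, 1.0], C=100.0)  # same weights, weaker penalty

print("unweighted, C=1:", plain)
print("weighted,   C=1:", weighted)
print("weighted, C=100:", weak_reg)
```

If the regularization explanation holds, the [1, 0] corner (the second entry of each list) should be classified as 1 in the first fit, as 0 once the first feature is scaled down, and as 1 again when `C` is raised, because a weaker penalty lets the coefficient of the scaled feature grow back. In other words, the weights act through the penalty rather than through the data itself.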