nishimotz/py35-161002.ipynb

## py35-161002.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# An Encounter with \"Machine Learning Meets Python\"\n",
    "\n",
    "\n",
    "## Sunday, October 2, 2016\n",
    "\n",
    "## @24motz (Takuya Nishimoto)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# PyCon mini Hiroshima 2016\n",
    "\n",
    "\n",
    "http://hiroshima.pycon.jp\n",
    "\n",
    "* Saturday, November 12, 2016\n",
    "* Now accepting speakers and attendees\n",
    "* Co-hosted with IoTLT Hiroshima"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
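   },
   "source": [
    "# \"Machine Learning Meets Python\"\n",
    "\n",
    "## Original\n",
    "\n",
    "* https://github.com/tkamishima/mlmpy\n",
    "* http://www.kamishima.net/mlmpyja/\n",
    "\n",
    "## Related reading\n",
    "\n",
    "* O'Reilly Japan: 「実践 機械学習システム」 (Japanese edition of *Building Machine Learning Systems with Python*)\n",
    "* O'Reilly Japan: 「Pythonによるデータ分析入門」 (Japanese edition of *Python for Data Analysis*)\n",
    "* Gijutsu-Hyoron: 「科学技術計算のためのPython入門」 (an introduction to Python for scientific computing)"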
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Just try it\n",
    "\n",
    "* Port nbayes1.py to Python 3 (replace xrange with range)\n",
    "\n",
    "## What is naive Bayes with categorical features?\n",
    "\n",
    "* The naive Bayes classifier from Chapter 6 of 「実践 機械学習システム」"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Predicting negative/positive from a sequence of words\n",
    "\n",
    "Assume that no simple rule can decide the label.\n",
    "\n",
    "```\n",
    "awesome ----- => posi\n",
    "awesome ----- => posi\n",
    "awesome crazy => posi\n",
    "------- crazy => posi\n",
    "------- crazy => nega\n",
    "------- crazy => nega\n",
    "\n",
    "awesome crazy => posi\n",
    "awesome crazy => nega\n",
    "------- ----- => posi\n",
    "------- ----- => nega\n",
    "```\n",
    "\n",
    "* In the bottom four, each word pattern is posi or nega with probability one half"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Defining the categories\n",
    "\n",
    "## Features\n",
    "\n",
    "* word absent = 0\n",
    "* word present = 1\n",
    "\n",
    "## Classes\n",
    "\n",
    "* nega = 0\n",
    "* posi = 1"
   ]
  },
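  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The 0/1 encoding above can be sketched as follows. This is only an illustration: the names VOCAB, encode and encode_label are hypothetical, and the column order (awesome first, crazy second) is taken from the example tweets.\n",
    "\n",
    "```python\n",
    "VOCAB = ['awesome', 'crazy']  # assumed column order of the feature vector\n",
    "\n",
    "def encode(text):\n",
    "    # 1 if the vocabulary word occurs in the text, 0 otherwise.\n",
    "    words = set(text.lower().split())\n",
    "    return [int(word in words) for word in VOCAB]\n",
    "\n",
    "def encode_label(label):\n",
    "    # nega = 0, posi = 1, as defined on the slide.\n",
    "    return {'nega': 0, 'posi': 1}[label]\n",
    "\n",
    "# One data row: the two features followed by the class.\n",
    "row = encode('awesome crazy') + [encode_label('nega')]\n",
    "```\n",
    "\n",
    "Each tweet then becomes one row of two features plus a class value, which is the layout a tab-separated data file can store."
   ]
  },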
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
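   },
   "source": [
    "# The sentiment generates the words\n",
    "\n",
    "* There is a posi model and a nega model\n",
    "* The posi model generates awesome/crazy with some probability\n",
    "* The nega model generates awesome/crazy with some probability\n",
    "\n",
    "## What we want to know (prediction)\n",
    "\n",
    "* The probability of posi/nega given that awesome/crazy is present (or absent)\n",
    "* The class with the higher probability is the estimate\n",
    "\n",
    "## What happened in the past (statistics)\n",
    "\n",
    "* When the tweet was posi, was awesome/crazy present (or absent)?\n",
    "* When the tweet was nega, was awesome/crazy present (or absent)?"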
   ]
  },
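  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "In symbols, the prediction step above can be sketched as follows. The notation is an assumption made for this slide, not copied from the original text: $y$ is the class (posi or nega), and $x_1$ and $x_2$ say whether awesome and crazy are present. Naive Bayes treats the words as independent once the class is fixed.\n",
    "\n",
    "$$\n",
    "\\Pr[y \\mid x_1, x_2] = \\frac{\\Pr[y] \\, \\Pr[x_1 \\mid y] \\, \\Pr[x_2 \\mid y]}{\\sum_{y'} \\Pr[y'] \\, \\Pr[x_1 \\mid y'] \\, \\Pr[x_2 \\mid y']}\n",
    "$$\n",
    "\n",
    "The factors on the right-hand side come from the past counts, and the class with the larger left-hand side is the estimate."
   ]
  },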
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
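   },
   "source": [
    "# Bayes' theorem\n",
    "\n",
    "http://www.kamishima.net/mlmpyja/nbayes1/nbayes.html\n",
    "\n",
    "* Eq. (1), right-hand side: (how likely posi/nega is) × (product of the occurrence probabilities of each word)\n",
    "* Eq. (4): (occurrence probability of a class) = (number of tweets in that class) / (total number of tweets)\n",
    "* Eq. (5): occurrence probability of the word awesome in class nega = (number of tweets that are nega and contain awesome) / (number of nega tweets)\n",
    "* Eq. (6), left-hand side: posterior probability, i.e. the probability of posi/nega after observing whether awesome/crazy is present\n",
    "* Eq. (6), right-hand side: (prior probability of posi/nega) × (probability that awesome/crazy is present/absent given posi/nega)"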
   ]
  },
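  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The counting estimates above can be sketched in NumPy. This is an illustration under stated assumptions, not the code in nbayes1.py: the function names joint and predict are hypothetical, no smoothing is applied, and the ten rows are the example tweets typed in as 0/1 values.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Columns: awesome, crazy, class (1 = posi, 0 = nega).\n",
    "data = np.array([\n",
    "    [1, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1], [0, 1, 0],\n",
    "    [0, 1, 0], [1, 1, 1], [1, 1, 0], [0, 0, 1], [0, 0, 0],\n",
    "])\n",
    "X, y = data[:, :-1], data[:, -1]\n",
    "\n",
    "def joint(x, c):\n",
    "    # Class probability times the product of per-word probabilities.\n",
    "    rows = X[y == c]                                # tweets in class c\n",
    "    prior = len(rows) / len(X)                      # share of class c\n",
    "    likelihood = np.prod((rows == x).mean(axis=0))  # per-feature match rate\n",
    "    return prior * likelihood\n",
    "\n",
    "def predict(x):\n",
    "    # Return the more probable class and the normalized posterior.\n",
    "    scores = np.array([joint(x, 0), joint(x, 1)])\n",
    "    return int(scores.argmax()), scores / scores.sum()\n",
    "\n",
    "labels = [predict(x)[0] for x in X]\n",
    "```\n",
    "\n",
    "For a tweet that contains only awesome, joint gives 0.6 × 4/6 × 3/6 = 0.2 for posi and 0.4 × 1/4 × 1/4 = 0.025 for nega, so posi is the estimate."
   ]
  },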
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\t0\t1\r\n",
      "1\t0\t1\r\n",
      "1\t1\t1\r\n",
      "0\t1\t1\r\n",
      "0\t1\t0\r\n",
      "0\t1\t0\r\n",
      "1\t1\t1\r\n",
      "1\t1\t0\r\n",
      "0\t0\t1\r\n",
      "0\t0\t0\r\n"
     ]
    }
   ],
   "source": [
    "%cat tweet2.tsv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from nbayes1 import NaiveBayes1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "data = np.genfromtxt('tweet2.tsv', dtype=int)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10, 3)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 0, 1],\n",
       "       [1, 0, 1],\n",
       "       [1, 1, 1],\n",
       "       [0, 1, 1],\n",
       "       [0, 1, 0],\n",
       "       [0, 1, 0],\n",
       "       [1, 1, 1],\n",
       "       [1, 1, 0],\n",
       "       [0, 0, 1],\n",
       "       [0, 0, 0]])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "X=data[:, :-1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 0],\n",
       "       [1, 0],\n",
       "       [1, 1],\n",
       "       [0, 1],\n",
       "       [0, 1],\n",
       "       [0, 1],\n",
       "       [1, 1],\n",
       "       [1, 1],\n",
       "       [0, 0],\n",
       "       [0, 0]])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10, 2)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "y=data[:, -1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10,)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 1, 1, 1, 0, 0, 1, 0, 1, 0])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "clr = NaiveBayes1()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "clr.fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "predict_y = clr.predict(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 1 1\n",
      "1 1 1\n",
      "2 1 1\n",
      "3 1 0\n",
      "4 0 0\n",
      "5 0 0\n",
      "6 1 1\n",
      "7 0 1\n",
      "8 1 1\n",
      "9 0 1\n"
     ]
    }
   ],
   "source": [
    "for i in range(len(y)):\n",
    "    print(i, y[i], predict_y[i])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 1 1\n",
      "1 1 1\n",
      "2 1 1\n",
      "3 1 0\n",
      "4 0 0\n",
      "5 0 0\n",
      "6 1 1\n",
      "7 0 1\n",
      "8 1 1\n",
      "9 0 1\n"
     ]
    }
   ],
   "source": [
    "for i, yi in enumerate(y):\n",
    "    print(i, yi, predict_y[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Discussion\n",
    "\n",
    "* Three samples are misclassified: indices 3, 7, and 9\n",
    "* Accuracy is 70%\n",
    "* Note that this is a closed test (training data = evaluation data)\n",
    "* See also: scikit-learn BernoulliNB"
   ]
  },
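  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The error indices and the accuracy above can be checked without reading the printed table by eye. A small sketch, with the two arrays copied from the printed comparison; the variable names wrong and accuracy are only illustrative.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# True labels and predictions, copied from the printed comparison.\n",
    "y = np.array([1, 1, 1, 1, 0, 0, 1, 0, 1, 0])\n",
    "predict_y = np.array([1, 1, 1, 0, 0, 0, 1, 1, 1, 1])\n",
    "\n",
    "wrong = np.flatnonzero(y != predict_y)  # indices where the labels differ\n",
    "accuracy = np.mean(y == predict_y)      # share of matching labels\n",
    "\n",
    "print(wrong.tolist(), accuracy)         # [3, 7, 9] 0.7\n",
    "```\n",
    "\n",
    "Because the same ten rows were used for fitting and for scoring, this number says nothing yet about unseen tweets."
   ]
  },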
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
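   },
   "source": [
    "# Efficiency of writing and running code\n",
    "\n",
    "* Fancy indexing (indexing with an integer array)\n",
    "* Boolean indexing (indexing with a boolean array)\n",
    "* Universal functions (ufuncs): can be created with frompyfunc/vectorize\n",
    "* Broadcasting (operating on arrays after aligning their shapes)\n",
    "* Cython"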
   ]
  },
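  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "A short sketch of the NumPy features listed above. The array values and the helper clip_at_30 are made up for illustration. Note that np.frompyfunc returns a ufunc whose results are object arrays, which is why the result is converted back to integers here.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "a = np.array([10, 20, 30, 40, 50])\n",
    "\n",
    "fancy = a[[0, 2, 4]]   # fancy indexing: pick positions 0, 2 and 4\n",
    "masked = a[a > 25]     # boolean indexing: keep elements above 25\n",
    "\n",
    "def clip_at_30(v):\n",
    "    # Plain Python function that works on one scalar.\n",
    "    return v if v < 30 else 30\n",
    "\n",
    "clip = np.frompyfunc(clip_at_30, 1, 1)  # ufunc with 1 input and 1 output\n",
    "clipped = clip(a).astype(int)           # object array converted to int\n",
    "\n",
    "col = np.arange(3).reshape(3, 1)        # shape (3, 1)\n",
    "row = np.arange(4)                      # shape (4,)\n",
    "table = col * 10 + row                  # broadcast to shape (3, 4)\n",
    "```\n",
    "\n",
    "A function wrapped this way is still called once per element in Python, so it helps with notation more than with speed."
   ]
  },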
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Summary\n",
    "\n",
    "* Solvable, already solved, likely solvable (formulation)\n",
    "* Language features and speed-up techniques (numpy)\n",
    "* Slide authoring environment (LaTeX in Jupyter)\n",
    "\n",
    "## Two Pythons\n",
    "\n",
    "* 2 and 3\n",
    "* pip and conda\n"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}