SOVIETIC-BOSS88/lesson4-project.ipynb

## lesson4-project.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# FAST AI JOURNEY: COURSE V3. PART 1. LESSON 4. \n",
    "## Documenting my fast.ai journey: 20 YEARS OF GAMES PROJECT. COLLABORATIVE FILTERING AND TABULAR MODELS."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this new project, we will analyze the '20 Years of Games' dataset, available on Kaggle, using what we have learned about collaborative filtering and tabular data.\n",
    "\n",
    "fast.ai course notebooks usually start with three magic lines (`%reload_ext autoreload`, `%autoreload 2` and `%matplotlib inline`); they ensure that any edits to libraries you make are reloaded automatically, and that any charts or images are displayed in the notebook. This notebook goes straight to the imports."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tabular Models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fastai import *\n",
    "from fastai.tabular import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting the Data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The 20 Years of Games dataset isn't available on the [fastai dataset page](https://course.fast.ai/datasets) due to copyright restrictions. However, you can download it from Kaggle. Let's see how to do this using the [Kaggle API](https://github.com/Kaggle/kaggle-api), which will be useful if you want to join a competition or use other Kaggle datasets later on.\n",
    "\n",
    "First, install the Kaggle API by uncommenting the following line and executing it, or by executing it in your terminal. Depending on your platform, you may need to modify this slightly, either by adding `source activate fastai` or similar, or by prefixing `pip` with a path. Have a look at how `conda install` is called for your platform in the appropriate *Returning to work* section of https://course-v3.fast.ai/. Depending on your environment, you may also need to append \"--user\" to the command."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "#! pip install kaggle --upgrade"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then you need to upload your Kaggle credentials to your instance. Log in to Kaggle and click on your profile picture in the top right corner, then 'My account'. Scroll down until you find a button named 'Create New API Token' and click on it. This will trigger the download of a file named 'kaggle.json'.\n",
    "\n",
    "Upload this file to the directory this notebook is running in by clicking \"Upload\" on your main Jupyter page, then uncomment and execute the next two commands (or run them in a terminal)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "#! mkdir -p ~/.kaggle/\n",
    "#! mv kaggle.json ~/.kaggle/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You're all set to download the data from [20 Years of Games](https://www.kaggle.com/egrinstein/20-years-of-games/version/2)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "#! chmod 600 ~/.kaggle/kaggle.json"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "PosixPath('data/ign')"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "path = Path('data/ign')\n",
    "path.mkdir(parents=True, exist_ok=True)\n",
    "path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "#! kaggle datasets download -d egrinstein/20-years-of-games -f ign.csv -p {path}\n",
    "#! unzip -q -n {path}/ign.csv.zip -d {path}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tabular data should be in a Pandas `DataFrame`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>score_phrase</th>\n",
       "      <th>title</th>\n",
       "      <th>url</th>\n",
       "      <th>platform</th>\n",
       "      <th>score</th>\n",
       "      <th>one_hot_score</th>\n",
       "      <th>genre</th>\n",
       "      <th>editors_choice</th>\n",
       "      <th>release_year</th>\n",
       "      <th>release_month</th>\n",
       "      <th>release_day</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Amazing</td>\n",
       "      <td>LittleBigPlanet PS Vita</td>\n",
       "      <td>/games/littlebigplanet-vita/vita-98907</td>\n",
       "      <td>PlayStation Vita</td>\n",
       "      <td>9.0</td>\n",
       "      <td>1</td>\n",
       "      <td>Platformer</td>\n",
       "      <td>Y</td>\n",
       "      <td>2012</td>\n",
       "      <td>9</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Amazing</td>\n",
       "      <td>LittleBigPlanet PS Vita -- Marvel Super Hero E...</td>\n",
       "      <td>/games/littlebigplanet-ps-vita-marvel-super-he...</td>\n",
       "      <td>PlayStation Vita</td>\n",
       "      <td>9.0</td>\n",
       "      <td>1</td>\n",
       "      <td>Platformer</td>\n",
       "      <td>Y</td>\n",
       "      <td>2012</td>\n",
       "      <td>9</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Great</td>\n",
       "      <td>Splice: Tree of Life</td>\n",
       "      <td>/games/splice/ipad-141070</td>\n",
       "      <td>iPad</td>\n",
       "      <td>8.5</td>\n",
       "      <td>1</td>\n",
       "      <td>Puzzle</td>\n",
       "      <td>N</td>\n",
       "      <td>2012</td>\n",
       "      <td>9</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Great</td>\n",
       "      <td>NHL 13</td>\n",
       "      <td>/games/nhl-13/xbox-360-128182</td>\n",
       "      <td>Xbox 360</td>\n",
       "      <td>8.5</td>\n",
       "      <td>1</td>\n",
       "      <td>Sports</td>\n",
       "      <td>N</td>\n",
       "      <td>2012</td>\n",
       "      <td>9</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Great</td>\n",
       "      <td>NHL 13</td>\n",
       "      <td>/games/nhl-13/ps3-128181</td>\n",
       "      <td>PlayStation 3</td>\n",
       "      <td>8.5</td>\n",
       "      <td>1</td>\n",
       "      <td>Sports</td>\n",
       "      <td>N</td>\n",
       "      <td>2012</td>\n",
       "      <td>9</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  score_phrase                                              title  \\\n",
       "0      Amazing                            LittleBigPlanet PS Vita   \n",
       "1      Amazing  LittleBigPlanet PS Vita -- Marvel Super Hero E...   \n",
       "2        Great                               Splice: Tree of Life   \n",
       "3        Great                                             NHL 13   \n",
       "4        Great                                             NHL 13   \n",
       "\n",
       "                                                 url          platform  score  \\\n",
       "0             /games/littlebigplanet-vita/vita-98907  PlayStation Vita    9.0   \n",
       "1  /games/littlebigplanet-ps-vita-marvel-super-he...  PlayStation Vita    9.0   \n",
       "2                          /games/splice/ipad-141070              iPad    8.5   \n",
       "3                      /games/nhl-13/xbox-360-128182          Xbox 360    8.5   \n",
       "4                           /games/nhl-13/ps3-128181     PlayStation 3    8.5   \n",
       "\n",
       "   one_hot_score       genre editors_choice  release_year  release_month  \\\n",
       "0              1  Platformer              Y          2012              9   \n",
       "1              1  Platformer              Y          2012              9   \n",
       "2              1      Puzzle              N          2012              9   \n",
       "3              1      Sports              N          2012              9   \n",
       "4              1      Sports              N          2012              9   \n",
       "\n",
       "   release_day  \n",
       "0           12  \n",
       "1           12  \n",
       "2           12  \n",
       "3           11  \n",
       "4           11  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv(path/'ign.csv')\n",
    "# Binary target: 1 if the review score is 7 or higher, 0 otherwise\n",
    "df['one_hot_score'] = df['score'].map(lambda x: 0 if x < 7 else 1)\n",
    "\n",
    "# Keep only the columns we need, in a fixed order\n",
    "cols = ['score_phrase', 'title', 'url', 'platform', 'score', 'one_hot_score', 'genre',\n",
    "        'editors_choice', 'release_year', 'release_month', 'release_day']\n",
    "\n",
    "df = df[cols]\n",
    "\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "dep_var = 'one_hot_score'\n",
    "\n",
    "# 'score_phrase' and 'editors_choice' are left out of the features,\n",
    "# as is 'score', from which the target is derived\n",
    "cat_names = ['title', 'platform', 'genre',\n",
    "             'release_year', 'release_month', 'release_day']\n",
    "\n",
    "# Every feature is treated as categorical, so no cont_names are passed\n",
    "procs = [FillMissing, Categorify, Normalize]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "test = TabularList.from_df(df.iloc[800:1000].copy(), path=path, cat_names=cat_names)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = (TabularList.from_df(df, path=path, cat_names=cat_names, procs=procs)\n",
    "                           .split_by_idx(list(range(800,1000)))\n",
    "                           .label_from_df(cols=dep_var)\n",
    "                           .add_test(test, label=0)\n",
    "                           .databunch())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>  <col width='10px'>  <col width='10px'>  <col width='10px'>  <col width='10px'>  <col width='10px'>  <col width='10px'>  <col width='10px'>  <tr>\n",
       "    <th>title</th>\n",
       "    <th>platform</th>\n",
       "    <th>genre</th>\n",
       "    <th>release_year</th>\n",
       "    <th>release_month</th>\n",
       "    <th>release_day</th>\n",
       "    <th>target</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Battlefield: Bad Company 2</th>\n",
       "    <th>PC</th>\n",
       "    <th>Shooter</th>\n",
       "    <th>2010</th>\n",
       "    <th>3</th>\n",
       "    <th>2</th>\n",
       "    <th>1</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Tetris Worlds</th>\n",
       "    <th>PC</th>\n",
       "    <th>Puzzle</th>\n",
       "    <th>2002</th>\n",
       "    <th>1</th>\n",
       "    <th>9</th>\n",
       "    <th>0</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>WWE SmackDown vs. Raw 2008</th>\n",
       "    <th>PlayStation Portable</th>\n",
       "    <th>Wrestling</th>\n",
       "    <th>2007</th>\n",
       "    <th>11</th>\n",
       "    <th>1</th>\n",
       "    <th>0</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Mortal Kombat: Shaolin Monks</th>\n",
       "    <th>PlayStation 2</th>\n",
       "    <th>Fighting, Action</th>\n",
       "    <th>2005</th>\n",
       "    <th>9</th>\n",
       "    <th>16</th>\n",
       "    <th>1</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Moon Diver</th>\n",
       "    <th>PlayStation 3</th>\n",
       "    <th>Action</th>\n",
       "    <th>2011</th>\n",
       "    <th>4</th>\n",
       "    <th>4</th>\n",
       "    <th>1</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Magic: The Gathering -- Duels of the Planeswalkers 2013</th>\n",
       "    <th>iPad</th>\n",
       "    <th>Card, Battle</th>\n",
       "    <th>2012</th>\n",
       "    <th>6</th>\n",
       "    <th>25</th>\n",
       "    <th>1</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Chop Chop Runner</th>\n",
       "    <th>iPhone</th>\n",
       "    <th>Action</th>\n",
       "    <th>2010</th>\n",
       "    <th>4</th>\n",
       "    <th>7</th>\n",
       "    <th>0</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Watchmen: The End is Nigh -- Part 2</th>\n",
       "    <th>PC</th>\n",
       "    <th>Action</th>\n",
       "    <th>2009</th>\n",
       "    <th>8</th>\n",
       "    <th>26</th>\n",
       "    <th>0</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Mario Bros.-e</th>\n",
       "    <th>Game Boy Advance</th>\n",
       "    <th>Platformer</th>\n",
       "    <th>2002</th>\n",
       "    <th>11</th>\n",
       "    <th>15</th>\n",
       "    <th>0</th>\n",
       "  </tr>\n",
       "  <tr>\n",
       "    <th>Serious Sam: The Second Encounter</th>\n",
       "    <th>PC</th>\n",
       "    <th>Shooter</th>\n",
       "    <th>2002</th>\n",
       "    <th>2</th>\n",
       "    <th>6</th>\n",
       "    <th>1</th>\n",
       "  </tr>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "data.show_batch(rows=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "learn = tabular_learner(data, layers=[200,100], metrics=accuracy)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TabularModel(\n",
       "  (embeds): ModuleList(\n",
       "    (0): Embedding(12442, 50)\n",
       "    (1): Embedding(60, 31)\n",
       "    (2): Embedding(113, 50)\n",
       "    (3): Embedding(23, 12)\n",
       "    (4): Embedding(13, 7)\n",
       "    (5): Embedding(32, 17)\n",
       "  )\n",
       "  (emb_drop): Dropout(p=0.0)\n",
       "  (bn_cont): BatchNorm1d(0, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
       "  (layers): Sequential(\n",
       "    (0): Linear(in_features=167, out_features=200, bias=True)\n",
       "    (1): ReLU(inplace)\n",
       "    (2): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
       "    (3): Linear(in_features=200, out_features=100, bias=True)\n",
       "    (4): ReLU(inplace)\n",
       "    (5): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
       "    (6): Linear(in_features=100, out_features=2, bias=True)\n",
       "  )\n",
       ")"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "learn.model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total time: 00:02\n",
      "epoch  train_loss  valid_loss  accuracy\n",
      "1      0.530561    0.692435    0.600000  (00:02)\n",
      "\n"
     ]
    }
   ],
   "source": [
    "learn.fit(1, 1e-2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Inference."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "score_phrase                                                Amazing\n",
       "title             LittleBigPlanet PS Vita -- Marvel Super Hero E...\n",
       "url               /games/littlebigplanet-ps-vita-marvel-super-he...\n",
       "platform                                           PlayStation Vita\n",
       "score                                                             9\n",
       "one_hot_score                                                     1\n",
       "genre                                                    Platformer\n",
       "editors_choice                                                    Y\n",
       "release_year                                                   2012\n",
       "release_month                                                     9\n",
       "release_day                                                      12\n",
       "Name: 1, dtype: object"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "row = df.iloc[1]\n",
    "row"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1, tensor(0), tensor([0.9383, 0.0617]))"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "learn.predict(row)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}