The goal of this article is to take readers who have never written PyTorch and turn them into PyTorch beginners (?)
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# A Lightning-Fast Introduction to PyTorch"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Why PyTorch?\n",
"\n",
"![](https://i.imgflip.com/27c19g.jpg)\n",
"\n",
"Advantages:\n",
"- Dynamic computational graphs [1]\n",
"- A well-designed API\n",
"- Rich documentation\n",
"- An active community and a fast release cycle (a stable release roughly every three months)\n",
"\n",
"Reference:\n",
"- [1] [Deep Learning with Dynamic Computation Graphs](https://arxiv.org/pdf/1702.02181.pdf)\n",
"\n",
"Related discussions:\n",
"- [PyTorch or TensorFlow?](https://www.kdnuggets.com/2017/08/pytorch-tensorflow.html) (2017.08)\n",
"- [PyTorch, Dynamic Computational Graphs and Modular Deep Learning](https://medium.com/intuitionmachine/pytorch-dynamic-computational-graphs-and-modular-deep-learning-7e7f89f18d1) (2017.01)\n",
"- https://www.google.com/search?q=pytorch+vs+tensorflow\n",
"\n",
"\n",
"## Installation\n",
"\n",
"I recommend using the Conda package manager.\n",
"\n",
"- Anaconda: https://anaconda.org/\n",
"- Miniconda: https://conda.io/miniconda.html\n",
"\n",
"### Dependencies\n",
"\n",
"Officially, CUDA 7.5+ and cuDNN v6.x+ are recommended.<br>\n",
"I recommend CUDA 9.0 with cuDNN v7.x.<br>\n",
"- CUDA: https://developer.nvidia.com/cuda-90-download-archive\n",
"- cuDNN: https://developer.nvidia.com/cudnn\n",
"\n",
"### Binaries\n",
"\n",
"**On Linux**\n",
"\n",
"Reference: http://pytorch.org/\n",
"\n",
"Select your Python version, CUDA version, and package manager.<br>\n",
"Then simply copy the generated command to install.\n",
"\n",
"Example:\n",
"\n",
"```Bash\n",
"conda create -n pytorch python=3.6\n",
"conda install pytorch torchvision cuda90 -c pytorch\n",
"```\n",
"\n",
"**On Windows**\n",
"\n",
"There are no official binaries for Windows; normally you would have to download the source code and build it yourself.<br>\n",
"Alternatively, you can download prebuilt packages from [peterjc123](https://github.com/peterjc123/).\n",
"\n",
"Reference: https://github.com/peterjc123/pytorch-scripts#easy-installation\n",
"\n",
"Example:\n",
"\n",
"```PowerShell\n",
"conda create -n pytorch python=3.6\n",
"conda install -c peterjc123 pytorch cuda90\n",
"```\n",
"\n",
"Note that he does not provide torchvision on conda; install it with pip or find a wheel file yourself.<br>\n",
"\n",
"Reference: https://pypi.python.org/pypi/torchvision\n",
"\n",
"Example:\n",
"\n",
"```PowerShell\n",
"pip install torchvision\n",
"```\n",
"\n",
"### From Source\n",
"\n",
"See: https://github.com/pytorch/pytorch#from-source\n"
] | |
}, | |
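After installation, a quick sanity check is worth running (a minimal sketch; `torch.zeros` and `torch.cuda.is_available` are stable across versions, but the exact version string depends on your install):

```python
import torch

# confirm the package imports and report the build
print(torch.__version__)

# True only if a CUDA build of PyTorch sees a usable GPU
print(torch.cuda.is_available())

# allocating a tensor exercises the core library
x = torch.zeros(2, 3)
print(x.size())
```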
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"## Tutorial\n", | |
"\n", | |
"Official site: http://pytorch.org/tutorials/index.html"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Tensor\n", | |
"\n", | |
"A tensor generalizes scalars, vectors, and matrices to arbitrary order:\n",
"\n",
"- 0th-order tensor: scalar <br>\n",
"- 1st-order tensor: Euclidean vector <br>\n",
"e.g. a 2D velocity <x, y>, shape = (2,)\n",
"- 2nd-order tensor: matrix <br>\n",
"e.g. a grayscale image (224x224), shape = (224, 224)\n",
"- 3rd-order tensor <br>\n",
"e.g. a color image (3x224x224), shape = (3, 224, 224)\n",
"- 4th-order tensor <br>\n",
"e.g. a batch of 32 color images (32x3x224x224), shape = (32, 3, 224, 224)\n",
"\n",
"**Note**\n",
"\n",
"When using `torch.nn.Conv2d`, <br>\n",
"the input image tensor must be laid out as `(nSamples, nChannels, Height, Width)`.\n",
"\n",
"For the formal definition of a tensor, see: https://en.wikipedia.org/wiki/Tensor"
] | |
}, | |
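The orders above can be checked directly in code. This sketch (shapes chosen to mirror the examples in the list) builds one tensor per order and prints its shape:

```python
import torch

velocity = torch.zeros(2)               # 1st-order: 2D velocity, shape (2,)
gray = torch.zeros(224, 224)            # 2nd-order: grayscale image
color = torch.zeros(3, 224, 224)        # 3rd-order: color image
batch = torch.zeros(32, 3, 224, 224)    # 4th-order: batch of 32 color images

for name, t in [('velocity', velocity), ('gray', gray),
                ('color', color), ('batch', batch)]:
    print(name, tuple(t.size()))
```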
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### torch.Tensor\n", | |
"\n", | |
"Reference: http://pytorch.org/docs/master/tensors.html\n",
"\n",
"A data structure similar to `numpy.ndarray`; it supports nearly all the same operations."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
"1.00000e-26 *\n", | |
" 3.3605 0.0000 2.1737\n", | |
" 0.0000 0.0000 0.0000\n", | |
" 0.0000 0.0000 3.3609\n", | |
" 0.0000 2.1737 0.0000\n", | |
" 0.0000 0.0000 0.0000\n", | |
"[torch.FloatTensor of size 5x3]\n", | |
"\n", | |
"y: [[3.36046857e-26 9.20653091e-43 2.17367911e-26]\n", | |
" [9.20653091e-43 0.00000000e+00 0.00000000e+00]\n", | |
" [2.10194770e-44 0.00000000e+00 3.36086300e-26]\n", | |
" [9.20653091e-43 2.17367911e-26 9.20653091e-43]\n", | |
" [0.00000000e+00 0.00000000e+00 2.10194770e-44]]\n" | |
] | |
} | |
], | |
"source": [ | |
"import torch\n", | |
"import numpy as np\n", | |
"\n", | |
"x = torch.Tensor(5, 3)\n", | |
"print(f'x: {x}')\n", | |
"y = np.ndarray((5, 3))\n", | |
"print(f'y: {y}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"Tensors support Python's native arithmetic operators."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"y: \n", | |
" 10\n", | |
" 20\n", | |
" 30\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"z = x + y: \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"z = x.add(y): \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"print(f'x: {x}')\n", | |
"\n", | |
"y = torch.Tensor([10, 20, 30])\n", | |
"print(f'y: {y}')\n", | |
"\n", | |
"z = x + y\n", | |
"print(f'z = x + y: {z}')\n", | |
"\n", | |
"z = x.add(y)\n", | |
"print(f'z = x.add(y): {z}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"Any operation whose name ends with an underscore `_` is an **in-place operation**."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"z = x.add(y): \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"x (x.add(y)): \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"x (x.add_(y)): \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"y = torch.Tensor([10, 20, 30])\n", | |
"\n", | |
"z = x.add(y)\n", | |
"\n", | |
"print(f'z = x.add(y): {z}')\n", | |
"print(f'x (x.add(y)): {x}')\n", | |
"\n", | |
"x.add_(y) # in-place operation\n", | |
"\n", | |
"print(f'x (x.add_(y)): {x}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"\n", | |
"### torch.Tensor <=> numpy.ndarray\n", | |
"\n", | |
"`torch.Tensor` and `numpy.ndarray` objects can be converted back and forth."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"y: [1. 2. 3.]\n", | |
"\n", | |
"z: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"print(f'x: {x}')\n", | |
"\n", | |
"# torch.Tensor => NumPy.ndarray\n", | |
"y = x.numpy()\n", | |
"print(f'y: {y}\\n')\n", | |
"\n", | |
"# torch.Tensor <= NumPy.ndarray\n", | |
"z = torch.from_numpy(y)\n", | |
"print(f'z: {z}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"### torch.Tensor <=> torch.cuda.Tensor\n", | |
"Calling `.cuda()` on a `Tensor` moves it onto the GPU for computation.<br>\n",
"If you have multiple GPUs, pass an index to specify which one to use.\n",
"\n",
"A `torch.cuda.Tensor` can be moved back to a `torch.Tensor` with `.cpu()`."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"y: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.cuda.FloatTensor of size 3 (GPU 0)]\n", | |
"\n", | |
"z: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.cuda.FloatTensor of size 3 (GPU 1)]\n", | |
"\n", | |
"z: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"print(f'x: {x}')\n", | |
"\n", | |
"y = x.cuda()\n", | |
"print(f'y: {y}')\n", | |
"\n", | |
"z = y.cuda(1)\n", | |
"print(f'z: {z}')\n", | |
"\n", | |
"z = z.cpu()\n", | |
"print(f'z: {z}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### torch.autograd.Variable\n", | |
"\n", | |
"Reference: http://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html\n",
"\n",
"We now know how to store data in a `Tensor`.<br>\n",
"The next question: how do we differentiate with respect to a variable and implement backpropagation?\n",
"\n",
"PyTorch provides the `torch.autograd.Variable` package; wrap a `Tensor` object in a `Variable`,<br>\n",
"set `requires_grad=True`, and then call `.backward()` to compute gradients.\n",
"\n",
"After a `Tensor` is wrapped in a `Variable`,<br>\n",
"`.data` holds the `Tensor` you stored, and after you call `.backward()` the gradient is stored in `.grad`.<br>\n",
"\n", | |
"![](http://pytorch.org/tutorials/_images/Variable.png)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"#### Example.\n", | |
"\n", | |
"$$\n", | |
"x = 10\n", | |
"$$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"$$\n", | |
"y = (2x + 2)^2\n", | |
"$$" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x:\n", | |
"Variable containing:\n", | |
" 10\n", | |
"[torch.FloatTensor of size 1]\n", | |
"\n", | |
"y:\n", | |
"Variable containing:\n", | |
" 484\n", | |
"[torch.FloatTensor of size 1]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"from torch.autograd import Variable\n", | |
"\n", | |
"x = torch.Tensor([10])\n", | |
"x = Variable(x, requires_grad=True)\n", | |
"print(f'x:\\n{x}')\n", | |
"\n", | |
"y = (2*x + 2)**2\n", | |
"print(f'y:\\n{y}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"$$\n", | |
"\\frac {dy} {dx} = 8x+8\\bigr\\rvert_{x=10} = 88\n", | |
"$$" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 24, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x.grad:\n", | |
"Variable containing:\n", | |
" 88\n", | |
"[torch.FloatTensor of size 1]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"y.backward()\n", | |
"print(f'x.grad:\\n{x.grad}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Neural Network\n", | |
"\n", | |
"Reference: http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html\n",
"\n",
"Training a neural network roughly follows these steps:\n",
"1. Define the network with `torch.nn`\n",
"1. Define an optimizer with `torch.optim`, choosing the strategy used to optimize your parameters\n",
"1. Load the data and compute the loss\n",
"1. Compute gradients and update the parameters via backpropagation"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The following uses the example from the official tutorial:\n",
"\n",
"#### 1. Define the network\n",
"\n",
"Typically a network inherits from `nn.Module`, and such a subclass must implement `forward()`.\n",
"- `__init__()` defines which layers your network has.\n",
"- `forward()` defines how input data flows through those layers to produce the final output."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Net(\n", | |
" (conv1): Conv2d (1, 6, kernel_size=(5, 5), stride=(1, 1))\n", | |
" (conv2): Conv2d (6, 16, kernel_size=(5, 5), stride=(1, 1))\n", | |
" (fc1): Linear(in_features=400, out_features=120)\n", | |
" (fc2): Linear(in_features=120, out_features=84)\n", | |
" (fc3): Linear(in_features=84, out_features=10)\n", | |
")\n" | |
] | |
} | |
], | |
"source": [ | |
"import torch\n", | |
"from torch.autograd import Variable\n", | |
"import torch.nn as nn\n", | |
"import torch.nn.functional as F\n", | |
"\n", | |
"\n", | |
"class Net(nn.Module):\n", | |
"\n", | |
"    def __init__(self):\n",
"        super(Net, self).__init__()\n",
"        # 1 input image channel, 6 output channels, 5x5 square convolution\n",
"        # kernel\n",
"        self.conv1 = nn.Conv2d(1, 6, 5)\n",
"        self.conv2 = nn.Conv2d(6, 16, 5)\n",
"        # an affine operation: y = Wx + b\n",
"        self.fc1 = nn.Linear(16 * 5 * 5, 120)\n",
"        self.fc2 = nn.Linear(120, 84)\n",
"        self.fc3 = nn.Linear(84, 10)\n",
"\n",
"    def forward(self, x):\n",
"        # Max pooling over a (2, 2) window\n",
"        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n",
"        # If the size is a square you can only specify a single number\n",
"        x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n",
"        x = x.view(-1, self.num_flat_features(x))\n",
"        x = F.relu(self.fc1(x))\n",
"        x = F.relu(self.fc2(x))\n",
"        x = self.fc3(x)\n",
"        return x\n",
"\n",
"    def num_flat_features(self, x):\n",
"        size = x.size()[1:]  # all dimensions except the batch dimension\n",
"        num_features = 1\n",
"        for s in size:\n",
"            num_features *= s\n",
"        return num_features\n",
"\n", | |
"\n", | |
"net = Net()\n", | |
"print(net)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"Once `forward()` is correctly defined, feeding an input through the network computes the result. <br>\n",
"A later call to `backward()` will compute the gradients of all parameters for you.\n",
"\n",
"An input `Tensor` must be wrapped in a `Variable` before being fed into the network, and the output is also a `Variable` object. <br>\n",
"Here is a forward-pass example:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 27, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Variable containing:\n", | |
" 0.0144 -0.0174 0.0186 -0.1524 0.1014 0.0666 -0.0254 0.0241 -0.0485 -0.0240\n", | |
"[torch.FloatTensor of size 1x10]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"input = Variable(torch.randn(1, 1, 32, 32))\n", | |
"out = net(input)\n", | |
"print(out)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"#### 2. Define the optimizer\n",
"\n",
"First pick the optimization algorithm you want from `torch.optim`. <br>\n",
"The argument given to `optim` must be an iterable whose elements are all `Variable` objects.<br>\n",
"\n",
"For the network above, you can simply pass `net.parameters()`, <br>\n",
"so that the optimizer knows the parameters to optimize are those of your network."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 28, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import torch.optim as optim\n", | |
"\n", | |
"optimizer = optim.SGD(net.parameters(), lr=0.01)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### 3. Load the data and compute the loss\n",
"\n",
"**Note** <br>\n",
"Images are usually loaded with `PIL.Image` and then converted to a `Tensor`. <br>\n",
"If you load only a single image, however, it will be a 3rd-order tensor, <br>\n",
"e.g. shape: (3, 224, 224). <br>\n",
"In that case, `Tensor.unsqueeze(0)` expands it to shape (1, 3, 224, 224). <br>"
] | |
}, | |
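The `unsqueeze(0)` trick from the note above looks like this (a random tensor stands in for an image already converted from `PIL.Image`):

```python
import torch

image = torch.randn(3, 224, 224)   # one image: a 3rd-order tensor
batch = image.unsqueeze(0)         # insert a batch dimension at position 0

print(image.size())   # torch.Size([3, 224, 224])
print(batch.size())   # torch.Size([1, 3, 224, 224])
```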
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Pick whichever loss function you like from `torch.nn`."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 29, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"criterion = nn.MSELoss()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### 4. Compute gradients\n",
"\n",
"First, call `optimizer.zero_grad()` to zero the `.grad` of every parameter in the network. <br>\n",
"Next, call `loss.backward()` to compute gradients and propagate them to all parameters. <br>\n",
"Finally, call `optimizer.step()` to apply the parameter update. <br>"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# in your training loop:\n", | |
"optimizer.zero_grad() # zero the gradient buffers\n", | |
"output = net(input)\n", | |
"loss = criterion(output, target)\n", | |
"\n", | |
"loss.backward()\n", | |
"optimizer.step() # Does the update" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The official example code carries the comment `# in your training loop:`. <br>\n",
"That is, this part usually sits inside a for-loop <br>\n",
"that keeps loading training data, computing gradients, and updating parameters; once the loss converges, you are done."
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Data Loading and Processing\n",
"\n",
"Let's take the Kaggle [Plant Seedlings Classification](https://www.kaggle.com/c/plant-seedlings-classification) competition as an example. <br>\n",
"It is a multi-class classification problem, and the dataset directory is laid out as follows:\n",
"\n",
"```\n",
"/plant-seedlings-classification/\n",
" |- train/<category>/*.png\n",
" |- test/*.png\n",
" |- sample_submission.csv\n",
"```\n",
"\n",
"Scanning the `train/` folder with `Path.glob('*')` therefore yields every category and its index. <br>\n",
"Then, inside each category folder, `Path.glob('*.png')` yields all of its images, and the folder name gives the label. <br>\n",
"\n",
"Using only the Python standard library, anyone could write a program that reads the dataset and builds training data to train a model. <br>\n",
"But a few things remain cumbersome:\n",
"1. How do you fetch data according to the batch size?\n",
"1. How do you compute with multiprocessing?\n",
"1. How do you process the data as it is loaded?\n",
"\n",
"Fortunately, PyTorch provides `torch.utils.data.Dataset` and `torch.utils.data.DataLoader`, <br>\n",
"and torchvision offers many high-level utilities that make data handling much easier.\n",
"\n",
"References:\n",
"- http://pytorch.org/tutorials/beginner/data_loading_tutorial.html\n",
"- http://pytorch.org/docs/0.3.0/data.html\n",
"- http://pytorch.org/docs/0.3.0/torchvision/index.html\n",
"\n",
"Here is a Dataset class for the Plant Seedlings Classification competition:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example/blob/master/dataset.py\n", | |
"from torch.utils.data import Dataset\n", | |
"from pathlib import Path\n", | |
"from PIL import Image\n", | |
"\n", | |
"\n", | |
"class PlantSeedlingDataset(Dataset):\n", | |
"    def __init__(self, root_dir, transform=None):\n",
"        self.root_dir = Path(root_dir)\n",
"        self.x = []\n",
"        self.y = []\n",
"        self.transform = transform\n",
"        self.num_classes = 0\n",
"\n",
"        if self.root_dir.name == 'train':\n",
"            for i, _dir in enumerate(self.root_dir.glob('*')):\n",
"                for file in _dir.glob('*'):\n",
"                    self.x.append(file)\n",
"                    self.y.append(i)\n",
"\n",
"                self.num_classes += 1\n",
"\n",
"    def __len__(self):\n",
"        return len(self.x)\n",
"\n",
"    def __getitem__(self, index):\n",
"        image = Image.open(self.x[index]).convert('RGB')\n",
"\n",
"        if self.transform:\n",
"            image = self.transform(image)\n",
"\n", | |
" return image, self.y[index]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We define a class that inherits from the abstract class `torch.utils.data.Dataset` and implement:\n",
"1. `__len__`: returns the size of the dataset.\n",
"1. `__getitem__`: returns one sample, as a pair of input data and label. <br>\n",
"The input data must be either a `torch.Tensor` or a `PIL.Image`.\n",
"Inside `__getitem__` you can also apply `transforms`, so the data is converted and preprocessed as it is loaded.\n",
"\n",
"With our `Dataset` class implemented, we can now pair it with a `DataLoader`."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example/blob/master/train.py\n", | |
"import torch\n", | |
"import torch.nn as nn\n", | |
"from models import VGG16\n", | |
"from dataset import PlantSeedlingDataset\n", | |
"from utils import parse_args\n", | |
"from torch.autograd import Variable\n", | |
"from torch.utils.data import DataLoader\n", | |
"from torchvision import transforms\n", | |
"from pathlib import Path\n", | |
"import copy\n", | |
"\n", | |
"# ... (omitted)\n",
"\n", | |
"def train():\n", | |
"    data_transform = transforms.Compose([\n",
"        transforms.RandomResizedCrop(224),\n",
"        transforms.ToTensor(),\n",
"        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])\n",
"    ])"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Above, the `torchvision.transforms` module defines which transformations the data goes through as it is loaded. <br>\n",
"Best of all, `transforms.Compose` chains these transformations together from a list, making it easy to apply a whole pipeline of operations to each image. <br>\n",
"\n",
"As the code shows, each loaded image is:\n",
"1. `RandomResizedCrop(224)`: randomly cropped to a 224x224 region.\n",
"1. `ToTensor()`: converted from a `PIL.Image` object to a `torch.Tensor` object.\n",
"1. `Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])`: normalized per channel (3 channels)."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"    train_set = PlantSeedlingDataset(Path(DATASET_ROOT).joinpath('train'), data_transform)\n",
"    data_loader = DataLoader(dataset=train_set, batch_size=32, shuffle=True, num_workers=1)"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"```\n", | |
"DataLoader(dataset=train_set, batch_size=32, shuffle=True, num_workers=1)\n", | |
"```\n", | |
"lets us decide the batch size, whether to shuffle the dataset, and how many worker processes to use. <br>\n",
"What remains is the complete training code."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"    model = VGG16(num_classes=train_set.num_classes)\n",
"    model = model.cuda(CUDA_DEVICES)\n",
"    model.train()\n",
"\n",
"    best_model_params = copy.deepcopy(model.state_dict())\n",
"    best_acc = 0.0\n",
"    num_epochs = 50\n",
"    criterion = nn.CrossEntropyLoss()\n",
"    optimizer = torch.optim.SGD(params=model.parameters(), lr=0.001, momentum=0.9)\n",
"\n",
"    for epoch in range(num_epochs):\n",
"        print(f'Epoch: {epoch + 1}/{num_epochs}')\n",
"        print('-' * len(f'Epoch: {epoch + 1}/{num_epochs}'))\n",
"\n",
"        training_loss = 0.0\n",
"        training_corrects = 0\n",
"\n",
"        for i, (inputs, labels) in enumerate(data_loader):\n",
"            inputs = Variable(inputs.cuda(CUDA_DEVICES))\n",
"            labels = Variable(labels.cuda(CUDA_DEVICES))\n",
"\n",
"            optimizer.zero_grad()\n",
"\n",
"            outputs = model(inputs)\n",
"            _, preds = torch.max(outputs.data, 1)\n",
"            loss = criterion(outputs, labels)\n",
"\n",
"            loss.backward()\n",
"            optimizer.step()\n",
"\n",
"            training_loss += loss.data[0] * inputs.size(0)\n",
"            training_corrects += torch.sum(preds == labels.data)\n",
"\n",
"        training_loss = training_loss / len(train_set)\n",
"        training_acc = training_corrects / len(train_set)\n",
"\n",
"        print(f'Training loss: {training_loss:.4f}\\taccuracy: {training_acc:.4f}\\n')\n",
"\n",
"        if training_acc > best_acc:\n",
"            best_acc = training_acc\n",
"            best_model_params = copy.deepcopy(model.state_dict())\n",
"\n",
"    model.load_state_dict(best_model_params)\n",
"    torch.save(model, f'model-{best_acc:.02f}-best_train_acc.pth')"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Save & Load Model (Parameters)\n", | |
"\n", | |
"Reference: http://pytorch.org/docs/master/notes/serialization.html#recommend-saving-models\n",
"\n",
"There are two approaches:\n",
"1. Save/load the parameters (officially recommended) <br>\n",
"    - Save\n",
"```\n",
"torch.save(the_model.state_dict(), PATH)\n",
"```\n",
"    - Load\n",
"```\n",
"the_model = TheModelClass(*args, **kwargs)\n",
"the_model.load_state_dict(torch.load(PATH))\n",
"```\n",
"2. Save/load the entire model (breaks once the model class is modified) <br>\n",
"    - Save\n",
"```\n",
"torch.save(the_model, PATH)\n",
"```\n",
"    - Load\n",
"```\n",
"the_model = torch.load(PATH)\n",
"```\n",
"\n",
"The conventional extension for a model file is `.pth`, though the developers have said the extension makes no difference whatsoever."
] | |
}, | |
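As a concrete sketch of the recommended approach (using a tiny `nn.Linear` stand-in rather than the VGG16 from the competition code), a `state_dict` save/load round-trips the parameters exactly:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                      # tiny stand-in model
torch.save(model.state_dict(), 'model.pth')  # save only the parameters

restored = nn.Linear(4, 2)                   # rebuild the same architecture first
restored.load_state_dict(torch.load('model.pth'))

# the reloaded weights match the originals exactly
print(torch.equal(model.weight.data, restored.weight.data))  # True
```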
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"With this brief introduction, you should now have a basic grasp of PyTorch. <br>\n",
"The next question is how to become proficient and use PyTorch fluently:\n",
"1. Build things yourself <br>\n",
"This is the most important. You will never learn to write it without implementing things by hand.\n",
"1. Read the official tutorials <br>\n",
"http://pytorch.org/tutorials/index.html\n",
"1. Read the official documentation <br>\n",
"http://pytorch.org/docs/master/index.html\n",
"1. Read the source code <br>\n",
"https://github.com/pytorch/pytorch\n",
"1. Browse the official forum <br>\n",
"https://discuss.pytorch.org/\n",
"1. The Ptt DataScience board\n",
"\n",
"\n",
"Finally:\n",
"\n",
"> The road to mastery is a lonely one."
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"This article is maintained on gist; if you have any questions, feel free to leave a comment below. <br>\n",
"https://gist.github.com/remorsecs/959b2e9ce39712366cea676426a34945 <br>\n",
"GitHub repo for the Kaggle competition: <br>\n",
"https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example"
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.4" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |