The goal of this article is to take readers who have never written PyTorch and turn them into PyTorch beginners (?)
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# A Lightning-Fast Introduction to PyTorch"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Why PyTorch?\n",
"\n",
"![](https://i.imgflip.com/27c19g.jpg)\n",
"\n",
"Advantages:\n",
"- Dynamic computational graphs [1]\n",
"- A well-designed API\n",
"- Rich documentation\n",
"- An active community and a fast release cycle (a stable release roughly every three months)\n",
"\n",
"Reference:\n",
"- [1] [Deep Learning with Dynamic Computation Graphs](https://arxiv.org/pdf/1702.02181.pdf)\n",
"\n",
"Related discussions:\n",
"- [PyTorch or TensorFlow?](https://www.kdnuggets.com/2017/08/pytorch-tensorflow.html) (2017.08)\n",
"- [PyTorch, Dynamic Computational Graphs and Modular Deep Learning](https://medium.com/intuitionmachine/pytorch-dynamic-computational-graphs-and-modular-deep-learning-7e7f89f18d1) (2017.01)\n",
"- https://www.google.com/search?q=pytorch+vs+tensorflow\n",
"\n",
"\n",
"## Installation\n",
"\n",
"I recommend using the Conda package manager.\n",
"\n",
"- Anaconda: https://anaconda.org/\n",
"- Miniconda: https://conda.io/miniconda.html\n",
"\n",
"### Dependencies\n",
"\n",
"Officially, CUDA 7.5+ and cuDNN v6.x+ are recommended.<br>\n",
"I recommend CUDA 9.0 with cuDNN v7.x.<br>\n",
"- CUDA: https://developer.nvidia.com/cuda-90-download-archive\n",
"- cuDNN: https://developer.nvidia.com/cudnn\n",
"\n",
"### Binaries\n",
"\n",
"**On Linux**\n",
"\n",
"Reference: http://pytorch.org/\n",
"\n",
"Select your Python version, CUDA version, and package manager.<br>\n",
"Then simply copy the generated command to install.\n",
"\n",
"Example:\n",
"\n",
"```Bash\n",
"conda create -n pytorch python=3.6\n",
"conda install pytorch torchvision cuda90 -c pytorch\n",
"```\n",
"\n",
"**On Windows**\n",
"\n",
"There are no official binaries for Windows; normally you would have to download the source code and build it yourself.<br>\n",
"Alternatively, you can download prebuilt packages from [peterjc123](https://github.com/peterjc123/).\n",
"\n",
"Reference: https://github.com/peterjc123/pytorch-scripts#easy-installation\n",
"\n",
"Example:\n",
"\n",
"```PowerShell\n",
"conda create -n pytorch python=3.6\n",
"conda install -c peterjc123 pytorch cuda90\n",
"```\n",
"\n",
"Note that he does not provide torchvision on conda; install it with pip or find a wheel file yourself.<br>\n",
"\n",
"Reference: https://pypi.python.org/pypi/torchvision\n",
"\n",
"Example:\n",
"\n",
"```PowerShell\n",
"pip install torchvision\n",
"```\n",
"\n",
"### From Source\n",
"\n",
"See: https://github.com/pytorch/pytorch#from-source\n"
] | |
}, | |
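After installation, a quick sanity check is worth running (a minimal sketch; `torch.zeros` and `torch.cuda.is_available` are stable across versions, but the exact version string depends on your install):

```python
import torch

# confirm the package imports and report the build
print(torch.__version__)

# True only if a CUDA build of PyTorch sees a usable GPU
print(torch.cuda.is_available())

# allocating a tensor exercises the core library
x = torch.zeros(2, 3)
print(x.size())
```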
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"## Tutorial\n", | |
"\n", | |
"Official site: http://pytorch.org/tutorials/index.html"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Tensor\n", | |
"\n", | |
"A tensor generalizes scalars, vectors, and matrices to arbitrary order:\n",
"\n",
"- 0th-order tensor: scalar <br>\n",
"- 1st-order tensor: Euclidean vector <br>\n",
"e.g. a 2D velocity <x, y>, shape = (2,)\n",
"- 2nd-order tensor: matrix <br>\n",
"e.g. a grayscale image (224x224), shape = (224, 224)\n",
"- 3rd-order tensor <br>\n",
"e.g. a color image (3x224x224), shape = (3, 224, 224)\n",
"- 4th-order tensor <br>\n",
"e.g. a batch of 32 color images (32x3x224x224), shape = (32, 3, 224, 224)\n",
"\n",
"**Note**\n",
"\n",
"When using `torch.nn.Conv2d`, <br>\n",
"the input image tensor must be laid out as `(nSamples, nChannels, Height, Width)`.\n",
"\n",
"For the formal definition of a tensor, see: https://en.wikipedia.org/wiki/Tensor"
] | |
}, | |
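The orders above can be checked directly in code. This sketch (shapes chosen to mirror the examples in the list) builds one tensor per order and prints its shape:

```python
import torch

velocity = torch.zeros(2)               # 1st-order: 2D velocity, shape (2,)
gray = torch.zeros(224, 224)            # 2nd-order: grayscale image
color = torch.zeros(3, 224, 224)        # 3rd-order: color image
batch = torch.zeros(32, 3, 224, 224)    # 4th-order: batch of 32 color images

for name, t in [('velocity', velocity), ('gray', gray),
                ('color', color), ('batch', batch)]:
    print(name, tuple(t.size()))
```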
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### torch.Tensor\n", | |
"\n", | |
"Reference: http://pytorch.org/docs/master/tensors.html\n",
"\n",
"A data structure similar to `numpy.ndarray`; it supports nearly all the same operations."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
"1.00000e-26 *\n", | |
" 3.3605 0.0000 2.1737\n", | |
" 0.0000 0.0000 0.0000\n", | |
" 0.0000 0.0000 3.3609\n", | |
" 0.0000 2.1737 0.0000\n", | |
" 0.0000 0.0000 0.0000\n", | |
"[torch.FloatTensor of size 5x3]\n", | |
"\n", | |
"y: [[3.36046857e-26 9.20653091e-43 2.17367911e-26]\n", | |
" [9.20653091e-43 0.00000000e+00 0.00000000e+00]\n", | |
" [2.10194770e-44 0.00000000e+00 3.36086300e-26]\n", | |
" [9.20653091e-43 2.17367911e-26 9.20653091e-43]\n", | |
" [0.00000000e+00 0.00000000e+00 2.10194770e-44]]\n" | |
] | |
} | |
], | |
"source": [ | |
"import torch\n", | |
"import numpy as np\n", | |
"\n", | |
"x = torch.Tensor(5, 3)\n", | |
"print(f'x: {x}')\n", | |
"y = np.ndarray((5, 3))\n", | |
"print(f'y: {y}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"Tensors support Python's native arithmetic operators."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"y: \n", | |
" 10\n", | |
" 20\n", | |
" 30\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"z = x + y: \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"z = x.add(y): \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"print(f'x: {x}')\n", | |
"\n", | |
"y = torch.Tensor([10, 20, 30])\n", | |
"print(f'y: {y}')\n", | |
"\n", | |
"z = x + y\n", | |
"print(f'z = x + y: {z}')\n", | |
"\n", | |
"z = x.add(y)\n", | |
"print(f'z = x.add(y): {z}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"Any operation whose name ends with an underscore `_` is an **in-place operation**."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"z = x.add(y): \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"x (x.add(y)): \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"x (x.add_(y)): \n", | |
" 11\n", | |
" 22\n", | |
" 33\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"y = torch.Tensor([10, 20, 30])\n", | |
"\n", | |
"z = x.add(y)\n", | |
"\n", | |
"print(f'z = x.add(y): {z}')\n", | |
"print(f'x (x.add(y)): {x}')\n", | |
"\n", | |
"x.add_(y) # in-place operation\n", | |
"\n", | |
"print(f'x (x.add_(y)): {x}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"\n", | |
"### torch.Tensor <=> numpy.ndarray\n", | |
"\n", | |
"`torch.Tensor` and `numpy.ndarray` objects can be converted back and forth."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"y: [1. 2. 3.]\n", | |
"\n", | |
"z: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"print(f'x: {x}')\n", | |
"\n", | |
"# torch.Tensor => NumPy.ndarray\n", | |
"y = x.numpy()\n", | |
"print(f'y: {y}\\n')\n", | |
"\n", | |
"# torch.Tensor <= NumPy.ndarray\n", | |
"z = torch.from_numpy(y)\n", | |
"print(f'z: {z}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"### torch.Tensor <=> torch.cuda.Tensor\n", | |
"Calling `.cuda()` on a `Tensor` moves it onto the GPU for computation.<br>\n",
"If you have multiple GPUs, pass an index to specify which one to use.\n",
"\n",
"A `torch.cuda.Tensor` can be moved back to a `torch.Tensor` with `.cpu()`."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n", | |
"y: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.cuda.FloatTensor of size 3 (GPU 0)]\n", | |
"\n", | |
"z: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.cuda.FloatTensor of size 3 (GPU 1)]\n", | |
"\n", | |
"z: \n", | |
" 1\n", | |
" 2\n", | |
" 3\n", | |
"[torch.FloatTensor of size 3]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"x = torch.Tensor([1, 2, 3])\n", | |
"print(f'x: {x}')\n", | |
"\n", | |
"y = x.cuda()\n", | |
"print(f'y: {y}')\n", | |
"\n", | |
"z = y.cuda(1)\n", | |
"print(f'z: {z}')\n", | |
"\n", | |
"z = z.cpu()\n", | |
"print(f'z: {z}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### torch.autograd.Variable\n", | |
"\n", | |
"Reference: http://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html\n",
"\n",
"We now know how to store data in a `Tensor`.<br>\n",
"The next question: how do we differentiate with respect to a variable and implement backpropagation?\n",
"\n",
"PyTorch provides the `torch.autograd.Variable` package; wrap a `Tensor` object in a `Variable`,<br>\n",
"set `requires_grad=True`, and then call `.backward()` to compute gradients.\n",
"\n",
"After a `Tensor` is wrapped in a `Variable`,<br>\n",
"`.data` holds the `Tensor` you stored, and after you call `.backward()` the gradient is stored in `.grad`.<br>\n",
"\n", | |
"![](http://pytorch.org/tutorials/_images/Variable.png)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"#### Example.\n", | |
"\n", | |
"$$\n", | |
"x = 10\n", | |
"$$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"$$\n", | |
"y = (2x + 2)^2\n", | |
"$$" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x:\n", | |
"Variable containing:\n", | |
" 10\n", | |
"[torch.FloatTensor of size 1]\n", | |
"\n", | |
"y:\n", | |
"Variable containing:\n", | |
" 484\n", | |
"[torch.FloatTensor of size 1]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"from torch.autograd import Variable\n", | |
"\n", | |
"x = torch.Tensor([10])\n", | |
"x = Variable(x, requires_grad=True)\n", | |
"print(f'x:\\n{x}')\n", | |
"\n", | |
"y = (2*x + 2)**2\n", | |
"print(f'y:\\n{y}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"$$\n", | |
"\\frac {dy} {dx} = 8x+8\\bigr\\rvert_{x=10} = 88\n", | |
"$$" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 24, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x.grad:\n", | |
"Variable containing:\n", | |
" 88\n", | |
"[torch.FloatTensor of size 1]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"y.backward()\n", | |
"print(f'x.grad:\\n{x.grad}')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Neural Network\n", | |
"\n", | |
"Reference: http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html\n",
"\n",
"Training a neural network roughly follows these steps:\n",
"1. Define the network with `torch.nn`\n",
"1. Define an optimizer with `torch.optim`, choosing the strategy used to optimize your parameters\n",
"1. Load the data and compute the loss\n",
"1. Compute gradients and update the parameters via backpropagation"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The following uses the example from the official tutorial:\n",
"\n",
"#### 1. Define the network\n",
"\n",
"Typically a network inherits from `nn.Module`, and such a subclass must implement `forward()`.\n",
"- `__init__()` defines which layers your network has.\n",
"- `forward()` defines how input data flows through those layers to produce the final output."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Net(\n", | |
" (conv1): Conv2d (1, 6, kernel_size=(5, 5), stride=(1, 1))\n", | |
" (conv2): Conv2d (6, 16, kernel_size=(5, 5), stride=(1, 1))\n", | |
" (fc1): Linear(in_features=400, out_features=120)\n", | |
" (fc2): Linear(in_features=120, out_features=84)\n", | |
" (fc3): Linear(in_features=84, out_features=10)\n", | |
")\n" | |
] | |
} | |
], | |
"source": [ | |
"import torch\n", | |
"from torch.autograd import Variable\n", | |
"import torch.nn as nn\n", | |
"import torch.nn.functional as F\n", | |
"\n", | |
"\n", | |
"class Net(nn.Module):\n", | |
"\n", | |
"    def __init__(self):\n",
"        super(Net, self).__init__()\n",
"        # 1 input image channel, 6 output channels, 5x5 square convolution\n",
"        # kernel\n",
"        self.conv1 = nn.Conv2d(1, 6, 5)\n",
"        self.conv2 = nn.Conv2d(6, 16, 5)\n",
"        # an affine operation: y = Wx + b\n",
"        self.fc1 = nn.Linear(16 * 5 * 5, 120)\n",
"        self.fc2 = nn.Linear(120, 84)\n",
"        self.fc3 = nn.Linear(84, 10)\n",
"\n",
"    def forward(self, x):\n",
"        # Max pooling over a (2, 2) window\n",
"        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n",
"        # If the size is a square you can only specify a single number\n",
"        x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n",
"        x = x.view(-1, self.num_flat_features(x))\n",
"        x = F.relu(self.fc1(x))\n",
"        x = F.relu(self.fc2(x))\n",
"        x = self.fc3(x)\n",
"        return x\n",
"\n",
"    def num_flat_features(self, x):\n",
"        size = x.size()[1:]  # all dimensions except the batch dimension\n",
"        num_features = 1\n",
"        for s in size:\n",
"            num_features *= s\n",
"        return num_features\n",
"\n", | |
"\n", | |
"net = Net()\n", | |
"print(net)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"Once `forward()` is correctly defined, feeding an input through the network computes the result. <br>\n",
"A later call to `backward()` will compute the gradients of all parameters for you.\n",
"\n",
"An input `Tensor` must be wrapped in a `Variable` before being fed into the network, and the output is also a `Variable` object. <br>\n",
"Here is a forward-pass example:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 27, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Variable containing:\n", | |
" 0.0144 -0.0174 0.0186 -0.1524 0.1014 0.0666 -0.0254 0.0241 -0.0485 -0.0240\n", | |
"[torch.FloatTensor of size 1x10]\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"input = Variable(torch.randn(1, 1, 32, 32))\n", | |
"out = net(input)\n", | |
"print(out)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"#### 2. Define the optimizer\n",
"\n",
"First pick the optimization algorithm you want from `torch.optim`. <br>\n",
"The argument given to `optim` must be an iterable whose elements are all `Variable` objects.<br>\n",
"\n",
"For the network above, you can simply pass `net.parameters()`, <br>\n",
"so that the optimizer knows the parameters to optimize are those of your network."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 28, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import torch.optim as optim\n", | |
"\n", | |
"optimizer = optim.SGD(net.parameters(), lr=0.01)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### 3. Load the data and compute the loss\n",
"\n",
"**Note** <br>\n",
"Images are usually loaded with `PIL.Image` and then converted to a `Tensor`. <br>\n",
"If you load only a single image, however, it will be a 3rd-order tensor, <br>\n",
"e.g. shape: (3, 224, 224). <br>\n",
"In that case, `Tensor.unsqueeze(0)` expands it to shape (1, 3, 224, 224). <br>"
] | |
}, | |
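The `unsqueeze(0)` trick from the note above looks like this (a random tensor stands in for an image already converted from `PIL.Image`):

```python
import torch

image = torch.randn(3, 224, 224)   # one image: a 3rd-order tensor
batch = image.unsqueeze(0)         # insert a batch dimension at position 0

print(image.size())   # torch.Size([3, 224, 224])
print(batch.size())   # torch.Size([1, 3, 224, 224])
```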
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Pick whichever loss function you like from `torch.nn`."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 29, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"criterion = nn.MSELoss()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### 4. Compute gradients\n",
"\n",
"First, call `optimizer.zero_grad()` to zero the `.grad` of every parameter in the network. <br>\n",
"Next, call `loss.backward()` to compute gradients and propagate them to all parameters. <br>\n",
"Finally, call `optimizer.step()` to apply the parameter update. <br>"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# in your training loop:\n", | |
"optimizer.zero_grad() # zero the gradient buffers\n", | |
"output = net(input)\n", | |
"loss = criterion(output, target)\n", | |
"\n", | |
"loss.backward()\n", | |
"optimizer.step() # Does the update" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The official example code carries the comment `# in your training loop:`. <br>\n",
"That is, this part usually sits inside a for-loop <br>\n",
"that keeps loading training data, computing gradients, and updating parameters; once the loss converges, you are done."
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Data Loading and Processing\n",
"\n",
"Let's take the Kaggle [Plant Seedlings Classification](https://www.kaggle.com/c/plant-seedlings-classification) competition as an example. <br>\n",
"It is a multi-class classification problem, and the dataset directory is laid out as follows:\n",
"\n",
"```\n",
"/plant-seedlings-classification/\n",
" |- train/<category>/*.png\n",
" |- test/*.png\n",
" |- sample_submission.csv\n",
"```\n",
"\n",
"Scanning the `train/` folder with `Path.glob('*')` therefore yields every category and its index. <br>\n",
"Then, inside each category folder, `Path.glob('*.png')` yields all of its images, and the folder name gives the label. <br>\n",
"\n",
"Using only the Python standard library, anyone could write a program that reads the dataset and builds training data to train a model. <br>\n",
"But a few things remain cumbersome:\n",
"1. How do you fetch data according to the batch size?\n",
"1. How do you compute with multiprocessing?\n",
"1. How do you process the data as it is loaded?\n",
"\n",
"Fortunately, PyTorch provides `torch.utils.data.Dataset` and `torch.utils.data.DataLoader`, <br>\n",
"and torchvision offers many high-level utilities that make data handling much easier.\n",
"\n",
"References:\n",
"- http://pytorch.org/tutorials/beginner/data_loading_tutorial.html\n",
"- http://pytorch.org/docs/0.3.0/data.html\n",
"- http://pytorch.org/docs/0.3.0/torchvision/index.html\n",
"\n",
"Here is a Dataset class for the Plant Seedlings Classification competition:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example/blob/master/dataset.py\n", | |
"from torch.utils.data import Dataset\n", | |
"from pathlib import Path\n", | |
"from PIL import Image\n", | |
"\n", | |
"\n", | |
"class PlantSeedlingDataset(Dataset):\n", | |
"    def __init__(self, root_dir, transform=None):\n",
"        self.root_dir = Path(root_dir)\n",
"        self.x = []\n",
"        self.y = []\n",
"        self.transform = transform\n",
"        self.num_classes = 0\n",
"\n",
"        if self.root_dir.name == 'train':\n",
"            for i, _dir in enumerate(self.root_dir.glob('*')):\n",
"                for file in _dir.glob('*'):\n",
"                    self.x.append(file)\n",
"                    self.y.append(i)\n",
"\n",
"                self.num_classes += 1\n",
"\n",
"    def __len__(self):\n",
"        return len(self.x)\n",
"\n",
"    def __getitem__(self, index):\n",
"        image = Image.open(self.x[index]).convert('RGB')\n",
"\n",
"        if self.transform:\n",
"            image = self.transform(image)\n",
"\n", | |
" return image, self.y[index]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We define a class that inherits from the abstract class `torch.utils.data.Dataset` and implement:\n",
"1. `__len__`: returns the size of the dataset.\n",
"1. `__getitem__`: returns one sample, as a pair of input data and label. <br>\n",
"The input data must be either a `torch.Tensor` or a `PIL.Image`.\n",
"Inside `__getitem__` you can also apply `transforms`, so the data is converted and preprocessed as it is loaded.\n",
"\n",
"With our `Dataset` class implemented, we can now pair it with a `DataLoader`."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example/blob/master/train.py\n", | |
"import torch\n", | |
"import torch.nn as nn\n", | |
"from models import VGG16\n", | |
"from dataset import PlantSeedlingDataset\n", | |
"from utils import parse_args\n", | |
"from torch.autograd import Variable\n", | |
"from torch.utils.data import DataLoader\n", | |
"from torchvision import transforms\n", | |
"from pathlib import Path\n", | |
"import copy\n", | |
"\n", | |
"# ... (omitted)\n",
"\n", | |
"def train():\n", | |
"    data_transform = transforms.Compose([\n",
"        transforms.RandomResizedCrop(224),\n",
"        transforms.ToTensor(),\n",
"        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])\n",
"    ])"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Above, the `torchvision.transforms` module defines which transformations the data goes through as it is loaded. <br>\n",
"Best of all, `transforms.Compose` chains these transformations together from a list, making it easy to apply a whole pipeline of operations to each image. <br>\n",
"\n",
"As the code shows, each loaded image is:\n",
"1. `RandomResizedCrop(224)`: randomly cropped to a 224x224 region.\n",
"1. `ToTensor()`: converted from a `PIL.Image` object to a `torch.Tensor` object.\n",
"1. `Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])`: normalized per channel (3 channels)."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"    train_set = PlantSeedlingDataset(Path(DATASET_ROOT).joinpath('train'), data_transform)\n",
"    data_loader = DataLoader(dataset=train_set, batch_size=32, shuffle=True, num_workers=1)"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"```\n", | |
"DataLoader(dataset=train_set, batch_size=32, shuffle=True, num_workers=1)\n", | |
"```\n", | |
"lets us decide the batch size, whether to shuffle the dataset, and how many worker processes to use. <br>\n",
"What remains is the complete training code."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"    model = VGG16(num_classes=train_set.num_classes)\n",
"    model = model.cuda(CUDA_DEVICES)\n",
"    model.train()\n",
"\n",
"    best_model_params = copy.deepcopy(model.state_dict())\n",
"    best_acc = 0.0\n",
"    num_epochs = 50\n",
"    criterion = nn.CrossEntropyLoss()\n",
"    optimizer = torch.optim.SGD(params=model.parameters(), lr=0.001, momentum=0.9)\n",
"\n",
"    for epoch in range(num_epochs):\n",
"        print(f'Epoch: {epoch + 1}/{num_epochs}')\n",
"        print('-' * len(f'Epoch: {epoch + 1}/{num_epochs}'))\n",
"\n",
"        training_loss = 0.0\n",
"        training_corrects = 0\n",
"\n",
"        for i, (inputs, labels) in enumerate(data_loader):\n",
"            inputs = Variable(inputs.cuda(CUDA_DEVICES))\n",
"            labels = Variable(labels.cuda(CUDA_DEVICES))\n",
"\n",
"            optimizer.zero_grad()\n",
"\n",
"            outputs = model(inputs)\n",
"            _, preds = torch.max(outputs.data, 1)\n",
"            loss = criterion(outputs, labels)\n",
"\n",
"            loss.backward()\n",
"            optimizer.step()\n",
"\n",
"            training_loss += loss.data[0] * inputs.size(0)\n",
"            training_corrects += torch.sum(preds == labels.data)\n",
"\n",
"        training_loss = training_loss / len(train_set)\n",
"        training_acc = training_corrects / len(train_set)\n",
"\n",
"        print(f'Training loss: {training_loss:.4f}\\taccuracy: {training_acc:.4f}\\n')\n",
"\n",
"        if training_acc > best_acc:\n",
"            best_acc = training_acc\n",
"            best_model_params = copy.deepcopy(model.state_dict())\n",
"\n",
"    model.load_state_dict(best_model_params)\n",
"    torch.save(model, f'model-{best_acc:.02f}-best_train_acc.pth')"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Save & Load Model (Parameters)\n", | |
"\n", | |
"Reference: http://pytorch.org/docs/master/notes/serialization.html#recommend-saving-models\n",
"\n",
"There are two approaches:\n",
"1. Save/load the parameters (officially recommended) <br>\n",
"    - Save\n",
"```\n",
"torch.save(the_model.state_dict(), PATH)\n",
"```\n",
"    - Load\n",
"```\n",
"the_model = TheModelClass(*args, **kwargs)\n",
"the_model.load_state_dict(torch.load(PATH))\n",
"```\n",
"2. Save/load the entire model (breaks once the model class is modified) <br>\n",
"    - Save\n",
"```\n",
"torch.save(the_model, PATH)\n",
"```\n",
"    - Load\n",
"```\n",
"the_model = torch.load(PATH)\n",
"```\n",
"\n",
"The conventional extension for a model file is `.pth`, though the developers have said the extension makes no difference whatsoever."
] | |
}, | |
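As a concrete sketch of the recommended approach (using a tiny `nn.Linear` stand-in rather than the VGG16 from the competition code), a `state_dict` save/load round-trips the parameters exactly:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                      # tiny stand-in model
torch.save(model.state_dict(), 'model.pth')  # save only the parameters

restored = nn.Linear(4, 2)                   # rebuild the same architecture first
restored.load_state_dict(torch.load('model.pth'))

# the reloaded weights match the originals exactly
print(torch.equal(model.weight.data, restored.weight.data))  # True
```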
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"With this brief introduction, you should now have a basic grasp of PyTorch. <br>\n",
"The next question is how to become proficient and use PyTorch fluently:\n",
"1. Build things yourself <br>\n",
"This is the most important. You will never learn to write it without implementing things by hand.\n",
"1. Read the official tutorials <br>\n",
"http://pytorch.org/tutorials/index.html\n",
"1. Read the official documentation <br>\n",
"http://pytorch.org/docs/master/index.html\n",
"1. Read the source code <br>\n",
"https://github.com/pytorch/pytorch\n",
"1. Browse the official forum <br>\n",
"https://discuss.pytorch.org/\n",
"1. The Ptt DataScience board\n",
"\n",
"\n",
"Finally:\n",
"\n",
"> The road to mastery is a lonely one."
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"------\n", | |
"This article is maintained on gist; if you have any questions, feel free to leave a comment below. <br>\n",
"https://gist.github.com/remorsecs/959b2e9ce39712366cea676426a34945 <br>\n",
"GitHub repo for the Kaggle competition: <br>\n",
"https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example"
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.4" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |