
@remorsecs
Last active March 12, 2024 11:41
The goal of this article is to turn readers who have never written any PyTorch into PyTorch beginners (?)
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# PyTorch 入門最速傳說"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Why is PyTorch?\n",
"\n",
"![](https://i.imgflip.com/27c19g.jpg)\n",
"\n",
"優點:\n",
"- 動態計算圖 Dynamic Computational Graphs [1]\n",
"- 設計優良的 API\n",
"- 豐富的文件\n",
"- 社群發達,更新速度快(每三個月釋出一次正式版)\n",
"\n",
"參考:\n",
"- [1] [Deep Learning with Dynamic Computation Graphs](https://arxiv.org/pdf/1702.02181.pdf)\n",
"\n",
"相關討論:\n",
"- [PyTorch or TensorFlow?](https://www.kdnuggets.com/2017/08/pytorch-tensorflow.html) (2017.08) \n",
"- [PyTorch, Dynamic Computational Graphs and Modular Deep Learning](https://medium.com/intuitionmachine/pytorch-dynamic-computational-graphs-and-modular-deep-learning-7e7f89f18d1) (2017.01)\n",
"- https://www.google.com/search?q=pytorch+vs+tensorflow\n",
"\n",
"\n",
"## Installation\n",
"\n",
"建議使用 Conda 套件管理工具。\n",
"\n",
"- Anaconda: https://anaconda.org/\n",
"- Miniconda: https://conda.io/miniconda.html\n",
"\n",
"### Dependencies\n",
"\n",
"官方建議使用 CUDA 7.5 以上 / cuDNN v6.x 以上。<br>\n",
"我建議使用 CUDA 9.0 / cuDNN v7.x。<br>\n",
"- CUDA: https://developer.nvidia.com/cuda-90-download-archive\n",
"- cuDNN: https://developer.nvidia.com/cudnn\n",
"\n",
"### Binaries\n",
"\n",
"**On Linux**\n",
"\n",
"參考:http://pytorch.org/\n",
"\n",
"選擇 Python, CUDA 版本以及使用套件管理工具。<br>\n",
"選好以後直接複製指令即可安裝。\n",
"\n",
"範例\n",
"\n",
"```Bash\n",
"conda create -n pytorch python=3.6\n",
"conda install pytorch torchvision cuda90 -c pytorch\n",
"```\n",
"\n",
"**On Windows**\n",
"\n",
"官方沒有提供給 Windows 用戶的 binary files,通常必須載 source code 自己編。<br>\n",
"可以從 [peterjc123](https://github.com/peterjc123/) 下載他編好的。\n",
"\n",
"參考:https://github.com/peterjc123/pytorch-scripts#easy-installation\n",
"\n",
"範例\n",
"\n",
"```PowerShell\n",
"conda create -n pytorch python=3.6\n",
"conda install -c peterjc123 pytorch cuda90\n",
"```\n",
"\n",
"注意他沒有在 conda 提供 torchvision,可直接用 pip 或自己另外找 wheel 檔安裝。<br>\n",
"\n",
"參考:https://pypi.python.org/pypi/torchvision\n",
"\n",
"範例\n",
"\n",
"```PowerShell\n",
"pip install torchvision\n",
"```\n",
"\n",
"### From Source\n",
"\n",
"自行參考:https://github.com/pytorch/pytorch#from-source\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"## Tutorial\n",
"\n",
"官網: http://pytorch.org/tutorials/index.html"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tensor\n",
"\n",
"Tensor: 張量\n",
"\n",
"- 0th-order tensor: scalar <br>\n",
"- 1st-order tensor: Euclidean vector (vector) <br>\n",
"ex. 2D-velocity <x, y>, shape = (2)\n",
"- 2nd-order tensor: matrix <br>\n",
"ex. gray-scale image (224x224), shape = (224, 224)\n",
"- 3rd-order tensor <br>\n",
"ex. color image (3x224x224), shape = (3, 224, 224)\n",
"- 4th-order tensor <br>\n",
"ex. 32-batch-size color image (32x3x224x224), shape = (32, 3, 224, 224)\n",
"\n",
"**Note**\n",
"\n",
"需要注意 PyTorch 使用 `torch.nn.Conv2d` 時, <br>\n",
"輸入圖片的 tensor 擺放必須為 `(nSamples, nChannels, Height, Width)`\n",
"\n",
"關於 Tensor 的定義,請參考:https://en.wikipedia.org/wiki/Tensor"
]
},
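{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"As a quick sanity check, here is a minimal sketch (not from the original tutorial) that builds tensors of the orders listed above with `torch.randn` and prints their shapes:\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"# 1st-order tensor: a 2D velocity vector <x, y>, shape (2,)\n",
"velocity = torch.randn(2)\n",
"\n",
"# 3rd-order tensor: one color image, shape (3, 224, 224)\n",
"image = torch.randn(3, 224, 224)\n",
"\n",
"# 4th-order tensor: a batch of 32 color images, laid out as\n",
"# (nSamples, nChannels, Height, Width), as torch.nn.Conv2d expects\n",
"batch = torch.randn(32, 3, 224, 224)\n",
"\n",
"print(velocity.size())  # torch.Size([2])\n",
"print(image.size())     # torch.Size([3, 224, 224])\n",
"print(batch.size())     # torch.Size([32, 3, 224, 224])\n",
"```"
]
},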
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### torch.Tensor\n",
"\n",
"參考:http://pytorch.org/docs/master/tensors.html\n",
"\n",
"一種類似 `numpy.ndarray` 的資料結構,可以進行幾乎等同於 `numpy.ndarray` 的運算。"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x: \n",
"1.00000e-26 *\n",
" 3.3605 0.0000 2.1737\n",
" 0.0000 0.0000 0.0000\n",
" 0.0000 0.0000 3.3609\n",
" 0.0000 2.1737 0.0000\n",
" 0.0000 0.0000 0.0000\n",
"[torch.FloatTensor of size 5x3]\n",
"\n",
"y: [[3.36046857e-26 9.20653091e-43 2.17367911e-26]\n",
" [9.20653091e-43 0.00000000e+00 0.00000000e+00]\n",
" [2.10194770e-44 0.00000000e+00 3.36086300e-26]\n",
" [9.20653091e-43 2.17367911e-26 9.20653091e-43]\n",
" [0.00000000e+00 0.00000000e+00 2.10194770e-44]]\n"
]
}
],
"source": [
"import torch\n",
"import numpy as np\n",
"\n",
"x = torch.Tensor(5, 3)\n",
"print(f'x: {x}')\n",
"y = np.ndarray((5, 3))\n",
"print(f'y: {y}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"可以使用 Python 原生的數學運算子。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x: \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.FloatTensor of size 3]\n",
"\n",
"y: \n",
" 10\n",
" 20\n",
" 30\n",
"[torch.FloatTensor of size 3]\n",
"\n",
"z = x + y: \n",
" 11\n",
" 22\n",
" 33\n",
"[torch.FloatTensor of size 3]\n",
"\n",
"z = x.add(y): \n",
" 11\n",
" 22\n",
" 33\n",
"[torch.FloatTensor of size 3]\n",
"\n"
]
}
],
"source": [
"x = torch.Tensor([1, 2, 3])\n",
"print(f'x: {x}')\n",
"\n",
"y = torch.Tensor([10, 20, 30])\n",
"print(f'y: {y}')\n",
"\n",
"z = x + y\n",
"print(f'z = x + y: {z}')\n",
"\n",
"z = x.add(y)\n",
"print(f'z = x.add(y): {z}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"所有運算子後面加個底線 _ 的都是 **in-place operation**。"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"z = x.add(y): \n",
" 11\n",
" 22\n",
" 33\n",
"[torch.FloatTensor of size 3]\n",
"\n",
"x (x.add(y)): \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.FloatTensor of size 3]\n",
"\n",
"x (x.add_(y)): \n",
" 11\n",
" 22\n",
" 33\n",
"[torch.FloatTensor of size 3]\n",
"\n"
]
}
],
"source": [
"x = torch.Tensor([1, 2, 3])\n",
"y = torch.Tensor([10, 20, 30])\n",
"\n",
"z = x.add(y)\n",
"\n",
"print(f'z = x.add(y): {z}')\n",
"print(f'x (x.add(y)): {x}')\n",
"\n",
"x.add_(y) # in-place operation\n",
"\n",
"print(f'x (x.add_(y)): {x}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"\n",
"### torch.Tensor <=> numpy.ndarray\n",
"\n",
"`torch.Tensor` 和 `numpy.ndarray` 兩個物件可以互相轉換。"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x: \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.FloatTensor of size 3]\n",
"\n",
"y: [1. 2. 3.]\n",
"\n",
"z: \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.FloatTensor of size 3]\n",
"\n"
]
}
],
"source": [
"x = torch.Tensor([1, 2, 3])\n",
"print(f'x: {x}')\n",
"\n",
"# torch.Tensor => NumPy.ndarray\n",
"y = x.numpy()\n",
"print(f'y: {y}\\n')\n",
"\n",
"# torch.Tensor <= NumPy.ndarray\n",
"z = torch.from_numpy(y)\n",
"print(f'z: {z}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"### torch.Tensor <=> torch.cuda.Tensor\n",
"對 `Tensor` 物件使用 `.cuda()` 即可置入 GPU 進行運算。<br>\n",
"如果有多個 GPU,可以加入參數來指定要使用哪一個。\n",
"\n",
"`torch.cuda.Tensor` 可以使用 `.cpu()` 變回 `torch.Tensor` 物件。"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x: \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.FloatTensor of size 3]\n",
"\n",
"y: \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.cuda.FloatTensor of size 3 (GPU 0)]\n",
"\n",
"z: \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.cuda.FloatTensor of size 3 (GPU 1)]\n",
"\n",
"z: \n",
" 1\n",
" 2\n",
" 3\n",
"[torch.FloatTensor of size 3]\n",
"\n"
]
}
],
"source": [
"x = torch.Tensor([1, 2, 3])\n",
"print(f'x: {x}')\n",
"\n",
"y = x.cuda()\n",
"print(f'y: {y}')\n",
"\n",
"z = y.cuda(1)\n",
"print(f'z: {z}')\n",
"\n",
"z = z.cpu()\n",
"print(f'z: {z}')"
]
},
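{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"The cell above assumes that at least two GPUs are present. A minimal defensive sketch (not from the original tutorial) is to check `torch.cuda.is_available()` before moving anything onto the GPU:\n",
"\n",
"```python\n",
"x = torch.Tensor([1, 2, 3])\n",
"\n",
"# Only move the tensor to the GPU when CUDA is actually available;\n",
"# otherwise keep computing on the CPU.\n",
"if torch.cuda.is_available():\n",
"    x = x.cuda()\n",
"\n",
"print(x)\n",
"```"
]
},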
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### torch.autograd.Variable\n",
"\n",
"參考:http://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html\n",
"\n",
"現在我們知道如何使用 `Tensor` 儲存資料了。<br>\n",
"接下來的問題是:如何對變數求導、進一步實現 backpropagation?\n",
"\n",
"PyTorch 提供了 `torch.autograd.Variable` package,可以將 `Tensor` 物件存入 `Variable` 內,<br>\n",
"並且設定參數 `requires_grad=True` 後再使用 `.backward()` 即可計算梯度。\n",
"\n",
"將 `Tensor` 存入 `Variable` 以後,<br>\n",
"在 `.data` 內可以得到你存入的 `Tensor` 物件,在你使用 `.backward()` 以後會將梯度存入 `.grad`。<br>\n",
"\n",
"![](http://pytorch.org/tutorials/_images/Variable.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"#### Example.\n",
"\n",
"$$\n",
"x = 10\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"y = (2x + 2)^2\n",
"$$"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x:\n",
"Variable containing:\n",
" 10\n",
"[torch.FloatTensor of size 1]\n",
"\n",
"y:\n",
"Variable containing:\n",
" 484\n",
"[torch.FloatTensor of size 1]\n",
"\n"
]
}
],
"source": [
"from torch.autograd import Variable\n",
"\n",
"x = torch.Tensor([10])\n",
"x = Variable(x, requires_grad=True)\n",
"print(f'x:\\n{x}')\n",
"\n",
"y = (2*x + 2)**2\n",
"print(f'y:\\n{y}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"$$\n",
"\\frac {dy} {dx} = 8x+8\\bigr\\rvert_{x=10} = 88\n",
"$$"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x.grad:\n",
"Variable containing:\n",
" 88\n",
"[torch.FloatTensor of size 1]\n",
"\n"
]
}
],
"source": [
"y.backward()\n",
"print(f'x.grad:\\n{x.grad}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Neural Network\n",
"\n",
"參考:http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html\n",
"\n",
"訓練神經網路的過程大致如下:\n",
"1. 使用 `torch.nn` 定義神經網路\n",
"1. 使用 `torch.optim` 定義 optimizer,決定用何種策略來最佳化你的參數\n",
"1. 將資料讀入、並計算 loss\n",
"1. 計算梯度,並且用 backpropagation 修正參數"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"以下使用官網提供的例子:\n",
"\n",
"#### 1. 定義神經網路\n",
"\n",
"一般來說,會讓神經網路繼承 `nn.Module`,繼承此類別必須 implement `forward()`。\n",
"- `__init__()` 定義你的神經網路有哪些 layers。\n",
"- `forward()` 決定你的神經網路在輸入 input 以後,資料會如何經過你的 layers 進行運算,最後輸出結果。"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Net(\n",
" (conv1): Conv2d (1, 6, kernel_size=(5, 5), stride=(1, 1))\n",
" (conv2): Conv2d (6, 16, kernel_size=(5, 5), stride=(1, 1))\n",
" (fc1): Linear(in_features=400, out_features=120)\n",
" (fc2): Linear(in_features=120, out_features=84)\n",
" (fc3): Linear(in_features=84, out_features=10)\n",
")\n"
]
}
],
"source": [
"import torch\n",
"from torch.autograd import Variable\n",
"import torch.nn as nn\n",
"import torch.nn.functional as F\n",
"\n",
"\n",
"class Net(nn.Module):\n",
"\n",
" def __init__(self):\n",
" super(Net, self).__init__()\n",
" # 1 input image channel, 6 output channels, 5x5 square convolution\n",
" # kernel\n",
" self.conv1 = nn.Conv2d(1, 6, 5)\n",
" self.conv2 = nn.Conv2d(6, 16, 5)\n",
" # an affine operation: y = Wx + b\n",
" self.fc1 = nn.Linear(16 * 5 * 5, 120)\n",
" self.fc2 = nn.Linear(120, 84)\n",
" self.fc3 = nn.Linear(84, 10)\n",
"\n",
" def forward(self, x):\n",
" # Max pooling over a (2, 2) window\n",
" x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n",
" # If the size is a square you can only specify a single number\n",
" x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n",
" x = x.view(-1, self.num_flat_features(x))\n",
" x = F.relu(self.fc1(x))\n",
" x = F.relu(self.fc2(x))\n",
" x = self.fc3(x)\n",
" return x\n",
"\n",
" def num_flat_features(self, x):\n",
" size = x.size()[1:] # all dimensions except the batch dimension\n",
" num_features = 1\n",
" for s in size:\n",
" num_features *= s\n",
" return num_features\n",
"\n",
"\n",
"net = Net()\n",
"print(net)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"在你正確定義好 `forward()` 以後,input 讀進你的神經網路即可計算出結果。 <br>\n",
"後面使用 `backward()` 也會幫你計算所有參數的梯度。\n",
"\n",
"Input `Tensor` 需要先轉型成 `Variable` 物件後才可以讀入神經網路,其 output 也是 `Variable` 物件。 <br>\n",
"以下是一個 forward 範例:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Variable containing:\n",
" 0.0144 -0.0174 0.0186 -0.1524 0.1014 0.0666 -0.0254 0.0241 -0.0485 -0.0240\n",
"[torch.FloatTensor of size 1x10]\n",
"\n"
]
}
],
"source": [
"input = Variable(torch.randn(1, 1, 32, 32))\n",
"out = net(input)\n",
"print(out)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"#### 2. 定義 optimizer\n",
"\n",
"首先從 `torch.optim` 尋找你想要用的最佳化演算法, <br>\n",
"給 `optim` 的參數必須要提供 `iterable` 的物件,且所有物件皆為 `Variable`。<br>\n",
"\n",
"以上面的神經網路為例,可以直接使用 `net.parameters()` 來當作 `optim` 的參數, <br>\n",
"讓你的 optimizer 知道該最佳化的對象是你的神經網路的參數。"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"import torch.optim as optim\n",
"\n",
"optimizer = optim.SGD(net.parameters(), lr=0.01)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3. 將資料讀入並計算 loss\n",
"\n",
"**Note** <br>\n",
"讀入資料的方式,通常會用 `PIL.Image` 讀入圖片以後轉型成 `Tensor`。 <br>\n",
"但如果你要讀入的圖片只有一張時,你的圖片會是 3rd-order tensor, <br>\n",
"ex. shape: (3, 224, 224) <br>\n",
"這時候可以用 `Tensor.unsqueeze(0)` 方法強行擴展為 shape: (1, 3, 224, 224) <br>"
]
},
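{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"A minimal sketch of that loading pattern (the file name `sample.png` is a placeholder, not part of the original example):\n",
"\n",
"```python\n",
"from PIL import Image\n",
"from torchvision import transforms\n",
"\n",
"# Hypothetical image path, for illustration only.\n",
"image = Image.open('sample.png').convert('RGB')\n",
"\n",
"x = transforms.ToTensor()(image)  # 3rd-order tensor, e.g. shape (3, H, W)\n",
"x = x.unsqueeze(0)                # add a batch dimension -> shape (1, 3, H, W)\n",
"print(x.size())\n",
"```"
]
},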
{
"cell_type": "markdown",
"metadata": {},
"source": [
"loss function 可以從 `torch.nn` 內挑一個自己喜歡的來用。"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"criterion = nn.MSELoss()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4. 計算梯度\n",
"\n",
"首先,用 `optimizer.zero_grad()` 把你的神經網路內的參數 `.grad` 都歸零。 <br>\n",
"接下來,用 `loss.backward()` 計算梯度並且傳遞給神經網路的所有參數。 <br>\n",
"最後,用 `optimizer.step()` 完成參數更新。 <br>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# in your training loop:\n",
"optimizer.zero_grad() # zero the gradient buffers\n",
"output = net(input)\n",
"loss = criterion(output, target)\n",
"\n",
"loss.backward()\n",
"optimizer.step() # Does the update"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"在官網的範例程式碼,註解上面寫了 `# in your training loop:` <br>\n",
"亦即,這一部分通常會放在一個 for-loop 內, <br>\n",
"不斷讀入 training data,然後求梯度、更新參數、最後等到 loss 收斂以後就大功告成啦。"
]
},
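{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"Putting steps 1-4 together, here is a minimal sketch of such a loop. It reuses `net`, `criterion`, and `optimizer` from the cells above and feeds random dummy data, so it only illustrates the control flow; real code would iterate over a dataset, as in the next section.\n",
"\n",
"```python\n",
"for epoch in range(10):\n",
"    # Dummy data: one random 1x32x32 'image' and a random 10-dim target.\n",
"    input = Variable(torch.randn(1, 1, 32, 32))\n",
"    target = Variable(torch.randn(1, 10))\n",
"\n",
"    optimizer.zero_grad()             # reset the gradient buffers\n",
"    output = net(input)               # forward pass\n",
"    loss = criterion(output, target)  # compute the loss\n",
"    loss.backward()                   # backpropagation\n",
"    optimizer.step()                  # parameter update\n",
"```"
]
},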
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Loading and Processing\n",
"\n",
"以 Kaggle [Plant Seedlings Classification](https://www.kaggle.com/c/plant-seedlings-classification) 競賽為例。 <br>\n",
"這個競賽是 multi-class classification 問題,dataset 目錄結構如下:\n",
"\n",
"```\n",
"/plant-seedlings-classification/\n",
" |- train/<category>/*.png\n",
" |- test/*.png\n",
" |- sample_submission.csv\n",
"```\n",
"\n",
"因此,在 `train/` 資料夾用 `Path.glob('*')` 掃過一遍可以得到所有類別和 index。 <br>\n",
"之後再從各個類別內用 `Path.glob('*.png')` 取得所有圖片,根據資料夾名稱可以得到 label。 <br>\n",
"\n",
"若只用基本 Python 標準庫和語法,相信大家都知道如何撰寫一支程式,讀取 dataset 以後建立 training data 來訓練模型。 <br>\n",
"但還是有一些麻煩之處:\n",
"1. 如何根據 batch size 取出資料?\n",
"1. 如何用 multiprocessing 進行運算?\n",
"1. 如何在資料讀入以後順便處理?\n",
"\n",
"所幸 PyTorch 提供 `torch.utils.data.Dataset` 和 `torch.utils.data.DataLoader`, <br>\n",
"以及 torchvision 提供很多高階函式庫,讓處理資料變得更為容易。\n",
"\n",
"參考:\n",
"- http://pytorch.org/tutorials/beginner/data_loading_tutorial.html\n",
"- http://pytorch.org/docs/0.3.0/data.html\n",
"- http://pytorch.org/docs/0.3.0/torchvision/index.html\n",
"\n",
"以下是針對 Plant Seedlings Classification 競賽的 Dataset class:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example/blob/master/dataset.py\n",
"from torch.utils.data import Dataset\n",
"from pathlib import Path\n",
"from PIL import Image\n",
"\n",
"\n",
"class PlantSeedlingDataset(Dataset):\n",
" def __init__(self, root_dir, transforms=None):\n",
" self.root_dir = Path(root_dir)\n",
" self.x = []\n",
" self.y = []\n",
" self.transform = transform\n",
" self.num_classes = 0\n",
"\n",
" if self.root_dir.name == 'train':\n",
" for i, _dir in enumerate(self.root_dir.glob('*')):\n",
" for file in _dir.glob('*'):\n",
" self.x.append(file)\n",
" self.y.append(i)\n",
"\n",
" self.num_classes += 1\n",
"\n",
" def __len__(self):\n",
" return len(self.x)\n",
"\n",
" def __getitem__(self, index):\n",
" image = Image.open(self.x[index]).convert('RGB')\n",
"\n",
" if self.transform:\n",
" image = self.transforms(image)\n",
"\n",
" return image, self.y[index]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"我們建立一個類別,繼承自 `torch.utils.data.Dataset` 這個抽象類別,其中要實作:\n",
"1. `__len__`: 取得 dataset 大小。\n",
"1. `__getitem__`: 取得 data。輸出兩欄分別是 input data 和 label。 <br>\n",
"其中,input data 可接受的類別只有 `torch.Tensor` 或 `PIL.Image`。\n",
"在實作 `__getitem__` 中也可以配合 `transfroms` 來讓資料讀入的時候經過各種轉換和處理後再存起來。\n",
"\n",
"我們實作完成 `Dataset` 類別物件以後,就可以配合使用 `DataLoader`。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example/blob/master/train.py\n",
"import torch\n",
"import torch.nn as nn\n",
"from models import VGG16\n",
"from dataset import PlantSeedlingDataset\n",
"from utils import parse_args\n",
"from torch.autograd import Variable\n",
"from torch.utils.data import DataLoader\n",
"from torchvision import transforms\n",
"from pathlib import Path\n",
"import copy\n",
"\n",
"# ... 中略\n",
"\n",
"def train():\n",
" data_transform = transforms.Compose([\n",
" transforms.RandomResizedCrop(224),\n",
" transforms.ToTensor(),\n",
" transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])\n",
" ])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"在上面使用 `torchvision.transforms` 這個物件來定義資料讀入的過程會經過哪些轉換。 <br>\n",
"最猛的是 `transforms.Compose` 可以把這些轉換用 list 接起來以後,我們可以很方便的對圖片進行一連串的轉換操作。 <br>\n",
"\n",
"從上面可以看到,資料讀入之後會:\n",
"1. `RandomResizedCrop(224)`: 隨機在畫面上 crop 出 224x224 的圖片。\n",
"1. `ToTensor()`: 將 PIL.Image 物件轉換成 torch.Tensor 物件。\n",
"1. `Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])`: 對圖片做 normalize (3 channels)"
]
},
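{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"To see what this pipeline produces, here is a small self-contained sketch (not from the competition code) that applies the same transforms to a synthetic PIL image built from random NumPy data:\n",
"\n",
"```python\n",
"import numpy as np\n",
"from PIL import Image\n",
"from torchvision import transforms\n",
"\n",
"# Same pipeline as above, repeated here so this sketch stands alone.\n",
"data_transform = transforms.Compose([\n",
"    transforms.RandomResizedCrop(224),\n",
"    transforms.ToTensor(),\n",
"    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),\n",
"])\n",
"\n",
"# Synthetic 256x256 RGB image with random pixel values.\n",
"array = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)\n",
"pil_image = Image.fromarray(array)\n",
"\n",
"x = data_transform(pil_image)\n",
"print(x.size())  # torch.Size([3, 224, 224])\n",
"```"
]
},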
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
" train_set = PlantSeedlingDataset(Path(DATASET_ROOT).joinpath('train'), data_transform)\n",
" data_loader = DataLoader(dataset=train_set, batch_size=32, shuffle=True, num_workers=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"DataLoader(dataset=train_set, batch_size=32, shuffle=True, num_workers=1)\n",
"```\n",
"方便我們決定使用多大的 batch size、是否對 dataset 做 shuffle、要用多少 process 進行運算。 <br>\n",
"剩下的就是訓練過程完整的 code。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
" model = VGG16(num_classes=train_set.num_classes)\n",
" model = model.cuda(CUDA_DEVICES)\n",
" model.train()\n",
"\n",
" best_model_params = copy.deepcopy(model.state_dict())\n",
" best_acc = 0.0\n",
" num_epochs = 50\n",
" criterion = nn.CrossEntropyLoss()\n",
" optimizer = torch.optim.SGD(params=model.parameters(), lr=0.001, momentum=0.9)\n",
"\n",
" for epoch in range(num_epochs):\n",
" print(f'Epoch: {epoch + 1}/{num_epochs}')\n",
" print('-' * len(f'Epoch: {epoch + 1}/{num_epochs}'))\n",
"\n",
" training_loss = 0.0\n",
" training_corrects = 0\n",
"\n",
" for i, (inputs, labels) in enumerate(data_loader):\n",
" inputs = Variable(inputs.cuda(CUDA_DEVICES))\n",
" labels = Variable(labels.cuda(CUDA_DEVICES))\n",
"\n",
" optimizer.zero_grad()\n",
"\n",
" outputs = model(inputs)\n",
" _, preds = torch.max(outputs.data, 1)\n",
" loss = criterion(outputs, labels)\n",
"\n",
" loss.backward()\n",
" optimizer.step()\n",
"\n",
" training_loss += loss.data[0] * inputs.size(0)\n",
" training_corrects += torch.sum(preds == labels.data)\n",
"\n",
" training_loss = training_loss / len(train_set)\n",
" training_acc = training_corrects / len(train_set)\n",
"\n",
" print(f'Training loss: {training_loss:.4f}\\taccuracy: {training_acc:.4f}\\n')\n",
"\n",
" if training_acc > best_acc:\n",
" best_acc = training_acc\n",
" best_model_params = copy.deepcopy(model.state_dict())\n",
"\n",
" model.load_state_dict(best_model_params)\n",
" torch.save(model, f'model-{best_acc:.02f}-best_train_acc.pth')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Save & Load Model (Parameters)\n",
"\n",
"參考:http://pytorch.org/docs/master/notes/serialization.html#recommend-saving-models\n",
"\n",
"兩種方式:\n",
"1. SL 參數 (官方建議使用) <br>\n",
" - Save\n",
"```\n",
"torch.save(the_model.state_dict(), PATH)\n",
"```\n",
" - Load\n",
"```\n",
"the_model = TheModelClass(*args, **kwargs)\n",
"the_model.load_state_dict(torch.load(PATH))\n",
"```\n",
"2. SL 整個 model (經改動過就無法使用) <br>\n",
" - Save\n",
"```\n",
"torch.save(the_model, PATH)\n",
"```\n",
" - Load\n",
"```\n",
"the_model = torch.load(PATH)\n",
"```\n",
"\n",
"model 慣用副檔名是 .pth,不過開發人員表示不管用什麼副檔名都不會有任何影響。"
]
},
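{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"A minimal sketch of the recommended approach, reusing the `Net` class defined earlier in this notebook (the file name `net.pth` is just an example):\n",
"\n",
"```python\n",
"# Save only the parameters (recommended).\n",
"torch.save(net.state_dict(), 'net.pth')\n",
"\n",
"# Rebuild the model, then load the saved parameters into it.\n",
"restored_net = Net()\n",
"restored_net.load_state_dict(torch.load('net.pth'))\n",
"```"
]
},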
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"經過以上簡單的介紹,大家應該對 PyTorch 已經有基本了解了。 <br>\n",
"接下來的問題是,如何成為強者、把 PyTorch 使用地更為熟練:\n",
"1. 動手實作 <br>\n",
"這是最重要的。沒有親手實作過是不可能會寫的。\n",
"1. 讀官方教學 <br>\n",
"http://pytorch.org/tutorials/index.html\n",
"1. 讀官方文件 <br>\n",
"http://pytorch.org/docs/master/index.html\n",
"1. 讀 source code <br>\n",
"https://github.com/pytorch/pytorch\n",
"1. 逛官方論壇 <br>\n",
"https://discuss.pytorch.org/\n",
"1. Ptt DataScience 板\n",
"\n",
"\n",
"最後\n",
"\n",
"> 強者之路是寂寞的。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------\n",
"本文會放在 gist 上面更新,有任何疑問歡迎在底下發 comment。<br>\n",
"https://gist.github.com/remorsecs/959b2e9ce39712366cea676426a34945 <br>\n",
"Kaggle 比賽的 GitHub repo: <br>\n",
"https://github.com/remorsecs/Kaggle-Plant-Seedlings-Classification-Example"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
@AUDOSt0ck1ng

Packed with solid, practical content.

@ricky-696

From getting started to giving up.
