{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Move your code from local Jupyter to Amazon SageMaker Studio\n",
"\n",
"This notebook is directly taken from [Official PyTorch QuickStart Tutorial Guide](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html) and further adapted to add Amazon SageMaker related code in last section of the notebook."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## NOTE!! \n",
"\n",
"The section below is taken as-is from [Official PyTorch QuickStart Tutorial Guide](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"[Learn the Basics](intro.html) ||\n",
"**Quickstart** ||\n",
"[Tensors](tensorqs_tutorial.html) ||\n",
"[Datasets & DataLoaders](data_tutorial.html) ||\n",
"[Transforms](transforms_tutorial.html) ||\n",
"[Build Model](buildmodel_tutorial.html) ||\n",
"[Autograd](autogradqs_tutorial.html) ||\n",
"[Optimization](optimization_tutorial.html) ||\n",
"[Save & Load Model](saveloadrun_tutorial.html)\n",
"\n",
"# Quickstart\n",
"This section runs through the API for common tasks in machine learning. Refer to the links in each section to dive deeper.\n",
"\n",
"## Working with data\n",
"PyTorch has two [primitives to work with data](https://pytorch.org/docs/stable/data.html):\n",
"``torch.utils.data.DataLoader`` and ``torch.utils.data.Dataset``.\n",
"``Dataset`` stores the samples and their corresponding labels, and ``DataLoader`` wraps an iterable around\n",
"the ``Dataset``.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"import torch\n",
"from torch import nn\n",
"from torch.utils.data import DataLoader\n",
"from torchvision import datasets\n",
"from torchvision.transforms import ToTensor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"PyTorch offers domain-specific libraries such as [TorchText](https://pytorch.org/text/stable/index.html),\n",
"[TorchVision](https://pytorch.org/vision/stable/index.html), and [TorchAudio](https://pytorch.org/audio/stable/index.html),\n",
"all of which include datasets. For this tutorial, we will be using a TorchVision dataset.\n",
"\n",
"The ``torchvision.datasets`` module contains ``Dataset`` objects for many real-world vision data like\n",
"CIFAR, COCO ([full list here](https://pytorch.org/vision/stable/datasets.html)). In this tutorial, we\n",
"use the FashionMNIST dataset. Every TorchVision ``Dataset`` includes two arguments: ``transform`` and\n",
"``target_transform`` to modify the samples and labels respectively.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"# Download training data from open datasets.\n",
"training_data = datasets.FashionMNIST(\n",
" root=\"data\",\n",
" train=True,\n",
" download=True,\n",
" transform=ToTensor(),\n",
")\n",
"\n",
"# Download test data from open datasets.\n",
"test_data = datasets.FashionMNIST(\n",
" root=\"data\",\n",
" train=False,\n",
" download=True,\n",
" transform=ToTensor(),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We pass the ``Dataset`` as an argument to ``DataLoader``. This wraps an iterable over our dataset, and supports\n",
"automatic batching, sampling, shuffling and multiprocess data loading. Here we define a batch size of 64, i.e. each element\n",
"in the dataloader iterable will return a batch of 64 features and labels.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"batch_size = 64\n",
"\n",
"# Create data loaders.\n",
"train_dataloader = DataLoader(training_data, batch_size=batch_size)\n",
"test_dataloader = DataLoader(test_data, batch_size=batch_size)\n",
"\n",
"for X, y in test_dataloader:\n",
" print(f\"Shape of X [N, C, H, W]: {X.shape}\")\n",
" print(f\"Shape of y: {y.shape} {y.dtype}\")\n",
" break"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Read more about [loading data in PyTorch](data_tutorial.html).\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"--------------\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating Models\n",
"To define a neural network in PyTorch, we create a class that inherits\n",
"from [nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html). We define the layers of the network\n",
"in the ``__init__`` function and specify how data will pass through the network in the ``forward`` function. To accelerate\n",
"operations in the neural network, we move it to the GPU if available.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"# Get cpu or gpu device for training.\n",
"device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
"print(f\"Using {device} device\")\n",
"\n",
"# Define model\n",
"class NeuralNetwork(nn.Module):\n",
" def __init__(self):\n",
" super().__init__()\n",
" self.flatten = nn.Flatten()\n",
" self.linear_relu_stack = nn.Sequential(\n",
" nn.Linear(28*28, 512),\n",
" nn.ReLU(),\n",
" nn.Linear(512, 512),\n",
" nn.ReLU(),\n",
" nn.Linear(512, 10)\n",
" )\n",
"\n",
" def forward(self, x):\n",
" x = self.flatten(x)\n",
" logits = self.linear_relu_stack(x)\n",
" return logits\n",
"\n",
"model = NeuralNetwork().to(device)\n",
"print(model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Read more about [building neural networks in PyTorch](buildmodel_tutorial.html).\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"--------------\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optimizing the Model Parameters\n",
"To train a model, we need a [loss function](https://pytorch.org/docs/stable/nn.html#loss-functions)\n",
"and an [optimizer](https://pytorch.org/docs/stable/optim.html).\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"loss_fn = nn.CrossEntropyLoss()\n",
"optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In a single training loop, the model makes predictions on the training dataset (fed to it in batches), and\n",
"backpropagates the prediction error to adjust the model's parameters.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"def train(dataloader, model, loss_fn, optimizer):\n",
" size = len(dataloader.dataset)\n",
" model.train()\n",
" for batch, (X, y) in enumerate(dataloader):\n",
" X, y = X.to(device), y.to(device)\n",
"\n",
" # Compute prediction error\n",
" pred = model(X)\n",
" loss = loss_fn(pred, y)\n",
"\n",
" # Backpropagation\n",
" optimizer.zero_grad()\n",
" loss.backward()\n",
" optimizer.step()\n",
"\n",
" if batch % 100 == 0:\n",
" loss, current = loss.item(), batch * len(X)\n",
" print(f\"loss: {loss:>7f} [{current:>5d}/{size:>5d}]\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We also check the model's performance against the test dataset to ensure it is learning.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"def test(dataloader, model, loss_fn):\n",
" size = len(dataloader.dataset)\n",
" num_batches = len(dataloader)\n",
" model.eval()\n",
" test_loss, correct = 0, 0\n",
" with torch.no_grad():\n",
" for X, y in dataloader:\n",
" X, y = X.to(device), y.to(device)\n",
" pred = model(X)\n",
" test_loss += loss_fn(pred, y).item()\n",
" correct += (pred.argmax(1) == y).type(torch.float).sum().item()\n",
" test_loss /= num_batches\n",
" correct /= size\n",
" print(f\"Test Error: \\n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The training process is conducted over several iterations (*epochs*). During each epoch, the model learns\n",
"parameters to make better predictions. We print the model's accuracy and loss at each epoch; we'd like to see the\n",
"accuracy increase and the loss decrease with every epoch.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"epochs = 5\n",
"for t in range(epochs):\n",
" print(f\"Epoch {t+1}\\n-------------------------------\")\n",
" train(train_dataloader, model, loss_fn, optimizer)\n",
" test(test_dataloader, model, loss_fn)\n",
"print(\"Done!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Read more about [Training your model](optimization_tutorial.html).\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"--------------\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Saving Models\n",
"A common way to save a model is to serialize the internal state dictionary (containing the model parameters).\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"torch.save(model.state_dict(), \"model.pth\")\n",
"print(\"Saved PyTorch Model State to model.pth\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading Models\n",
"\n",
"The process for loading a model includes re-creating the model structure and loading\n",
"the state dictionary into it.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"model = NeuralNetwork()\n",
"model.load_state_dict(torch.load(\"model.pth\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This model can now be used to make predictions.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"classes = [\n",
" \"T-shirt/top\",\n",
" \"Trouser\",\n",
" \"Pullover\",\n",
" \"Dress\",\n",
" \"Coat\",\n",
" \"Sandal\",\n",
" \"Shirt\",\n",
" \"Sneaker\",\n",
" \"Bag\",\n",
" \"Ankle boot\",\n",
"]\n",
"\n",
"model.eval()\n",
"x, y = test_data[0][0], test_data[0][1]\n",
"with torch.no_grad():\n",
" pred = model(x)\n",
" predicted, actual = classes[pred[0].argmax(0)], classes[y]\n",
" print(f'Predicted: \"{predicted}\", Actual: \"{actual}\"')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Official PyTorch QuickStart Tutorial Ends here !!\n",
"\n",
"\n",
"# Introducing Amazon SageMaker related code !!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this section, we add Amazon SageMaker related code."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First of all we import sagemaker python SDK and a SageMaker managed PyTorch Framework Estimator. We also mention IAM execution role to be used for Training"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sagemaker # importing sagemaker python SDK\n",
"from sagemaker.pytorch.estimator import PyTorch # import PyTorch Estimator class \n",
"from sagemaker import get_execution_role # import fn to fetch execution role\n",
"\n",
"#Store the execution role. \n",
"#Here the same role used which was used to create a sagemaker studio user profile\n",
"execution_role = get_execution_role()\n"
]
},
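{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: `get_execution_role()` works when the notebook runs with a SageMaker execution role (for example, inside SageMaker Studio). Below is a minimal fallback sketch, assuming you run this notebook locally instead; the role ARN is a placeholder that you would replace with your own SageMaker execution role."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch, assuming a local run: fall back to an explicit role ARN.\n",
"# The ARN below is a placeholder, not a real role.\n",
"try:\n",
"    execution_role = get_execution_role()\n",
"except ValueError:\n",
"    execution_role = \"arn:aws:iam::<account-id>:role/<your-sagemaker-execution-role>\"\n",
"print(execution_role)"
]
},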
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now its time to define the Estimator and start the training job using SageMaker Training APIs. Note that training is happening on a SageMaker managed training cluster and not on the notebook itself."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#Create the estimator object for PyTorch\n",
"estimator = PyTorch(\n",
" entry_point = \"train.py\", # training script\n",
" framework_version = \"1.12\", #PyTorch Framework version, keep it same as used in default example\n",
" py_version = \"py38\", # Compatible Python version to use\n",
" instance_count = 1, #number of EC2 instances needed for training\n",
" instance_type = \"ml.c5.xlarge\", #Type of EC2 instance/s needed for training\n",
" disable_profiler = True, #Disable profiler, as not needed\n",
" role = execution_role #Execution role used by training job\n",
")\n",
"\n",
"#Start the training\n",
"estimator.fit()"
]
},
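{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `entry_point` script `train.py` is not included in this notebook. As a rough illustration only, a minimal sketch could wrap the tutorial code above: download FashionMNIST inside the training container, run the training loop, and save the model to the directory SageMaker exposes via the `SM_MODEL_DIR` environment variable so the artifact gets uploaded to S3. This is a hypothetical sketch, not the actual script used with this gist; note that running the cell would overwrite any existing `train.py`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile train.py\n",
"# Hypothetical minimal train.py sketch reusing the tutorial code above.\n",
"import os\n",
"\n",
"import torch\n",
"from torch import nn\n",
"from torch.utils.data import DataLoader\n",
"from torchvision import datasets\n",
"from torchvision.transforms import ToTensor\n",
"\n",
"def main():\n",
"    # Download the data inside the training container (no input channels yet).\n",
"    training_data = datasets.FashionMNIST(root=\"data\", train=True, download=True, transform=ToTensor())\n",
"    train_dataloader = DataLoader(training_data, batch_size=64)\n",
"\n",
"    device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
"    model = nn.Sequential(\n",
"        nn.Flatten(),\n",
"        nn.Linear(28*28, 512), nn.ReLU(),\n",
"        nn.Linear(512, 512), nn.ReLU(),\n",
"        nn.Linear(512, 10),\n",
"    ).to(device)\n",
"    loss_fn = nn.CrossEntropyLoss()\n",
"    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)\n",
"\n",
"    model.train()\n",
"    for X, y in train_dataloader:\n",
"        X, y = X.to(device), y.to(device)\n",
"        loss = loss_fn(model(X), y)\n",
"        optimizer.zero_grad()\n",
"        loss.backward()\n",
"        optimizer.step()\n",
"\n",
"    # SageMaker uploads everything saved under SM_MODEL_DIR to S3 after training.\n",
"    model_dir = os.environ.get(\"SM_MODEL_DIR\", \".\")\n",
"    torch.save(model.state_dict(), os.path.join(model_dir, \"model.pth\"))\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main()"
]
},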
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Changes for Part 2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this section, we will supply the data from an S3 location to the Estimator and Training script. First thing is to create the sagemaker session object. We use the default sagemaker created bucket. Next, create specific paths for storing training and testing data separately."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"session = sagemaker.Session()\n",
"bucket = session.default_bucket()\n",
"\n",
"train_prefix = \"/pytorch/fashion-mnist/train\"\n",
"test_prefix = \"/pytorch/fashion-mnist/test\""
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"Upload the Training and test data to S3"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sagemaker.s3 import S3Uploader\n",
"\n",
"#Upload training data\n",
"S3Uploader.upload(local_path = \"data/FashionMNIST/raw/train-images-idx3-ubyte.gz\", \n",
" desired_s3_uri = \"s3://\"+bucket+train_prefix, \n",
" kms_key=None, \n",
" sagemaker_session=session)\n",
"\n",
"S3Uploader.upload(local_path = \"data/FashionMNIST/raw/train-labels-idx1-ubyte.gz\", \n",
" desired_s3_uri = \"s3://\"+bucket+train_prefix, \n",
" kms_key=None, \n",
" sagemaker_session=session)\n",
"\n",
"#Upload test data\n",
"S3Uploader.upload(local_path = \"data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz\", \n",
" desired_s3_uri = \"s3://\"+bucket+test_prefix, \n",
" kms_key=None, \n",
" sagemaker_session=session)\n",
"\n",
"S3Uploader.upload(local_path = \"data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz\", \n",
" desired_s3_uri = \"s3://\"+bucket+test_prefix, \n",
" kms_key=None, \n",
" sagemaker_session=session)"
]
},
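{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, verify the uploads by listing the objects under each prefix. `S3Downloader.list` comes from the same `sagemaker.s3` module used above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sagemaker.s3 import S3Downloader\n",
"\n",
"# List the uploaded objects to confirm both files landed under each prefix\n",
"print(S3Downloader.list(\"s3://\"+bucket+train_prefix, sagemaker_session=session))\n",
"print(S3Downloader.list(\"s3://\"+bucket+test_prefix, sagemaker_session=session))"
]
},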
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create SageMaker training input channels to point to train and test data location from S3."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from sagemaker.inputs import TrainingInput\n",
"\n",
"train_input = TrainingInput(s3_data=\"s3://\"+bucket+train_prefix)\n",
"test_input = TrainingInput(s3_data=\"s3://\"+bucket+test_prefix)"
]
},
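{
"cell_type": "markdown",
"metadata": {},
"source": [
"Inside the training container, SageMaker downloads each channel to `/opt/ml/input/data/<channel_name>` and exposes that path through an `SM_CHANNEL_<NAME>` environment variable. The snippet below is a hypothetical sketch of how `train.py` (not shown in this notebook) could locate the uploaded files instead of downloading them."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical sketch for train.py in Part 2: read channel paths from environment variables.\n",
"import os\n",
"\n",
"# SageMaker sets these for the \"train\" and \"test\" channels passed to estimator.fit().\n",
"train_dir = os.environ.get(\"SM_CHANNEL_TRAIN\", \"/opt/ml/input/data/train\")\n",
"test_dir = os.environ.get(\"SM_CHANNEL_TEST\", \"/opt/ml/input/data/test\")\n",
"\n",
"for path in (train_dir, test_dir):\n",
"    if os.path.isdir(path):\n",
"        print(path, os.listdir(path))  # the raw FashionMNIST .gz files uploaded earlier\n",
"    else:\n",
"        print(path, \"only exists inside the training container\")"
]
},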
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"Create the SageMaker PyTorch Estimator, point it to channel created earlier and trigger the training job."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#Create the estimator object for PyTorch\n",
"estimator = PyTorch(\n",
" entry_point = \"train.py\", # training script\n",
" framework_version = \"1.12\", #PyTorch Framework version, keep it same as used in default example\n",
" py_version = \"py38\", # Compatible Python version to use\n",
" instance_count = 1, #number of EC2 instances needed for training\n",
" instance_type = \"ml.c5.xlarge\", #Type of EC2 instance/s needed for training\n",
" disable_profiler = True, #Disable profiler, as not needed\n",
" role = execution_role #Execution role used by training job\n",
")\n",
"\n",
"inputs = {\"train\":train_input, \"test\": test_input}\n",
"\n",
"#Start the training\n",
"\n",
"estimator.fit(inputs)"
]
}
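,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After the training job completes, the S3 location of the packaged model artifact is available on the estimator via its standard `model_data` attribute; the print below simply shows where SageMaker uploaded the resulting `model.tar.gz`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# S3 URI of the model artifact produced by the training job\n",
"print(estimator.model_data)"
]
}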
],
"metadata": {
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (PyTorch 1.12 Python 3.8 CPU Optimized)",
"language": "python",
"name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:eu-west-1:470317259841:image/pytorch-1.12-cpu-py38"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}