@bllchmbrs
Created April 16, 2020 20:48
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hyperparameter Tuning with PyTorch & RayTune\n",
"\n",
"This notebook walks you through the basics of using [RayTune](https://ray.readthedocs.io/en/latest/tune.html), with a PyTorch model as the example.\n",
"\n",
"We'll follow a simple process:\n",
"1. We'll first create a model and train it, just like we might on a single node.\n",
"2. We'll then make the slight modifications needed to turn it into a distributed hyperparameter search.\n",
"3. We'll then run it on RayTune and see the results.\n",
"\n",
"\n",
"Let's get started. First, we'll bring in our core imports. We'll be training a ConvNet model on the MNIST dataset."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"import os \n",
"\n",
"import ray\n",
"\n",
"from torchvision import datasets, transforms\n",
"\n",
"import torch\n",
"import torch.optim as optim\n",
"import torch.nn as nn\n",
"import torch.nn.functional as F\n",
"\n",
"from filelock import FileLock"
]
},
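{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll set two global constants that cap how many samples we train on per epoch (`EPOCH_SIZE`) and how many we evaluate on (`TEST_SIZE`). This keeps each run fast."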
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"EPOCH_SIZE = 512\n",
"TEST_SIZE = 256"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Single Node PyTorch Hyperparameter Tuning\n",
"\n",
"Our example follows nearly the same code that you can find in the [PyTorch MNIST example here](https://github.com/pytorch/examples/blob/master/mnist/main.py).\n",
"\n",
"We create an even simpler model than the one in that example. You can swap in the original if you want to try for better predictions."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"class ConvNet(nn.Module):\n",
"    def __init__(self):\n",
"        super(ConvNet, self).__init__()\n",
"        self.conv1 = nn.Conv2d(1, 3, kernel_size=3)\n",
"        self.fc = nn.Linear(192, 10)\n",
"\n",
"    def forward(self, x):\n",
"        x = F.relu(F.max_pool2d(self.conv1(x), 3))\n",
"        x = x.view(-1, 192)\n",
"        x = self.fc(x)\n",
"        return F.log_softmax(x, dim=1)"
]
},
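{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, here is a minimal sketch of where the `192` in `nn.Linear(192, 10)` comes from. It assumes the standard 28x28 single-channel MNIST input. The `dummy` tensor and the freshly built `conv` layer are illustrative names only and are not part of the training code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import torch.nn.functional as F\n",
"\n",
"# Illustrative only: push one fake image through the same conv + pool steps as ConvNet.\n",
"dummy = torch.zeros(1, 1, 28, 28)  # (batch, channels, height, width)\n",
"conv = nn.Conv2d(1, 3, kernel_size=3)  # 3x3 kernel, no padding: 28 -> 26\n",
"pooled = F.max_pool2d(conv(dummy), 3)  # 3x3 pooling with stride 3: 26 -> 8\n",
"\n",
"print(pooled.shape)  # torch.Size([1, 3, 8, 8])\n",
"print(pooled.view(-1, 192).shape)  # 3 channels * 8 * 8 = 192 features per image"
]
},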
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After creating that network, we can now create our data loaders for training and test data. These are just plain [PyTorch dataloaders](https://pytorch.org/docs/1.1.0/data.html?highlight=dataloader#torch.utils.data.DataLoader), except that we've added a `FileLock` to ensure that only one process downloads the data on each machine (in case we have multiple workers per machine on our Ray cluster).\n",
"\n",
"Other than that, there's nothing that's changed from the [PyTorch example version](https://github.com/pytorch/examples/blob/master/mnist/main.py#L101)."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"def get_data_loaders():\n",
"    mnist_transforms = transforms.Compose(\n",
"        [transforms.ToTensor(),\n",
"         transforms.Normalize((0.1307, ), (0.3081, ))])\n",
"\n",
"    # We add FileLock here because multiple workers will want to\n",
"    # download data, and this may cause overwrites since\n",
"    # DataLoader is not threadsafe.\n",
"    # This is only relevant in the distributed (multi-worker) case.\n",
"    with FileLock(os.path.expanduser(\"~/data.lock\")):\n",
"        train_loader = torch.utils.data.DataLoader(\n",
"            datasets.MNIST(\n",
"                \"/tmp/data\",\n",
"                train=True,\n",
"                download=True,\n",
"                transform=mnist_transforms),\n",
"            batch_size=64,\n",
"            shuffle=True)\n",
"\n",
"    test_loader = torch.utils.data.DataLoader(\n",
"        datasets.MNIST(\"/tmp/data\", train=False, transform=mnist_transforms),\n",
"        batch_size=64,\n",
"        shuffle=True)\n",
"    return train_loader, test_loader"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We've defined how we're going to download, load, and preprocess the data. Now it's time to define our training and test functions. The arguments are ordered a bit differently from the PyTorch example we've referenced, but the difference is inconsequential. The training function takes a model, an optimizer, the train loader, and a device, and then trains the model."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def train(model, optimizer, train_loader, device=torch.device(\"cpu\")):\n",
"    model.train()\n",
"    for batch_idx, (data, target) in enumerate(train_loader):\n",
"        # Stop early so each call only sees roughly EPOCH_SIZE samples.\n",
"        if batch_idx * len(data) > EPOCH_SIZE:\n",
"            return\n",
"        data, target = data.to(device), target.to(device)\n",
"        optimizer.zero_grad()\n",
"        output = model(data)\n",
"        loss = F.nll_loss(output, target)\n",
"        loss.backward()\n",
"        optimizer.step()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's the same story for our test function. We track a single, simple metric here: accuracy, the fraction of predictions that are correct. We could calculate more metrics as well, but we're keeping it simple."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def test(model, data_loader, device=torch.device(\"cpu\")):\n",
"    model.eval()\n",
"    correct = 0\n",
"    total = 0\n",
"    with torch.no_grad():\n",
"        for batch_idx, (data, target) in enumerate(data_loader):\n",
"            if batch_idx * len(data) > TEST_SIZE:\n",
"                break\n",
"            data, target = data.to(device), target.to(device)\n",
"            outputs = model(data)\n",
"            _, predicted = torch.max(outputs.data, 1)\n",
"            total += target.size(0)\n",
"            correct += (predicted == target).sum().item()\n",
"\n",
"    return correct / total"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Lastly, we'll create a wrapper function for this particular model. All we need to do is pass in the configuration we'd like to train with. The function then gets the data, creates the model, and optimizes it accordingly."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"def train_mnist(config):\n",
"    train_loader, test_loader = get_data_loaders()\n",
"    model = ConvNet()\n",
"    optimizer = optim.SGD(model.parameters(), lr=config[\"lr\"], momentum=config[\"momentum\"])\n",
"    for i in range(10):\n",
"        train(model, optimizer, train_loader)\n",
"        acc = test(model, test_loader)\n",
"        print(acc)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Single-Node Hyperparameter Tuning\n",
"\n",
"Now, let's show what we might have to do if we were going to perform hyperparameter tuning on a single machine. We would have to enumerate all the possibilities and either train them serially or use something like multiprocessing to train them in parallel. That setup takes a little bit of work, so people often opt to train them serially and simply wait for the sweep to finish.\n",
"\n",
"This is what that might end up looking like."
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"12\n"
]
}
],
"source": [
"import itertools\n",
"conf = {\n",
"    \"lr\": [0.001, 0.01, 0.1],\n",
"    \"momentum\": [0.001, 0.01, 0.1, 0.9]\n",
"}\n",
"\n",
"combinations = list(itertools.product(*conf.values()))\n",
"print(len(combinations))"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.175\n",
"0.228125\n",
"0.30625\n",
"0.36875\n",
"0.409375\n",
"0.571875\n",
"0.61875\n",
"0.6625\n",
"0.7125\n",
"0.75625\n"
]
}
],
"source": [
"for lr, momentum in combinations:\n",
"    train_mnist({\"lr\": lr, \"momentum\": momentum})\n",
"    break  # we'll stop this after one run and just use it for illustrative purposes"
]
},
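{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, here is a rough sketch of the do-it-yourself parallel route mentioned above. It is an illustration rather than a recommendation: it swaps in a thread pool from `concurrent.futures` in place of `multiprocessing`, assumes `train_mnist` and `combinations` from the cells above, and uses hypothetical choices (`configs`, `max_workers=4`, only the first four configurations) picked purely for this example. The accuracies printed by different runs will interleave, which is part of the bookkeeping RayTune takes off our hands."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from concurrent.futures import ThreadPoolExecutor\n",
"\n",
"# Turn each (lr, momentum) tuple into the config dict that train_mnist expects.\n",
"configs = [{\"lr\": lr, \"momentum\": momentum} for lr, momentum in combinations]\n",
"\n",
"# Run a handful of configurations concurrently; 4 workers is an arbitrary choice.\n",
"with ThreadPoolExecutor(max_workers=4) as pool:\n",
"    # list() consumes the lazy map so any exception raised in a worker surfaces here.\n",
"    list(pool.map(train_mnist, configs[:4]))"
]
},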
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### RayTune: Distributed Hyperparameter Tuning\n",
"\n",
"Now we've seen how you might approach the problem in a single-node world. With RayTune, it becomes trivial to move your code from a single node to multiple nodes. Let's take a look at the changes we need to make to achieve that.\n",
"\n",
"First, let's import Ray and Tune. The `ray.init(address='auto')` call is commented out below; uncomment it to connect to an existing Ray cluster."
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [],
"source": [
"import ray\n",
"\n",
"ray.shutdown()\n",
"# ray.init(address='auto')\n",
"from ray import tune"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first minor change is that we'll specify that we want to perform a strict `grid_search` on our hyperparameters."
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {},
"outputs": [],
"source": [
"conf = {\n",
"    \"lr\": tune.grid_search([0.001, 0.01, 0.1]),\n",
"    \"momentum\": tune.grid_search([0.001, 0.01, 0.1, 0.9])\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's take our simple training function and add a single line: `tune.track.log(mean_accuracy=acc)`.\n",
"\n",
"That's all that we need to change in order for RayTune to be able to parallelize our different hyperparameter combinations. When we're executing a hyperparameter sweep, we're executing an **experiment**. Each distinct combination of hyperparameters is a single **trial**.\n",
"\n",
"In the following example, we're using the **functional API**. This makes it easy to get something up and running, but it provides less control overall than the **class API** (`tune.Trainable`)."
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [],
"source": [
"def train_mnist(config):\n",
"    train_loader, test_loader = get_data_loaders()\n",
"    model = ConvNet()\n",
"    # Pass momentum through as well, since it is part of the search space.\n",
"    optimizer = optim.SGD(model.parameters(), lr=config[\"lr\"], momentum=config[\"momentum\"])\n",
"    for i in range(10):\n",
"        train(model, optimizer, train_loader)\n",
"        acc = test(model, test_loader)\n",
"        tune.track.log(mean_accuracy=acc)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's an example of the **class API**. Note that `_setup` is called **once per trial**, while the number of times `_train` is called is determined by the stopping condition that we pass to the `tune.run` call below: `stop={\"training_iteration\": 10}`."
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [],
"source": [
"class TrainMNIST(tune.Trainable):\n",
"    def _setup(self, config):\n",
"        self.config = config\n",
"        self.train_loader, self.test_loader = get_data_loaders()\n",
"        self.model = ConvNet()\n",
"        # Pass momentum through as well, since it is part of the search space.\n",
"        self.optimizer = optim.SGD(\n",
"            self.model.parameters(), lr=self.config[\"lr\"], momentum=self.config[\"momentum\"])\n",
"\n",
"    def _train(self):\n",
"        train(self.model, self.optimizer, self.train_loader)\n",
"        acc = test(self.model, self.test_loader)\n",
"        return {\"mean_accuracy\": acc}"
]
},
{
"cell_type": "code",
"execution_count": 107,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 5.3/8.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 1/8 CPUs, 0/0 GPUs, 0.0/2.49 GiB heap, 0.0/0.83 GiB objects<br>Result logdir: /Users/williamchambers/ray_results/TrainMNIST<br>Number of trials: 12 (11 PENDING, 1 RUNNING)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> lr</th><th style=\"text-align: right;\"> momentum</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>TrainMNIST_00000</td><td>RUNNING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.001</td></tr>\n",
"<tr><td>TrainMNIST_00001</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.001</td></tr>\n",
"<tr><td>TrainMNIST_00002</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.001</td></tr>\n",
"<tr><td>TrainMNIST_00003</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.01 </td></tr>\n",
"<tr><td>TrainMNIST_00004</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.01 </td></tr>\n",
"<tr><td>TrainMNIST_00005</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.01 </td></tr>\n",
"<tr><td>TrainMNIST_00006</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.1 </td></tr>\n",
"<tr><td>TrainMNIST_00007</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.1 </td></tr>\n",
"<tr><td>TrainMNIST_00008</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.1 </td></tr>\n",
"<tr><td>TrainMNIST_00009</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.9 </td></tr>\n",
"<tr><td>TrainMNIST_00010</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.9 </td></tr>\n",
"<tr><td>TrainMNIST_00011</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.9 </td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Result for TrainMNIST_00000:\n",
" date: 2020-04-10_12-34-36\n",
" done: false\n",
" experiment_id: 62e48d2d36e54eaca656a0426bd327dc\n",
" experiment_tag: 0_lr=0.001,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.10625\n",
" node_ip: 192.168.1.13\n",
" pid: 23084\n",
" time_since_restore: 0.32193803787231445\n",
" time_this_iter_s: 0.32193803787231445\n",
" time_total_s: 0.32193803787231445\n",
" timestamp: 1586547276\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00000'\n",
" \n",
"Result for TrainMNIST_00003:\n",
" date: 2020-04-10_12-34-36\n",
" done: false\n",
" experiment_id: ce9f54360cee4c85acb48a2d277f02e5\n",
" experiment_tag: 3_lr=0.001,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.096875\n",
" node_ip: 192.168.1.13\n",
" pid: 23083\n",
" time_since_restore: 0.3322930335998535\n",
" time_this_iter_s: 0.3322930335998535\n",
" time_total_s: 0.3322930335998535\n",
" timestamp: 1586547276\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00003'\n",
" \n",
"Result for TrainMNIST_00001:\n",
" date: 2020-04-10_12-34-36\n",
" done: false\n",
" experiment_id: 026d85fa0db84c6a8df8c3bfd44c399b\n",
" experiment_tag: 1_lr=0.01,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.2\n",
" node_ip: 192.168.1.13\n",
" pid: 23071\n",
" time_since_restore: 0.38395214080810547\n",
" time_this_iter_s: 0.38395214080810547\n",
" time_total_s: 0.38395214080810547\n",
" timestamp: 1586547276\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00001'\n",
" \n",
"Result for TrainMNIST_00004:\n",
" date: 2020-04-10_12-34-36\n",
" done: false\n",
" experiment_id: a0f936131af7463ea0cad3ca3dc8f9e2\n",
" experiment_tag: 4_lr=0.01,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.128125\n",
" node_ip: 192.168.1.13\n",
" pid: 23069\n",
" time_since_restore: 0.4091510772705078\n",
" time_this_iter_s: 0.4091510772705078\n",
" time_total_s: 0.4091510772705078\n",
" timestamp: 1586547276\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00004'\n",
" \n",
"Result for TrainMNIST_00002:\n",
" date: 2020-04-10_12-34-37\n",
" done: false\n",
" experiment_id: 3ee3a41a6d97443a8861775d1c0cf30e\n",
" experiment_tag: 2_lr=0.1,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.509375\n",
" node_ip: 192.168.1.13\n",
" pid: 23070\n",
" time_since_restore: 0.427138090133667\n",
" time_this_iter_s: 0.427138090133667\n",
" time_total_s: 0.427138090133667\n",
" timestamp: 1586547277\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00002'\n",
" \n",
"Result for TrainMNIST_00005:\n",
" date: 2020-04-10_12-34-37\n",
" done: false\n",
" experiment_id: 9f1f537e7a6a4857ab692f2e62670359\n",
" experiment_tag: 5_lr=0.1,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.56875\n",
" node_ip: 192.168.1.13\n",
" pid: 23066\n",
" time_since_restore: 0.40663814544677734\n",
" time_this_iter_s: 0.40663814544677734\n",
" time_total_s: 0.40663814544677734\n",
" timestamp: 1586547277\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00005'\n",
" \n",
"Result for TrainMNIST_00006:\n",
" date: 2020-04-10_12-34-37\n",
" done: false\n",
" experiment_id: b52ae71ec7e94216980927b0d13f3b42\n",
" experiment_tag: 6_lr=0.001,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.10625\n",
" node_ip: 192.168.1.13\n",
" pid: 23095\n",
" time_since_restore: 0.37655019760131836\n",
" time_this_iter_s: 0.37655019760131836\n",
" time_total_s: 0.37655019760131836\n",
" timestamp: 1586547277\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00006'\n",
" \n",
"Result for TrainMNIST_00007:\n",
" date: 2020-04-10_12-34-38\n",
" done: false\n",
" experiment_id: c01bd3ea383c4b96b86bd673053b3753\n",
" experiment_tag: 7_lr=0.01,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.1125\n",
" node_ip: 192.168.1.13\n",
" pid: 23094\n",
" time_since_restore: 0.4034602642059326\n",
" time_this_iter_s: 0.4034602642059326\n",
" time_total_s: 0.4034602642059326\n",
" timestamp: 1586547278\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00007'\n",
" \n",
"Result for TrainMNIST_00000:\n",
" date: 2020-04-10_12-34-39\n",
" done: true\n",
" experiment_id: 62e48d2d36e54eaca656a0426bd327dc\n",
" experiment_tag: 0_lr=0.001,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.096875\n",
" node_ip: 192.168.1.13\n",
" pid: 23084\n",
" time_since_restore: 3.436687469482422\n",
" time_this_iter_s: 0.32843708992004395\n",
" time_total_s: 3.436687469482422\n",
" timestamp: 1586547279\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00000'\n",
" \n",
"Result for TrainMNIST_00003:\n",
" date: 2020-04-10_12-34-39\n",
" done: true\n",
" experiment_id: ce9f54360cee4c85acb48a2d277f02e5\n",
" experiment_tag: 3_lr=0.001,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.090625\n",
" node_ip: 192.168.1.13\n",
" pid: 23083\n",
" time_since_restore: 3.448063373565674\n",
" time_this_iter_s: 0.3288760185241699\n",
" time_total_s: 3.448063373565674\n",
" timestamp: 1586547279\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00003'\n",
" \n",
"Result for TrainMNIST_00001:\n",
" date: 2020-04-10_12-34-39\n",
" done: true\n",
" experiment_id: 026d85fa0db84c6a8df8c3bfd44c399b\n",
" experiment_tag: 1_lr=0.01,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.74375\n",
" node_ip: 192.168.1.13\n",
" pid: 23071\n",
" time_since_restore: 3.423724412918091\n",
" time_this_iter_s: 0.35262584686279297\n",
" time_total_s: 3.423724412918091\n",
" timestamp: 1586547279\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00001'\n",
" \n",
"Result for TrainMNIST_00004:\n",
" date: 2020-04-10_12-34-40\n",
" done: true\n",
" experiment_id: a0f936131af7463ea0cad3ca3dc8f9e2\n",
" experiment_tag: 4_lr=0.01,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.76875\n",
" node_ip: 192.168.1.13\n",
" pid: 23069\n",
" time_since_restore: 3.4991531372070312\n",
" time_this_iter_s: 0.4650881290435791\n",
" time_total_s: 3.4991531372070312\n",
" timestamp: 1586547280\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00004'\n",
" \n"
]
},
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 5.8/8.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 8/8 CPUs, 0/0 GPUs, 0.0/2.49 GiB heap, 0.0/0.83 GiB objects<br>Result logdir: /Users/williamchambers/ray_results/TrainMNIST<br>Number of trials: 12 (8 RUNNING, 4 TERMINATED)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> lr</th><th style=\"text-align: right;\"> momentum</th><th style=\"text-align: right;\"> acc</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>TrainMNIST_00000</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.096875</td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.43669</td></tr>\n",
"<tr><td>TrainMNIST_00001</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.74375 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.42372</td></tr>\n",
"<tr><td>TrainMNIST_00002</td><td>RUNNING </td><td>192.168.1.13:23070</td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.81875 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.10756</td></tr>\n",
"<tr><td>TrainMNIST_00003</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.090625</td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.44806</td></tr>\n",
"<tr><td>TrainMNIST_00004</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.76875 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.49915</td></tr>\n",
"<tr><td>TrainMNIST_00005</td><td>RUNNING </td><td>192.168.1.13:23066</td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.825 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.09171</td></tr>\n",
"<tr><td>TrainMNIST_00006</td><td>RUNNING </td><td>192.168.1.13:23095</td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.1125 </td><td style=\"text-align: right;\"> 7</td><td style=\"text-align: right;\"> 2.40808</td></tr>\n",
"<tr><td>TrainMNIST_00007</td><td>RUNNING </td><td>192.168.1.13:23094</td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.271875</td><td style=\"text-align: right;\"> 7</td><td style=\"text-align: right;\"> 2.71676</td></tr>\n",
"<tr><td>TrainMNIST_00008</td><td>RUNNING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"<tr><td>TrainMNIST_00009</td><td>RUNNING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"<tr><td>TrainMNIST_00010</td><td>RUNNING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"<tr><td>TrainMNIST_00011</td><td>RUNNING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Result for TrainMNIST_00002:\n",
" date: 2020-04-10_12-34-40\n",
" done: true\n",
" experiment_id: 3ee3a41a6d97443a8861775d1c0cf30e\n",
" experiment_tag: 2_lr=0.1,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.853125\n",
" node_ip: 192.168.1.13\n",
" pid: 23070\n",
" time_since_restore: 3.7694034576416016\n",
" time_this_iter_s: 0.66184401512146\n",
" time_total_s: 3.7694034576416016\n",
" timestamp: 1586547280\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00002'\n",
" \n",
"Result for TrainMNIST_00005:\n",
" date: 2020-04-10_12-34-40\n",
" done: true\n",
" experiment_id: 9f1f537e7a6a4857ab692f2e62670359\n",
" experiment_tag: 5_lr=0.1,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.8375\n",
" node_ip: 192.168.1.13\n",
" pid: 23066\n",
" time_since_restore: 3.804111957550049\n",
" time_this_iter_s: 0.7124001979827881\n",
" time_total_s: 3.804111957550049\n",
" timestamp: 1586547280\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00005'\n",
" \n",
"Result for TrainMNIST_00006:\n",
" date: 2020-04-10_12-34-41\n",
" done: true\n",
" experiment_id: b52ae71ec7e94216980927b0d13f3b42\n",
" experiment_tag: 6_lr=0.001,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.1375\n",
" node_ip: 192.168.1.13\n",
" pid: 23095\n",
" time_since_restore: 4.054274797439575\n",
" time_this_iter_s: 0.4042949676513672\n",
" time_total_s: 4.054274797439575\n",
" timestamp: 1586547281\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00006'\n",
" \n",
"Result for TrainMNIST_00007:\n",
" date: 2020-04-10_12-34-41\n",
" done: true\n",
" experiment_id: c01bd3ea383c4b96b86bd673053b3753\n",
" experiment_tag: 7_lr=0.01,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.33125\n",
" node_ip: 192.168.1.13\n",
" pid: 23094\n",
" time_since_restore: 4.120349884033203\n",
" time_this_iter_s: 0.3263993263244629\n",
" time_total_s: 4.120349884033203\n",
" timestamp: 1586547281\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00007'\n",
" \n",
"Result for TrainMNIST_00008:\n",
" date: 2020-04-10_12-34-41\n",
" done: false\n",
" experiment_id: 9c012f6a63254dcf84feaa437842d0d8\n",
" experiment_tag: 8_lr=0.1,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.43125\n",
" node_ip: 192.168.1.13\n",
" pid: 23108\n",
" time_since_restore: 0.3416628837585449\n",
" time_this_iter_s: 0.3416628837585449\n",
" time_total_s: 0.3416628837585449\n",
" timestamp: 1586547281\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: 00008\n",
" \n",
"Result for TrainMNIST_00009:\n",
" date: 2020-04-10_12-34-42\n",
" done: false\n",
" experiment_id: 5d89e8380aa34592886927a6d93929cd\n",
" experiment_tag: 9_lr=0.001,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.09375\n",
" node_ip: 192.168.1.13\n",
" pid: 23117\n",
" time_since_restore: 0.2519979476928711\n",
" time_this_iter_s: 0.2519979476928711\n",
" time_total_s: 0.2519979476928711\n",
" timestamp: 1586547282\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: 00009\n",
" \n",
"Result for TrainMNIST_00010:\n",
" date: 2020-04-10_12-34-42\n",
" done: false\n",
" experiment_id: 5834009298c2441b86a2f31f18181e40\n",
" experiment_tag: 10_lr=0.01,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.175\n",
" node_ip: 192.168.1.13\n",
" pid: 23116\n",
" time_since_restore: 0.24416112899780273\n",
" time_this_iter_s: 0.24416112899780273\n",
" time_total_s: 0.24416112899780273\n",
" timestamp: 1586547282\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00010'\n",
" \n",
"Result for TrainMNIST_00011:\n",
" date: 2020-04-10_12-34-42\n",
" done: false\n",
" experiment_id: 57e8999280cb45ccb58932f5ffc490fb\n",
" experiment_tag: 11_lr=0.1,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.553125\n",
" node_ip: 192.168.1.13\n",
" pid: 23115\n",
" time_since_restore: 0.24759411811828613\n",
" time_this_iter_s: 0.24759411811828613\n",
" time_total_s: 0.24759411811828613\n",
" timestamp: 1586547282\n",
" timesteps_since_restore: 0\n",
" training_iteration: 1\n",
" trial_id: '00011'\n",
" \n",
"Result for TrainMNIST_00008:\n",
" date: 2020-04-10_12-34-44\n",
" done: true\n",
" experiment_id: 9c012f6a63254dcf84feaa437842d0d8\n",
" experiment_tag: 8_lr=0.1,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.8625\n",
" node_ip: 192.168.1.13\n",
" pid: 23108\n",
" time_since_restore: 2.4410650730133057\n",
" time_this_iter_s: 0.2194068431854248\n",
" time_total_s: 2.4410650730133057\n",
" timestamp: 1586547284\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: 00008\n",
" \n",
"Result for TrainMNIST_00009:\n",
" date: 2020-04-10_12-34-44\n",
" done: true\n",
" experiment_id: 5d89e8380aa34592886927a6d93929cd\n",
" experiment_tag: 9_lr=0.001,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.153125\n",
" node_ip: 192.168.1.13\n",
" pid: 23117\n",
" time_since_restore: 2.289651870727539\n",
" time_this_iter_s: 0.2291269302368164\n",
" time_total_s: 2.289651870727539\n",
" timestamp: 1586547284\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: 00009\n",
" \n",
"Result for TrainMNIST_00010:\n",
" date: 2020-04-10_12-34-44\n",
" done: true\n",
" experiment_id: 5834009298c2441b86a2f31f18181e40\n",
" experiment_tag: 10_lr=0.01,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.734375\n",
" node_ip: 192.168.1.13\n",
" pid: 23116\n",
" time_since_restore: 2.2675728797912598\n",
" time_this_iter_s: 0.23127388954162598\n",
" time_total_s: 2.2675728797912598\n",
" timestamp: 1586547284\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00010'\n",
" \n",
"Result for TrainMNIST_00011:\n",
" date: 2020-04-10_12-34-44\n",
" done: true\n",
" experiment_id: 57e8999280cb45ccb58932f5ffc490fb\n",
" experiment_tag: 11_lr=0.1,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 10\n",
" mean_accuracy: 0.88125\n",
" node_ip: 192.168.1.13\n",
" pid: 23115\n",
" time_since_restore: 2.293020486831665\n",
" time_this_iter_s: 0.21927618980407715\n",
" time_total_s: 2.293020486831665\n",
" timestamp: 1586547284\n",
" timesteps_since_restore: 0\n",
" training_iteration: 10\n",
" trial_id: '00011'\n",
" \n"
]
},
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 5.5/8.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/2.49 GiB heap, 0.0/0.83 GiB objects<br>Result logdir: /Users/williamchambers/ray_results/TrainMNIST<br>Number of trials: 12 (12 TERMINATED)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> lr</th><th style=\"text-align: right;\"> momentum</th><th style=\"text-align: right;\"> acc</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>TrainMNIST_00000</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.096875</td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.43669</td></tr>\n",
"<tr><td>TrainMNIST_00001</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.74375 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.42372</td></tr>\n",
"<tr><td>TrainMNIST_00002</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.853125</td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.7694 </td></tr>\n",
"<tr><td>TrainMNIST_00003</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.090625</td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.44806</td></tr>\n",
"<tr><td>TrainMNIST_00004</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.76875 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.49915</td></tr>\n",
"<tr><td>TrainMNIST_00005</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.8375 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 3.80411</td></tr>\n",
"<tr><td>TrainMNIST_00006</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.1375 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 4.05427</td></tr>\n",
"<tr><td>TrainMNIST_00007</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.33125 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 4.12035</td></tr>\n",
"<tr><td>TrainMNIST_00008</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.8625 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 2.44107</td></tr>\n",
"<tr><td>TrainMNIST_00009</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\">0.153125</td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 2.28965</td></tr>\n",
"<tr><td>TrainMNIST_00010</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\">0.734375</td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 2.26757</td></tr>\n",
"<tr><td>TrainMNIST_00011</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\">0.88125 </td><td style=\"text-align: right;\"> 10</td><td style=\"text-align: right;\"> 2.29302</td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
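"# Run the search with the class-based API, stopping each trial after 10 training iterations.\n",
"analysis = tune.run(TrainMNIST, config=conf, stop={\"training_iteration\": 10})\n",
"# To run the search with the function-based API instead, use the following:\n",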
"# analysis = tune.run(train_mnist, config=conf)"
]
},
{
"cell_type": "code",
"execution_count": 108,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best config: {'lr': 0.1, 'momentum': 0.1}\n"
]
}
],
"source": [
"print(\"Best config:\", analysis.get_best_config(metric=\"mean_accuracy\", mode=\"max\"))"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [],
"source": [
"# Get a dataframe for analyzing trial results.\n",
"df = analysis.dataframe()"
]
},
{
"cell_type": "code",
"execution_count": 110,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mean_accuracy</th>\n",
" <th>done</th>\n",
" <th>timesteps_total</th>\n",
" <th>episodes_total</th>\n",
" <th>training_iteration</th>\n",
" <th>experiment_id</th>\n",
" <th>date</th>\n",
" <th>timestamp</th>\n",
" <th>time_this_iter_s</th>\n",
" <th>time_total_s</th>\n",
" <th>...</th>\n",
" <th>hostname</th>\n",
" <th>node_ip</th>\n",
" <th>time_since_restore</th>\n",
" <th>timesteps_since_restore</th>\n",
" <th>iterations_since_restore</th>\n",
" <th>trial_id</th>\n",
" <th>experiment_tag</th>\n",
" <th>config/lr</th>\n",
" <th>config/momentum</th>\n",
" <th>logdir</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>0.881250</td>\n",
" <td>True</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>57e8999280cb45ccb58932f5ffc490fb</td>\n",
" <td>2020-04-10_12-34-44</td>\n",
" <td>1586547284</td>\n",
" <td>0.219276</td>\n",
" <td>2.293020</td>\n",
" <td>...</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>2.293020</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>11</td>\n",
" <td>11_lr=0.1,momentum=0.9</td>\n",
" <td>0.10</td>\n",
" <td>0.900</td>\n",
" <td>/Users/williamchambers/ray_results/TrainMNIST/...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>0.862500</td>\n",
" <td>True</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>9c012f6a63254dcf84feaa437842d0d8</td>\n",
" <td>2020-04-10_12-34-44</td>\n",
" <td>1586547284</td>\n",
" <td>0.219407</td>\n",
" <td>2.441065</td>\n",
" <td>...</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>2.441065</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>8</td>\n",
" <td>8_lr=0.1,momentum=0.1</td>\n",
" <td>0.10</td>\n",
" <td>0.100</td>\n",
" <td>/Users/williamchambers/ray_results/TrainMNIST/...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.853125</td>\n",
" <td>True</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>3ee3a41a6d97443a8861775d1c0cf30e</td>\n",
" <td>2020-04-10_12-34-40</td>\n",
" <td>1586547280</td>\n",
" <td>0.661844</td>\n",
" <td>3.769403</td>\n",
" <td>...</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>3.769403</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>2</td>\n",
" <td>2_lr=0.1,momentum=0.001</td>\n",
" <td>0.10</td>\n",
" <td>0.001</td>\n",
" <td>/Users/williamchambers/ray_results/TrainMNIST/...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>0.837500</td>\n",
" <td>True</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>9f1f537e7a6a4857ab692f2e62670359</td>\n",
" <td>2020-04-10_12-34-40</td>\n",
" <td>1586547280</td>\n",
" <td>0.712400</td>\n",
" <td>3.804112</td>\n",
" <td>...</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>3.804112</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>5</td>\n",
" <td>5_lr=0.1,momentum=0.01</td>\n",
" <td>0.10</td>\n",
" <td>0.010</td>\n",
" <td>/Users/williamchambers/ray_results/TrainMNIST/...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0.768750</td>\n",
" <td>True</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>a0f936131af7463ea0cad3ca3dc8f9e2</td>\n",
" <td>2020-04-10_12-34-40</td>\n",
" <td>1586547280</td>\n",
" <td>0.465088</td>\n",
" <td>3.499153</td>\n",
" <td>...</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>3.499153</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>4</td>\n",
" <td>4_lr=0.01,momentum=0.01</td>\n",
" <td>0.01</td>\n",
" <td>0.010</td>\n",
" <td>/Users/williamchambers/ray_results/TrainMNIST/...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" mean_accuracy done timesteps_total episodes_total training_iteration \\\n",
"11 0.881250 True NaN NaN 10 \n",
"8 0.862500 True NaN NaN 10 \n",
"2 0.853125 True NaN NaN 10 \n",
"5 0.837500 True NaN NaN 10 \n",
"4 0.768750 True NaN NaN 10 \n",
"\n",
" experiment_id date timestamp \\\n",
"11 57e8999280cb45ccb58932f5ffc490fb 2020-04-10_12-34-44 1586547284 \n",
"8 9c012f6a63254dcf84feaa437842d0d8 2020-04-10_12-34-44 1586547284 \n",
"2 3ee3a41a6d97443a8861775d1c0cf30e 2020-04-10_12-34-40 1586547280 \n",
"5 9f1f537e7a6a4857ab692f2e62670359 2020-04-10_12-34-40 1586547280 \n",
"4 a0f936131af7463ea0cad3ca3dc8f9e2 2020-04-10_12-34-40 1586547280 \n",
"\n",
" time_this_iter_s time_total_s ... hostname node_ip \\\n",
"11 0.219276 2.293020 ... billmp.lan 192.168.1.13 \n",
"8 0.219407 2.441065 ... billmp.lan 192.168.1.13 \n",
"2 0.661844 3.769403 ... billmp.lan 192.168.1.13 \n",
"5 0.712400 3.804112 ... billmp.lan 192.168.1.13 \n",
"4 0.465088 3.499153 ... billmp.lan 192.168.1.13 \n",
"\n",
" time_since_restore timesteps_since_restore iterations_since_restore \\\n",
"11 2.293020 0 10 \n",
"8 2.441065 0 10 \n",
"2 3.769403 0 10 \n",
"5 3.804112 0 10 \n",
"4 3.499153 0 10 \n",
"\n",
" trial_id experiment_tag config/lr config/momentum \\\n",
"11 11 11_lr=0.1,momentum=0.9 0.10 0.900 \n",
"8 8 8_lr=0.1,momentum=0.1 0.10 0.100 \n",
"2 2 2_lr=0.1,momentum=0.001 0.10 0.001 \n",
"5 5 5_lr=0.1,momentum=0.01 0.10 0.010 \n",
"4 4 4_lr=0.01,momentum=0.01 0.01 0.010 \n",
"\n",
" logdir \n",
"11 /Users/williamchambers/ray_results/TrainMNIST/... \n",
"8 /Users/williamchambers/ray_results/TrainMNIST/... \n",
"2 /Users/williamchambers/ray_results/TrainMNIST/... \n",
"5 /Users/williamchambers/ray_results/TrainMNIST/... \n",
"4 /Users/williamchambers/ray_results/TrainMNIST/... \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 110,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Show the five trials with the highest last-reported mean_accuracy.\n",
"df.sort_values('mean_accuracy', ascending=False).head()"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 5.1/8.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 1/8 CPUs, 0/0 GPUs, 0.0/2.49 GiB heap, 0.0/0.83 GiB objects<br>Result logdir: /Users/williamchambers/ray_results/train_mnist<br>Number of trials: 12 (11 PENDING, 1 RUNNING)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> lr</th><th style=\"text-align: right;\"> momentum</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>train_mnist_00000</td><td>RUNNING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.001</td></tr>\n",
"<tr><td>train_mnist_00001</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.001</td></tr>\n",
"<tr><td>train_mnist_00002</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.001</td></tr>\n",
"<tr><td>train_mnist_00003</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.01 </td></tr>\n",
"<tr><td>train_mnist_00004</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.01 </td></tr>\n",
"<tr><td>train_mnist_00005</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.01 </td></tr>\n",
"<tr><td>train_mnist_00006</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.1 </td></tr>\n",
"<tr><td>train_mnist_00007</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.1 </td></tr>\n",
"<tr><td>train_mnist_00008</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.1 </td></tr>\n",
"<tr><td>train_mnist_00009</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.9 </td></tr>\n",
"<tr><td>train_mnist_00010</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.9 </td></tr>\n",
"<tr><td>train_mnist_00011</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.9 </td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Result for train_mnist_00000:\n",
" date: 2020-04-10_12-30-52\n",
" done: false\n",
" experiment_id: 481535da8ac64c22a40d6ebe5816aea1\n",
" experiment_tag: 0_lr=0.001,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.165625\n",
" node_ip: 192.168.1.13\n",
" pid: 22981\n",
" time_since_restore: 0.5230550765991211\n",
" time_this_iter_s: 0.5230550765991211\n",
" time_total_s: 0.5230550765991211\n",
" timestamp: 1586547052\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00000'\n",
" \n",
"Result for train_mnist_00002:\n",
" date: 2020-04-10_12-30-52\n",
" done: false\n",
" experiment_id: 1b8189b437c646de8623cb4db8dda0dc\n",
" experiment_tag: 2_lr=0.1,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.515625\n",
" node_ip: 192.168.1.13\n",
" pid: 22984\n",
" time_since_restore: 0.5195770263671875\n",
" time_this_iter_s: 0.5195770263671875\n",
" time_total_s: 0.5195770263671875\n",
" timestamp: 1586547052\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00002'\n",
" \n",
"Result for train_mnist_00001:\n",
" date: 2020-04-10_12-30-52\n",
" done: false\n",
" experiment_id: 8e84b35eeb2640dfbf6eb611c818b54f\n",
" experiment_tag: 1_lr=0.01,momentum=0.001\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.11875\n",
" node_ip: 192.168.1.13\n",
" pid: 22980\n",
" time_since_restore: 0.5445702075958252\n",
" time_this_iter_s: 0.5445702075958252\n",
" time_total_s: 0.5445702075958252\n",
" timestamp: 1586547052\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00001'\n",
" \n",
"Result for train_mnist_00005:\n",
" date: 2020-04-10_12-30-53\n",
" done: false\n",
" experiment_id: 65e4bd8f7f3e4689be98db0a3f0005b9\n",
" experiment_tag: 5_lr=0.1,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.390625\n",
" node_ip: 192.168.1.13\n",
" pid: 23016\n",
" time_since_restore: 0.357227087020874\n",
" time_this_iter_s: 0.357227087020874\n",
" time_total_s: 0.357227087020874\n",
" timestamp: 1586547053\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00005'\n",
" \n",
"Result for train_mnist_00007:\n",
" date: 2020-04-10_12-30-53\n",
" done: false\n",
" experiment_id: 1952564c89104e6cb900bf4cf75e445d\n",
" experiment_tag: 7_lr=0.01,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.103125\n",
" node_ip: 192.168.1.13\n",
" pid: 23018\n",
" time_since_restore: 0.40741610527038574\n",
" time_this_iter_s: 0.40741610527038574\n",
" time_total_s: 0.40741610527038574\n",
" timestamp: 1586547053\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00007'\n",
" \n",
"Result for train_mnist_00003:\n",
" date: 2020-04-10_12-30-53\n",
" done: false\n",
" experiment_id: 7c4a5139ee104ae8b3a8d131c02f48ad\n",
" experiment_tag: 3_lr=0.001,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.071875\n",
" node_ip: 192.168.1.13\n",
" pid: 23014\n",
" time_since_restore: 0.46332716941833496\n",
" time_this_iter_s: 0.46332716941833496\n",
" time_total_s: 0.46332716941833496\n",
" timestamp: 1586547053\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00003'\n",
" \n",
"Result for train_mnist_00004:\n",
" date: 2020-04-10_12-30-53\n",
" done: false\n",
" experiment_id: c53da9513ea740ef8a81b5b4384f4ac3\n",
" experiment_tag: 4_lr=0.01,momentum=0.01\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.09375\n",
" node_ip: 192.168.1.13\n",
" pid: 23015\n",
" time_since_restore: 0.5289340019226074\n",
" time_this_iter_s: 0.5289340019226074\n",
" time_total_s: 0.5289340019226074\n",
" timestamp: 1586547053\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00004'\n",
" \n",
"Result for train_mnist_00006:\n",
" date: 2020-04-10_12-30-54\n",
" done: false\n",
" experiment_id: ee9175d698074ae39aee8c6d72aa9968\n",
" experiment_tag: 6_lr=0.001,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.1\n",
" node_ip: 192.168.1.13\n",
" pid: 23017\n",
" time_since_restore: 0.611781120300293\n",
" time_this_iter_s: 0.611781120300293\n",
" time_total_s: 0.611781120300293\n",
" timestamp: 1586547054\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00006'\n",
" \n"
]
},
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 5.8/8.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 8/8 CPUs, 0/0 GPUs, 0.0/2.49 GiB heap, 0.0/0.83 GiB objects<br>Result logdir: /Users/williamchambers/ray_results/train_mnist<br>Number of trials: 12 (4 PENDING, 8 RUNNING)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> lr</th><th style=\"text-align: right;\"> momentum</th><th style=\"text-align: right;\"> acc</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>train_mnist_00000</td><td>RUNNING </td><td>192.168.1.13:22981</td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.178125</td><td style=\"text-align: right;\"> 8</td><td style=\"text-align: right;\"> 3.46004</td></tr>\n",
"<tr><td>train_mnist_00001</td><td>RUNNING </td><td>192.168.1.13:22980</td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.746875</td><td style=\"text-align: right;\"> 8</td><td style=\"text-align: right;\"> 3.51447</td></tr>\n",
"<tr><td>train_mnist_00002</td><td>RUNNING </td><td>192.168.1.13:22984</td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.88125 </td><td style=\"text-align: right;\"> 8</td><td style=\"text-align: right;\"> 3.40058</td></tr>\n",
"<tr><td>train_mnist_00003</td><td>RUNNING </td><td>192.168.1.13:23014</td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.103125</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 2.199 </td></tr>\n",
"<tr><td>train_mnist_00004</td><td>RUNNING </td><td>192.168.1.13:23015</td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.33125 </td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 2.34423</td></tr>\n",
"<tr><td>train_mnist_00005</td><td>RUNNING </td><td>192.168.1.13:23016</td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.6625 </td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 2.08736</td></tr>\n",
"<tr><td>train_mnist_00006</td><td>RUNNING </td><td>192.168.1.13:23017</td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.228125</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> 1.95937</td></tr>\n",
"<tr><td>train_mnist_00007</td><td>RUNNING </td><td>192.168.1.13:23018</td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.515625</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 2.15796</td></tr>\n",
"<tr><td>train_mnist_00008</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"<tr><td>train_mnist_00009</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"<tr><td>train_mnist_00010</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"<tr><td>train_mnist_00011</td><td>PENDING </td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td><td style=\"text-align: right;\"> </td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Result for train_mnist_00008:\n",
" date: 2020-04-10_12-30-57\n",
" done: false\n",
" experiment_id: 699bd7607e754533832ce576663b4253\n",
" experiment_tag: 8_lr=0.1,momentum=0.1\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.521875\n",
" node_ip: 192.168.1.13\n",
" pid: 23029\n",
" time_since_restore: 0.41529178619384766\n",
" time_this_iter_s: 0.41529178619384766\n",
" time_total_s: 0.41529178619384766\n",
" timestamp: 1586547057\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: 00008\n",
" \n",
"Result for train_mnist_00010:\n",
" date: 2020-04-10_12-30-57\n",
" done: false\n",
" experiment_id: a0c7de6bd33c4200b78ad6d2069317ed\n",
" experiment_tag: 10_lr=0.01,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.16875\n",
" node_ip: 192.168.1.13\n",
" pid: 23025\n",
" time_since_restore: 0.4081389904022217\n",
" time_this_iter_s: 0.4081389904022217\n",
" time_total_s: 0.4081389904022217\n",
" timestamp: 1586547057\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00010'\n",
" \n",
"Result for train_mnist_00009:\n",
" date: 2020-04-10_12-30-57\n",
" done: false\n",
" experiment_id: c78d631d02fa463a80f4d69990422fc9\n",
" experiment_tag: 9_lr=0.001,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.0875\n",
" node_ip: 192.168.1.13\n",
" pid: 23031\n",
" time_since_restore: 0.4857792854309082\n",
" time_this_iter_s: 0.4857792854309082\n",
" time_total_s: 0.4857792854309082\n",
" timestamp: 1586547057\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: 00009\n",
" \n",
"Result for train_mnist_00011:\n",
" date: 2020-04-10_12-30-58\n",
" done: false\n",
" experiment_id: de1cfc0d7aa64fc9bd74d616885e3506\n",
" experiment_tag: 11_lr=0.1,momentum=0.9\n",
" hostname: billmp.lan\n",
" iterations_since_restore: 1\n",
" mean_accuracy: 0.640625\n",
" node_ip: 192.168.1.13\n",
" pid: 23030\n",
" time_since_restore: 0.2651638984680176\n",
" time_this_iter_s: 0.2651638984680176\n",
" time_total_s: 0.2651638984680176\n",
" timestamp: 1586547058\n",
" timesteps_since_restore: 0\n",
" training_iteration: 0\n",
" trial_id: '00011'\n",
" \n"
]
},
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 5.3/8.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/2.49 GiB heap, 0.0/0.83 GiB objects<br>Result logdir: /Users/williamchambers/ray_results/train_mnist<br>Number of trials: 12 (12 TERMINATED)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> lr</th><th style=\"text-align: right;\"> momentum</th><th style=\"text-align: right;\"> acc</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>train_mnist_00000</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.19375 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.88122</td></tr>\n",
"<tr><td>train_mnist_00001</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.753125</td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 4.02972</td></tr>\n",
"<tr><td>train_mnist_00002</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.001</td><td style=\"text-align: right;\">0.890625</td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.81148</td></tr>\n",
"<tr><td>train_mnist_00003</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.11875 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.80629</td></tr>\n",
"<tr><td>train_mnist_00004</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.521875</td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 4.09952</td></tr>\n",
"<tr><td>train_mnist_00005</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.01 </td><td style=\"text-align: right;\">0.890625</td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.68201</td></tr>\n",
"<tr><td>train_mnist_00006</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.29375 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.95327</td></tr>\n",
"<tr><td>train_mnist_00007</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.68125 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 3.75076</td></tr>\n",
"<tr><td>train_mnist_00008</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.1 </td><td style=\"text-align: right;\">0.896875</td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 2.49928</td></tr>\n",
"<tr><td>train_mnist_00009</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.001</td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\">0.165625</td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 2.56056</td></tr>\n",
"<tr><td>train_mnist_00010</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.01 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\">0.74375 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 2.52453</td></tr>\n",
"<tr><td>train_mnist_00011</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\">0.1 </td><td style=\"text-align: right;\"> 0.9 </td><td style=\"text-align: right;\">0.86875 </td><td style=\"text-align: right;\"> 9</td><td style=\"text-align: right;\"> 2.29106</td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Run the same search with the function-based API (no explicit stop criterion).\n",
"analysis = tune.run(train_mnist, config=conf)"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {},
"outputs": [],
"source": [
"# Get a dataframe for analyzing trial results.\n",
"df = analysis.dataframe()"
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mean_accuracy</th>\n",
" <th>trial_id</th>\n",
" <th>training_iteration</th>\n",
" <th>time_this_iter_s</th>\n",
" <th>done</th>\n",
" <th>timesteps_total</th>\n",
" <th>episodes_total</th>\n",
" <th>experiment_id</th>\n",
" <th>date</th>\n",
" <th>timestamp</th>\n",
" <th>...</th>\n",
" <th>pid</th>\n",
" <th>hostname</th>\n",
" <th>node_ip</th>\n",
" <th>time_since_restore</th>\n",
" <th>timesteps_since_restore</th>\n",
" <th>iterations_since_restore</th>\n",
" <th>experiment_tag</th>\n",
" <th>config/lr</th>\n",
" <th>config/momentum</th>\n",
" <th>logdir</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>0.896875</td>\n",
" <td>8</td>\n",
" <td>9</td>\n",
" <td>0.226021</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>699bd7607e754533832ce576663b4253</td>\n",
" <td>2020-04-10_12-30-59</td>\n",
" <td>1586547059</td>\n",
" <td>...</td>\n",
" <td>23029</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>2.499276</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>8_lr=0.1,momentum=0.1</td>\n",
" <td>0.10</td>\n",
" <td>0.100</td>\n",
" <td>/Users/williamchambers/ray_results/train_mnist...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.890625</td>\n",
" <td>2</td>\n",
" <td>9</td>\n",
" <td>0.410900</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>1b8189b437c646de8623cb4db8dda0dc</td>\n",
" <td>2020-04-10_12-30-55</td>\n",
" <td>1586547055</td>\n",
" <td>...</td>\n",
" <td>22984</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>3.811480</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>2_lr=0.1,momentum=0.001</td>\n",
" <td>0.10</td>\n",
" <td>0.001</td>\n",
" <td>/Users/williamchambers/ray_results/train_mnist...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>0.890625</td>\n",
" <td>5</td>\n",
" <td>9</td>\n",
" <td>0.380373</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>65e4bd8f7f3e4689be98db0a3f0005b9</td>\n",
" <td>2020-04-10_12-30-57</td>\n",
" <td>1586547057</td>\n",
" <td>...</td>\n",
" <td>23016</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>3.682014</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>5_lr=0.1,momentum=0.01</td>\n",
" <td>0.10</td>\n",
" <td>0.010</td>\n",
" <td>/Users/williamchambers/ray_results/train_mnist...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>0.868750</td>\n",
" <td>11</td>\n",
" <td>9</td>\n",
" <td>0.186408</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>de1cfc0d7aa64fc9bd74d616885e3506</td>\n",
" <td>2020-04-10_12-31-00</td>\n",
" <td>1586547060</td>\n",
" <td>...</td>\n",
" <td>23030</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>2.291056</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>11_lr=0.1,momentum=0.9</td>\n",
" <td>0.10</td>\n",
" <td>0.900</td>\n",
" <td>/Users/williamchambers/ray_results/train_mnist...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.753125</td>\n",
" <td>1</td>\n",
" <td>9</td>\n",
" <td>0.515246</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>8e84b35eeb2640dfbf6eb611c818b54f</td>\n",
" <td>2020-04-10_12-30-56</td>\n",
" <td>1586547056</td>\n",
" <td>...</td>\n",
" <td>22980</td>\n",
" <td>billmp.lan</td>\n",
" <td>192.168.1.13</td>\n",
" <td>4.029720</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>1_lr=0.01,momentum=0.001</td>\n",
" <td>0.01</td>\n",
" <td>0.001</td>\n",
" <td>/Users/williamchambers/ray_results/train_mnist...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" mean_accuracy trial_id training_iteration time_this_iter_s done \\\n",
"8 0.896875 8 9 0.226021 False \n",
"2 0.890625 2 9 0.410900 False \n",
"5 0.890625 5 9 0.380373 False \n",
"11 0.868750 11 9 0.186408 False \n",
"1 0.753125 1 9 0.515246 False \n",
"\n",
" timesteps_total episodes_total experiment_id \\\n",
"8 NaN NaN 699bd7607e754533832ce576663b4253 \n",
"2 NaN NaN 1b8189b437c646de8623cb4db8dda0dc \n",
"5 NaN NaN 65e4bd8f7f3e4689be98db0a3f0005b9 \n",
"11 NaN NaN de1cfc0d7aa64fc9bd74d616885e3506 \n",
"1 NaN NaN 8e84b35eeb2640dfbf6eb611c818b54f \n",
"\n",
" date timestamp ... pid hostname node_ip \\\n",
"8 2020-04-10_12-30-59 1586547059 ... 23029 billmp.lan 192.168.1.13 \n",
"2 2020-04-10_12-30-55 1586547055 ... 22984 billmp.lan 192.168.1.13 \n",
"5 2020-04-10_12-30-57 1586547057 ... 23016 billmp.lan 192.168.1.13 \n",
"11 2020-04-10_12-31-00 1586547060 ... 23030 billmp.lan 192.168.1.13 \n",
"1 2020-04-10_12-30-56 1586547056 ... 22980 billmp.lan 192.168.1.13 \n",
"\n",
" time_since_restore timesteps_since_restore iterations_since_restore \\\n",
"8 2.499276 0 10 \n",
"2 3.811480 0 10 \n",
"5 3.682014 0 10 \n",
"11 2.291056 0 10 \n",
"1 4.029720 0 10 \n",
"\n",
" experiment_tag config/lr config/momentum \\\n",
"8 8_lr=0.1,momentum=0.1 0.10 0.100 \n",
"2 2_lr=0.1,momentum=0.001 0.10 0.001 \n",
"5 5_lr=0.1,momentum=0.01 0.10 0.010 \n",
"11 11_lr=0.1,momentum=0.9 0.10 0.900 \n",
"1 1_lr=0.01,momentum=0.001 0.01 0.001 \n",
"\n",
" logdir \n",
"8 /Users/williamchambers/ray_results/train_mnist... \n",
"2 /Users/williamchambers/ray_results/train_mnist... \n",
"5 /Users/williamchambers/ray_results/train_mnist... \n",
"11 /Users/williamchambers/ray_results/train_mnist... \n",
"1 /Users/williamchambers/ray_results/train_mnist... \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 103,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.sort_values('mean_accuracy', ascending=False).head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion\n",
"\n",
"In this example we learned how to perform distributed hyperparameter tuning with RayTune. We took a sweep that we previously had to run locally and ran it in a distributed fashion with almost no code changes. We also looked at the different `tunable` types and how to manipulate them. See [the documentation](https://ray.readthedocs.io/en/latest/tune.html) for more information."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}