@borisdayma
Created April 7, 2020 04:33
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fastai2.vision.all import *\n",
"\n",
"import wandb\n",
"from fastai2.callback.wandb import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Testing gradient accumulation\n",
"\n",
"We want to check the gradient accumulation callback by comparing following experiments:\n",
"\n",
"* batch size of 100 items, no gradient accumulation, learning rate of 1e-3\n",
"* batch size of 10, gradient accumulation until reaching 100 items, learning rate of 1e-4 (since gradients are summed and not averaged)\n",
"\n",
"We use a `fit` loop to disregard effect of having more values sampled from schedulers. Also we don't use pre-trained models to have more training to visualize.\n",
"\n",
"We expect that both experiments should lead to same results."
]
},
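{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the learning-rate reasoning concrete, the next cell is a minimal plain-PyTorch sketch of gradient accumulation (an illustration of the idea, not the fastai2 callback): each small batch's loss is backpropagated, the gradients add up in `.grad`, and the optimizer only steps once the target number of items has been seen. Because 10 per-batch gradients are summed rather than averaged, the learning rate is divided by 10 to keep the update size comparable. The model and data below are made up purely for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch of gradient accumulation in plain PyTorch (illustration only,\n",
"# not the fastai2 GradientAccumulation callback). Toy model and random data.\n",
"import torch\n",
"from torch import nn\n",
"\n",
"model = nn.Linear(10, 2)\n",
"opt = torch.optim.SGD(model.parameters(), lr=1e-2 / 10)  # lr divided by the number of accumulated batches\n",
"loss_fn = nn.CrossEntropyLoss()\n",
"\n",
"n_acc, count = 100, 0                        # accumulate until 100 items have been seen\n",
"for _ in range(20):                          # toy mini-batches of 10 items\n",
"    x, y = torch.randn(10, 10), torch.randint(0, 2, (10,))\n",
"    loss_fn(model(x), y).backward()          # gradients accumulate in .grad (no zero_grad yet)\n",
"    count += len(x)\n",
"    if count >= n_acc:                       # one optimizer step per 100 accumulated items\n",
"        opt.step()\n",
"        opt.zero_grad()\n",
"        count = 0"
]
},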
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path = untar_data(URLs.PETS)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pets = DataBlock(blocks = (ImageBlock, CategoryBlock),\n",
" get_items=get_image_files, \n",
" splitter=RandomSplitter(seed=42),\n",
" get_y=using_attr(RegexLabeller(r'(.+)_\\d+.jpg$'), 'name'),\n",
" item_tfms=Resize(460),\n",
" batch_tfms=aug_transforms(size=224, min_scale=0.75))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### No gradient accumulation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bs = 100\n",
"lr = 1e-2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls = pets.dataloaders(path/\"images\", bs=bs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"wandb.init(project='gradient_accumulation')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18, metrics=error_rate, pretrained=False, cbs=WandbCallback())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.fit(20,lr)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### With gradient accumulation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls = pets.dataloaders(path/\"images\", bs=bs//10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Start a new experiment\n",
"wandb.init(project='gradient_accumulation')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18, metrics=error_rate, pretrained=False, cbs=[WandbCallback(), GradientAccumulation(bs)])"
]
},
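{
"cell_type": "markdown",
"metadata": {},
"source": [
"For context, the next cell sketches roughly how such a callback can skip weight updates inside the training loop: it counts items after each backward pass and cancels the rest of the batch (so the optimizer step is skipped and the gradients are kept) until `n_acc` items have accumulated. This is a simplified paraphrase written for illustration, not the actual fastai2 source; it assumes `Callback`, `CancelBatchException` and `find_bs` are available from the `fastai2` star import at the top of the notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Simplified, illustrative paraphrase of an accumulation callback -- not the\n",
"# actual fastai2 implementation. Uses Callback, CancelBatchException and find_bs\n",
"# from the `fastai2.vision.all` star import above.\n",
"class SimpleGradientAccumulation(Callback):\n",
"    \"Skip the optimizer step until `n_acc` items have been seen\"\n",
"    def __init__(self, n_acc=100): self.n_acc = n_acc\n",
"    def begin_fit(self): self.count = 0\n",
"    def after_backward(self):\n",
"        self.count += find_bs(self.learn.yb)   # number of items in this mini-batch\n",
"        if self.count < self.n_acc:\n",
"            raise CancelBatchException()       # keep the gradients, skip opt.step()\n",
"        else:\n",
"            self.count = 0                     # the step happens normally, reset the counter"
]
},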
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.fit(20,lr/10)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}