@borisdayma
Created April 7, 2020 04:33
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fastai2.vision.all import *\n",
"\n",
"import wandb\n",
"from fastai2.callback.wandb import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Testing gradient accumulation\n",
"\n",
"We want to check the gradient accumulation callback by comparing following experiments:\n",
"\n",
"* batch size of 100 items, no gradient accumulation, learning rate of 1e-3\n",
"* batch size of 10, gradient accumulation until reaching 100 items, learning rate of 1e-4 (since gradients are summed and not averaged)\n",
"\n",
"We use a `fit` loop to disregard effect of having more values sampled from schedulers. Also we don't use pre-trained models to have more training to visualize.\n",
"\n",
"We expect that both experiments should lead to same results."
]
},
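{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the learning-rate reasoning concrete, the next cell is a minimal plain-PyTorch sketch of gradient accumulation (an illustration of the idea, not the fastai2 callback): each small batch's loss is backpropagated, the gradients add up in `.grad`, and the optimizer only steps once the target number of items has been seen. Because 10 per-batch gradients are summed rather than averaged, the learning rate is divided by 10 to keep the update size comparable. The model and data below are made up purely for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch of gradient accumulation in plain PyTorch (illustration only,\n",
"# not the fastai2 GradientAccumulation callback). Toy model and random data.\n",
"import torch\n",
"from torch import nn\n",
"\n",
"model = nn.Linear(10, 2)\n",
"opt = torch.optim.SGD(model.parameters(), lr=1e-2 / 10)  # lr divided by the number of accumulated batches\n",
"loss_fn = nn.CrossEntropyLoss()\n",
"\n",
"n_acc, count = 100, 0                        # accumulate until 100 items have been seen\n",
"for _ in range(20):                          # toy mini-batches of 10 items\n",
"    x, y = torch.randn(10, 10), torch.randint(0, 2, (10,))\n",
"    loss_fn(model(x), y).backward()          # gradients accumulate in .grad (no zero_grad yet)\n",
"    count += len(x)\n",
"    if count >= n_acc:                       # one optimizer step per 100 accumulated items\n",
"        opt.step()\n",
"        opt.zero_grad()\n",
"        count = 0"
]
},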
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path = untar_data(URLs.PETS)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pets = DataBlock(blocks = (ImageBlock, CategoryBlock),\n",
" get_items=get_image_files, \n",
" splitter=RandomSplitter(seed=42),\n",
" get_y=using_attr(RegexLabeller(r'(.+)_\\d+.jpg$'), 'name'),\n",
" item_tfms=Resize(460),\n",
" batch_tfms=aug_transforms(size=224, min_scale=0.75))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### No gradient accumulation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bs = 100\n",
"lr = 1e-2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls = pets.dataloaders(path/\"images\", bs=bs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"wandb.init(project='gradient_accumulation')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18, metrics=error_rate, pretrained=False, cbs=WandbCallback())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.fit(20,lr)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### With gradient accumulation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls = pets.dataloaders(path/\"images\", bs=bs//10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Start a new experiment\n",
"wandb.init(project='gradient_accumulation')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18, metrics=error_rate, pretrained=False, cbs=[WandbCallback(), GradientAccumulation(bs)])"
]
},
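{
"cell_type": "markdown",
"metadata": {},
"source": [
"For context, the next cell sketches roughly how such a callback can skip weight updates inside the training loop: it counts items after each backward pass and cancels the rest of the batch (so the optimizer step is skipped and the gradients are kept) until `n_acc` items have accumulated. This is a simplified paraphrase written for illustration, not the actual fastai2 source; it assumes `Callback`, `CancelBatchException` and `find_bs` are available from the `fastai2` star import at the top of the notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Simplified, illustrative paraphrase of an accumulation callback -- not the\n",
"# actual fastai2 implementation. Uses Callback, CancelBatchException and find_bs\n",
"# from the `fastai2.vision.all` star import above.\n",
"class SimpleGradientAccumulation(Callback):\n",
"    \"Skip the optimizer step until `n_acc` items have been seen\"\n",
"    def __init__(self, n_acc=100): self.n_acc = n_acc\n",
"    def begin_fit(self): self.count = 0\n",
"    def after_backward(self):\n",
"        self.count += find_bs(self.learn.yb)   # number of items in this mini-batch\n",
"        if self.count < self.n_acc:\n",
"            raise CancelBatchException()       # keep the gradients, skip opt.step()\n",
"        else:\n",
"            self.count = 0                     # the step happens normally, reset the counter"
]
},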
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.fit(20,lr/10)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}