@alibabadoufu
Created January 16, 2020 11:55
create_graph_and_retain_graph.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "create_graph_and_retain_graph.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyP/6cfFXh9uK6U8mZvKAM/s",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/alibabadoufu/ce6f7c64c5e43e0baa625c909fa7e1a7/create_graph_and_retain_graph.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"metadata": {
"id": "tSVPRzvsDpoD",
"colab_type": "code",
"colab": {}
},
"source": [
"import torch\n",
"import torch.nn as nn"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "3j865i2oDruq",
"colab_type": "text"
},
"source": [
"Create a simple linear function"
]
},
{
"cell_type": "code",
"metadata": {
"id": "rmz-CTF_Dt8a",
"colab_type": "code",
"colab": {}
},
"source": [
"class LinearFunction(torch.nn.Module):\n",
" def __init__(self):\n",
" super(LinearFunction, self).__init__()\n",
" self.Linear = torch.nn.Linear(1,1, bias=False)\n",
" def forward(self, input):\n",
" output = self.Linear(input)\n",
" return output"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Fhrop6AWEW1L",
"colab_type": "text"
},
"source": [
"Create a simple input, a function and then backpropagate through the y: \n",
"\n",
"\\begin{equation}\n",
"y=(wx)^5\n",
"\\end{equation}"
]
},
{
"cell_type": "code",
"metadata": {
"id": "GUBkZyQlDzno",
"colab_type": "code",
"colab": {}
},
"source": [
"LF = LinearFunction()\n",
"\n",
"x = torch.tensor([1.0], requires_grad=True)\n",
"y = LF(x) ** 5\n",
"y.backward(create_graph=False, retain_graph=True)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "E7KWz1gGFTlQ",
"colab_type": "text"
},
"source": [
"**Theoritical Solution**\n",
"\n",
"\n",
"---\n",
"\n",
"\n",
"The first derivative of y with respect to x\n",
"\\begin{equation}\n",
"\\cfrac{dy}{dx} = 5*w^5*x^4\n",
"\\end{equation}\n",
"\n",
"\n",
"The second derivative of y with respect to x\n",
"\\begin{equation}\n",
"\\cfrac{d^2y}{dx^2} = 5*4*w^5*x^3\n",
"\\end{equation}\n",
"\n",
"Our objective is to get the second directive solution"
]
},
{
"cell_type": "code",
"metadata": {
"id": "2QghXSBdEvCQ",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
},
"outputId": "8e7fcf97-1185-4922-82c2-f7706a0aabbd"
},
"source": [
"w = LF.Linear.weight[0].data.numpy()\n",
"first_derivative = 5*(w**5)\n",
"second_derivative = 5*4*(w**5)\n",
"\n",
"print(f'1st backward')\n",
"print(f'Weight: {w}')\n",
"print(f'First Derivative of input x (through autograd): {x.grad.detach().numpy()}')\n",
"print(f'First Derivative of input x (theoritical solution): {first_derivative}')"
],
"execution_count": 22,
"outputs": [
{
"output_type": "stream",
"text": [
"1st backward\n",
"Weight: [-0.7040626]\n",
"First Derivative of input x (through autograd): [-0.8650204]\n",
"First Derivative of input x (theoritical solution): [-0.8650204]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "j1NzwMhcGtTW",
"colab_type": "text"
},
"source": [
"Here we see that the first derivative solution calculated using autograd matches with our derived theoritical solution"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9hrOwHC8H583",
"colab_type": "text"
},
"source": [
"Next we calculate the derivative of y w.r.t x after second backward pass and verify that if it equals to the theoritical second derivative solution"
]
},
{
"cell_type": "code",
"metadata": {
"id": "UBIrkES3HRBM",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
},
"outputId": "18635545-f99a-4420-bce5-2621dbd74013"
},
"source": [
"y.backward()\n",
"w = LF.Linear.weight[0].data.numpy()\n",
"\n",
"print(f'2nd backward')\n",
"print(f'Derivative of y w.r.t. input x after second backward pass (through autograd): {x.grad.detach().numpy()}')\n",
"print(f'Second Derivative of input x (theoritical solution): {second_derivative}')"
],
"execution_count": 23,
"outputs": [
{
"output_type": "stream",
"text": [
"2nd backward\n",
"Weight: [-0.7040626]\n",
"Derivative of y w.r.t. input x after second backward pass (through autograd): [-1.7300408]\n",
"Second Derivative of input x (theoritical solution): [-3.4600816]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gjgyXCHoJBYn",
"colab_type": "text"
},
"source": [
"They are not equal! It verifies our previous explanation that second backward pass does not actual result in second derivative of y w.r.t x! To get the second derivative, we need to activate create_graph argument inside backward() function.\n",
"\n",
"Here I recreate the LF function and pass 'true' to create_graph argument inside backward() function. Then I deliberately execute backward() function twice."
]
},
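{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick check of the accumulation behaviour (a minimal sketch; the `_demo` names are only for illustration and are not part of the original experiment): after two backward passes without `create_graph`, `x_demo.grad` holds exactly twice the first derivative rather than the second derivative."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Minimal sketch: a second backward pass without create_graph only accumulates\n",
"# the first derivative again, so x_demo.grad ends up at 2 * dy/dx.\n",
"lf_demo = LinearFunction()\n",
"x_demo = torch.tensor([1.0], requires_grad=True)\n",
"y_demo = lf_demo(x_demo) ** 5\n",
"y_demo.backward(retain_graph=True)   # x_demo.grad == dy/dx\n",
"first_grad = x_demo.grad.clone()\n",
"y_demo.backward()                    # gradients accumulate; nothing new is differentiated\n",
"print(torch.allclose(x_demo.grad, 2 * first_grad))  # expected: True"
],
"execution_count": 0,
"outputs": []
},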
{
"cell_type": "code",
"metadata": {
"id": "IWO8iXOkJVeQ",
"colab_type": "code",
"colab": {}
},
"source": [
"LF = LinearFunction()\n",
"\n",
"x = torch.tensor([1.0], requires_grad=True)\n",
"y = LF(x) ** 5\n",
"y.backward(create_graph=True, retain_graph=True)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "d59wH7ikJ3F-",
"colab_type": "text"
},
"source": [
"To calculate second derivative w.r.t input x, we need to use autograd.grad() function. It functions similarly as .backward(), except that it allows us to calculate derivative w.r.t non-graph-leaf node."
]
},
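{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small standalone illustration of the API (a minimal sketch; the variable names t and z are only for illustration): grad() takes an output tensor and an input tensor and returns the gradient as a tuple, without touching .grad."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Minimal sketch of torch.autograd.grad on a trivial function z = t**3\n",
"t = torch.tensor([2.0], requires_grad=True)\n",
"z = t ** 3\n",
"(dz_dt,) = torch.autograd.grad(z, t)  # returns the gradient as a tuple: 3*t**2 -> 12\n",
"print(dz_dt)\n",
"print(t.grad)  # None: unlike backward(), grad() does not populate .grad"
],
"execution_count": 0,
"outputs": []
},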
{
"cell_type": "code",
"metadata": {
"id": "5zTWCug2Jl0s",
"colab_type": "code",
"colab": {}
},
"source": [
"grad_two = torch.autograd.grad(x.grad, x)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "A6Hst17jLTb6",
"colab_type": "text"
},
"source": [
"Since we have created a new linear function, we need to recalculate our theoretical second derivative solution"
]
},
{
"cell_type": "code",
"metadata": {
"id": "wQNczAMlLSLQ",
"colab_type": "code",
"colab": {}
},
"source": [
"w = LF.Linear.weight[0].data.numpy()\n",
"second_derivative = 5*4*(w**5)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "9XGbtmnKGiXF",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
},
"outputId": "001500dc-862a-4db8-9d9d-367b898f5a39"
},
"source": [
"print(f'2nd backward')\n",
"print(f'Derivative of y w.r.t. input x after second backward pass (through autograd): {grad_two[0].numpy()[0]}')\n",
"print(f'Second Derivative of input x (theoritical solution): {second_derivative}')"
],
"execution_count": 36,
"outputs": [
{
"output_type": "stream",
"text": [
"2nd backward\n",
"Derivative of y w.r.t. input x after second backward pass (through autograd): -0.016684751957654953\n",
"Second Derivative of input x (theoritical solution): [-0.01668475]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-al1D_duLiSu",
"colab_type": "text"
},
"source": [
"Finally, we have computed the second derivative using the pytorch autograd function!"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gDxQqE_rMfYK",
"colab_type": "text"
},
"source": [
"# Extra\n",
"\n",
"What if we do not activate create_graph?"
]
},
{
"cell_type": "code",
"metadata": {
"id": "gCDEFUlzMkGC",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 317
},
"outputId": "45c4d74a-c762-41d2-f838-d479f5e90f15"
},
"source": [
"LF = LinearFunction()\n",
"\n",
"x = torch.tensor([1.0], requires_grad=True)\n",
"y = LF(x) ** 5\n",
"y.backward(create_graph=False, retain_graph=True)\n",
"grad_two = torch.autograd.grad(x.grad, x)"
],
"execution_count": 38,
"outputs": [
{
"output_type": "error",
"ename": "RuntimeError",
"evalue": "ignored",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-38-dca1102ddf31>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mLF\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m**\u001b[0m \u001b[0;36m5\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbackward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcreate_graph\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mretain_graph\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mgrad_two\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mautograd\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgrad\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgrad\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py\u001b[0m in \u001b[0;36mgrad\u001b[0;34m(outputs, inputs, grad_outputs, retain_graph, create_graph, only_inputs, allow_unused)\u001b[0m\n\u001b[1;32m 155\u001b[0m return Variable._execution_engine.run_backward(\n\u001b[1;32m 156\u001b[0m \u001b[0moutputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgrad_outputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mretain_graph\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 157\u001b[0;31m inputs, allow_unused)\n\u001b[0m\u001b[1;32m 158\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 159\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mRuntimeError\u001b[0m: element 0 of tensors does not require grad and does not have a grad_fn"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "T3Aq24DJMryU",
"colab_type": "text"
},
"source": [
"Here you can see that if we do not activate create_graph, a message would pop up and warn us that x.grad does not require grad. In another word, we do not have x.grad inside the computation graph and hence, we can't do backward pass through this function. This would hinder us from finding the second derivative of y w.r.t to the x since we need to differentiate the x.grad to get the solution!"
]
}
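,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a closing aside (a sketch, not part of the experiment above; the names lf2, x2, y2 are only illustrative): the same second derivative can be computed without calling backward() at all, by requesting the first derivative from torch.autograd.grad() with create_graph=True and then differentiating that result again."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Sketch: computing d2y/dx2 purely with torch.autograd.grad\n",
"lf2 = LinearFunction()\n",
"x2 = torch.tensor([1.0], requires_grad=True)\n",
"y2 = lf2(x2) ** 5\n",
"(dy_dx,) = torch.autograd.grad(y2, x2, create_graph=True)  # first derivative, kept in the graph\n",
"(d2y_dx2,) = torch.autograd.grad(dy_dx, x2)                # second derivative\n",
"w2 = lf2.Linear.weight[0].data.numpy()\n",
"print(d2y_dx2.detach().numpy(), 5 * 4 * (w2 ** 5))  # these should match"
],
"execution_count": 0,
"outputs": []
}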
]
}