{"nbformat":4,"nbformat_minor":0,"metadata":{"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.5-final"},"orig_nbformat":2,"kernelspec":{"name":"python3","display_name":"Python 3.8.5 64-bit ('base': conda)","metadata":{"interpreter":{"hash":"ff00dac108179d3c1c82d8e6cb9ecf4e9dc3d3f409cd941b5c5be41e7679c1ce"}}},"colab":{"name":"IITM_RL_VI_v1.ipynb","provenance":[{"file_id":"1oQJwhRpgZ5NnXfmhDKvGSvTdYBur6AqA","timestamp":1614727287410},{"file_id":"13dsPOr75r8LJRTgZrSEpaY-LCPl2F4F-","timestamp":1614674636514}],"collapsed_sections":[],"toc_visible":true}},"cells":[{"cell_type":"markdown","metadata":{"id":"xMOtbrHSzC1r"},"source":["<div style=\"text-align: center\">\n"," <a href=\"https://www.aicrowd.com/challenges/rl-vi\"><img alt=\"AIcrowd\" src=\"https://images.aicrowd.com/raw_images/challenges/banner_file/754/3fc6598e270b9219e215.jpg\"></a>\n","</div>"]},{"cell_type":"markdown","metadata":{"id":"WXuhHGatRVNf"},"source":["# What is the notebook about?\n","\n","## Problem - Value Iteration\n","This problem deals with a grid world and stochastic actions. The tasks you have to do are:\n","- Complete the Environment\n","- Write code for value Iteration\n","- Visualize Results\n","- Explain the results\n","\n","## How to use this notebook? ๐Ÿ“\n","\n","- This is a shared template and any edits you make here will not be saved.**You\n","should make a copy in your own drive**. Click the \"File\" menu (top-left), then \"Save a Copy in Drive\". You will be working in your copy however you like.\n","\n","- **Update the config parameters**. You can define the common variables here\n","\n","Variable | Description\n","--- | ---\n","`AICROWD_DATASET_PATH` | Path to the file containing test data. 
This should be an absolute path.\n","`AICROWD_RESULTS_DIR` | Path to write the output to.\n","`AICROWD_ASSETS_DIR` | In case your notebook needs additional files (like model weights, etc.,), you can add them to a directory and specify the path to the directory here (please specify relative path). The contents of this directory will be sent to AIcrowd for evaluation.\n","`AICROWD_API_KEY` | In order to submit your code to AIcrowd, you need to provide your account's API key. This key is available at https://www.aicrowd.com/participants/me\n","\n","- **Installing packages**. Please use the [Install packages ๐Ÿ—ƒ](#install-packages-) section to install the packages"]},{"cell_type":"markdown","metadata":{"id":"1gzlHU3wRQRj"},"source":["# Setup AIcrowd Utilities ๐Ÿ› \n","\n","We use this to bundle the files for submission and create a submission on AIcrowd. Do not edit this block."]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"FMKvhbEKiFH5","executionInfo":{"status":"ok","timestamp":1614732144741,"user_tz":-330,"elapsed":9937,"user":{"displayName":"Ayush Shivani","photoUrl":"","userId":"00065643036033516558"}},"outputId":"695b85d8-bacc-4d76-e0d1-eb57438058be"},"source":["!pip install -U git+https://gitlab.aicrowd.com/aicrowd/aicrowd-cli.git@notebook-submission-v2 > /dev/null "],"execution_count":null,"outputs":[{"output_type":"stream","text":[" Running command git clone -q https://gitlab.aicrowd.com/aicrowd/aicrowd-cli.git /tmp/pip-req-build-f5ev461k\n"," Running command git checkout -b notebook-submission-v2 --track origin/notebook-submission-v2\n"," Switched to a new branch 'notebook-submission-v2'\n"," Branch 'notebook-submission-v2' set up to track remote branch 'notebook-submission-v2' from 'origin'.\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"_CxF1F4AiHgp"},"source":["%load_ext aicrowd.magic "],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"oJuQ_nzaRpp6"},"source":["# AIcrowd 
Runtime Configuration ๐Ÿงท\n","\n","Define configuration parameters."]},{"cell_type":"code","metadata":{"id":"uBKm1IuliJc7"},"source":["import os\n","\n","AICROWD_DATASET_PATH = os.getenv(\"DATASET_PATH\", os.getcwd()+\"/9db39385-0a4b-47db-8d20-fffb0480e47e_hw2_q2.zip\")\n","AICROWD_RESULTS_DIR = os.getenv(\"OUTPUTS_DIR\", \"results\")\n","API_KEY = \"\" # Get your key from https://www.aicrowd.com/participants/me (ctrl + click the link)"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"RQnfZ7VYidPC","executionInfo":{"status":"ok","timestamp":1614732192787,"user_tz":-330,"elapsed":6014,"user":{"displayName":"Ayush Shivani","photoUrl":"","userId":"00065643036033516558"}},"outputId":"1394253f-73e1-4e8f-a5e3-efd21b6cea19"},"source":["!aicrowd login --api-key $API_KEY\n","!aicrowd dataset download -c rl-vi"],"execution_count":null,"outputs":[{"output_type":"stream","text":["\u001b[32mAPI Key valid\u001b[0m\n","\u001b[32mSaved API Key successfully!\u001b[0m\n","9db39385-0a4b-47db-8d20-fffb0480e47e_hw2_q2.zip: 100% 3.08k/3.08k [00:00<00:00, 158kB/s]\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"RHqevxIX_4fo","executionInfo":{"status":"ok","timestamp":1614732196591,"user_tz":-330,"elapsed":1101,"user":{"displayName":"Ayush Shivani","photoUrl":"","userId":"00065643036033516558"}},"outputId":"af453ad3-c418-4dcc-bfb7-9ef4cc3a01d7"},"source":["!unzip $AICROWD_DATASET_PATH"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Archive: /content/9db39385-0a4b-47db-8d20-fffb0480e47e_hw2_q2.zip\n"," creating: hw2_q2/\n"," creating: hw2_q2/targets/\n"," inflating: hw2_q2/targets/targets_0.npy \n"," creating: hw2_q2/sample_results/\n"," inflating: hw2_q2/sample_results/sample_results_0.npy \n"," creating: hw2_q2/inputs/\n"," inflating: hw2_q2/inputs/env_params_0.npy 
\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"gm46inU6ABu-"},"source":["DATASET_DIR = 'hw2_q2/'\n","!mkdir {DATASET_DIR}results/"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"gTeWFlWukTob"},"source":["# Install packages 🗃\n","\n","Please add all package installations in this section"]},{"cell_type":"code","metadata":{"id":"cusnfjYFRVNm"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"TxdOqsVGRXKg"},"source":["# Import packages 💻"]},{"cell_type":"code","metadata":{"id":"lmhJUSs2Rjvh"},"source":["import numpy as np\n","import matplotlib.pyplot as plt \n","import os\n","# ADD ANY IMPORTS YOU WANT HERE\n"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"XP-1yA7URVNn"},"source":["# Task 0 - Complete the environment\n","You need to complete part of the environment which calculates the possible next states, their probabilities, and the reward."]},{"cell_type":"code","metadata":{"id":"GumM6eyWRVNn"},"source":["class GridEnv_HW2:\n"," def __init__(self, \n"," goal_location, \n"," action_stochasticity,\n"," non_terminal_reward,\n"," terminal_reward,\n"," grey_in,\n"," brown_in,\n"," grey_out,\n"," brown_out\n"," ):\n","\n"," # Do not edit this section \n"," self.action_stochasticity = action_stochasticity\n"," self.non_terminal_reward = non_terminal_reward\n"," self.terminal_reward = terminal_reward\n"," self.grid_size = [10, 10]\n","\n"," # Index of the actions \n"," self.actions = {'N': (-1, 0), \n"," 'E': (0,1), \n"," 'S': (1,0), \n"," 'W': (0,-1)}\n"," \n"," self.perpendicular_order = ['N', 'E', 'S', 'W']\n"," \n"," l = ['normal' for _ in range(self.grid_size[0]) ]\n"," self.grid = np.array([l for _ in range(self.grid_size[1]) ], dtype=object)\n","\n"," self.grid[goal_location[0], goal_location[1]] = 'goal'\n"," self.goal_location = goal_location\n","\n"," for gi in grey_in:\n"," self.grid[gi[0],gi[1]] = 'grey_in'\n"," for bi in brown_in: 
\n"," self.grid[bi[0], bi[1]] = 'brown_in'\n","\n"," self.grey_out = go = grey_out\n"," self.brown_out = bo = brown_out\n","\n"," self.grid[go[0], go[1]] = 'grey_out'\n"," self.grid[bo[0], bo[1]] = 'brown_out'\n"," \n"," self.states_sanity_check()\n"," \n"," def states_sanity_check(self):\n"," \"\"\" Implement to prevent cases where the goal gets overwritten etc \"\"\"\n"," pass\n","\n"," def visualize_grid(self):\n"," pass\n","\n"," def _out_of_grid(self, state):\n"," if state[0] < 0 or state[1] < 0:\n"," return True\n"," elif state[0] > self.grid_size[0] - 1:\n"," return True\n"," elif state[1] > self.grid_size[1] - 1:\n"," return True\n"," else:\n"," return False\n","\n"," def _grid_state(self, state):\n"," return self.grid[state[0], state[1]] \n"," \n"," def get_transition_probabilites_and_reward(self, state, action):\n"," \"\"\" \n"," Returns the probability of all possible transitions for the given action in the form:\n"," A list of tuples of (next_state, probability, reward)\n"," Note that, depending on the state and action, there can be several different next states\n"," The probabilities of all next states should add up to 1\n"," \"\"\"\n","\n"," grid_state = self._grid_state(state)\n"," \n"," if grid_state == 'goal':\n"," return [(self.goal_location, 1.0, 0.0)]\n"," elif grid_state == 'grey_in':\n"," return [(self.grey_out, 1.0, self.non_terminal_reward)]\n"," elif grid_state == 'brown_in':\n"," return [(self.brown_out, 1.0, self.non_terminal_reward)]\n"," \n"," direction = self.actions.get(action, None)\n"," if direction is None:\n"," raise ValueError(\"Invalid action %s, please select among %s\" % (action, list(self.actions.keys())))\n","\n"," nextstates_prob_rews = []\n","\n"," # TASK 0 - Complete the environment\n","\n"," # ADD YOUR CODE BELOW - DO NOT EDIT ABOVE THIS LINE\n","\n"," # Hints: \n"," # Get access to all actions with self.actions\n"," # Use self.action_stochasticity for the probabilities of the other actions\n"," # The array 
will have probabilities for [0, 90, 180, -90] degrees\n"," # So self.action_stochasticity = [0.8, 0.1, 0.0, 0.1] means 0.8 for forward and 0.1 for left and right\n"," # Remember that you need to return a list of tuples with the form (next_state, probability, reward)\n"," # If you have 3 possible next states, you should return [(ns1, p1, r1), (ns2, p2, r2), (ns3, p3, r3)]\n"," # Use the helper function self._out_of_grid to check if any state is outside the grid\n"," \n"," # Important Note:\n"," # Do not hard-code any state locations; they may be changed in the submissions\n","\n","\n"," # DO NOT EDIT BELOW THIS LINE\n","\n"," return nextstates_prob_rews"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"vVL7B0cjSZyy"},"source":["# Question - When do you decide to stop value iteration?\n","Modify this cell and add your answer"]},{"cell_type":"markdown","metadata":{"id":"AD3Jgj2JRVNp"},"source":["# Task 1"]},{"cell_type":"markdown","metadata":{"id":"GIDOS51MWPU9"},"source":["## a) Implement Value iteration"]},{"cell_type":"code","metadata":{"tags":[],"id":"45p2j33NRVNp"},"source":["def value_iter(env):\n"," value_grid = np.zeros((10, 10)) # Marked as J(s) in the homework pdf\n"," policy = np.zeros((10, 10), np.int32) # Marked as pi(s) in homework pdf\n","\n"," value_grids = [] # Store all the J(s) grids at every iteration in this list\n"," policies = [] # Store all the pi(s) grids at every iteration in this list\n","\n"," # ADD YOUR CODE BELOW - DO NOT EDIT ABOVE THIS LINE\n"," \n","\n"," # DO NOT EDIT BELOW THIS LINE\n"," results = {\"value_grid\": value_grid, \"pi_s\": policy}\n"," return results, value_grids, policies"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"mus58g6PCx0I"},"source":["## Here is an example of what the \"results\" output from value_iter function should look like\n","\n","Of course, it won't be all zeros\n","\n","``` python \n","{'value_grid': array([[0., 0., 0., 0., 0., 0., 0., 0., 
0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n"," [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]),\n"," 'pi_s': array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=int32)}\n","```"]},{"cell_type":"code","metadata":{"id":"vLF4WYcrAx1r","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1614732208718,"user_tz":-330,"elapsed":987,"user":{"displayName":"Ayush Shivani","photoUrl":"","userId":"00065643036033516558"}},"outputId":"9fc2c5e0-eae0-4bbc-965b-7a136220f1bd"},"source":["# DO NOT EDIT THIS CELL, DURING EVALUATION THE DATASET DIR WILL CHANGE\n","!mkdir $AICROWD_RESULTS_DIR\n","input_dir = os.path.join(DATASET_DIR, 'inputs')\n","for params_file in os.listdir(input_dir):\n"," kwargs = np.load(os.path.join(input_dir, params_file), allow_pickle=True).item()\n"," env = GridEnv_HW2(**kwargs)\n"," results, value_grids, policies = value_iter(env)\n"," idx = params_file.split('_')[-1][:-4]\n"," np.save(os.path.join(AICROWD_RESULTS_DIR, 'results_' + idx), results)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["mkdir: cannot create directory ‘results’: File exists\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"S8IVf3N9VuNZ"},"source":["# Task 2 - The value iteration loop goes to infinity (refer to the pseudocode given above), so when would you stop 
your value iteration?"]},{"cell_type":"markdown","metadata":{"id":"ahKp_WLoYFN_"},"source":["Modify this cell and add you answer"]},{"cell_type":"markdown","metadata":{"id":"K2r-zLDhRVNp"},"source":["# Task 3- Plot graph of $||J_{i+1}(s) - J_i(s)||$\n","An example plot code is provided below, but you can change it if you want"]},{"cell_type":"code","metadata":{"id":"ZFXPt6m6RVNq","colab":{"base_uri":"https://localhost:8080/","height":282},"executionInfo":{"status":"ok","timestamp":1614732211571,"user_tz":-330,"elapsed":1205,"user":{"displayName":"Ayush Shivani","photoUrl":"","userId":"00065643036033516558"}},"outputId":"a9aa159f-d5e0-47a1-aee5-58328c7441e0"},"source":["import matplotlib.pyplot as plt\n","diffs = []\n","for ii in range(len(value_grids)-1):\n"," diff = np.linalg.norm(value_grids[ii+1] - value_grids[ii]) \n"," diffs.append(diff)\n","plt.plot(diffs)"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["[<matplotlib.lines.Line2D at 0x7f399f020510>]"]},"metadata":{"tags":[]},"execution_count":19},{"output_type":"display_data","data":{"image/png":"iVBORw0KGgoAAAANSUhEUgAAAYIAAAD4CAYAAADhNOGaAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAOpklEQVR4nO3cf6jd9X3H8eeruTRrEUyi8UeN2bVVGHGDFg5K2QauaoyDNtL6h90fDVtL/lj9Y5VCUxzT2v6hbp2ltNsIbSEIa3SO0kApEm2FMYb1xDrarE1zjS0mVZuaIDipkvW9P+7X7Xg5Mffec+49OX6eDzjc8/1+P/fe98cLeeac742pKiRJ7XrbpAeQJE2WIZCkxhkCSWqcIZCkxhkCSWrczKQHWI7zzz+/ZmdnJz2GJE2VAwcO/LqqNi48P5UhmJ2dpd/vT3oMSZoqSX4x7LxvDUlS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS48YSgiTbkhxKMpdk15Dra5M80F1/PMnsguubk7yc5NPjmEeStHgjhyDJGuCrwI3AFuCjSbYsWPZx4GRVXQ7cB9yz4PrfA98ddRZJ0tKN4xXBVcBcVR2pqteAvcD2BWu2A3u65w8B1yYJQJKbgGeAg2OYRZK0ROMIwSXAswPHR7tzQ9dU1SngJeC8JOcAnwE+d6ZvkmRnkn6S/vHjx8cwtiQJJn+z+E7gvqp6+UwLq2p3VfWqqrdx48aVn0ySGjEzhq9xDLh04HhTd
27YmqNJZoBzgReBq4Gbk9wLrAN+m+Q3VfWVMcwlSVqEcYTgCeCKJJcx/wf+LcCfLVizD9gB/AdwM/C9qirgj19fkORO4GUjIEmra+QQVNWpJLcCDwNrgG9U1cEkdwH9qtoHfB24P8kccIL5WEiSzgKZ/4v5dOn1etXv9yc9hiRNlSQHqqq38PykbxZLkibMEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS4wyBJDXOEEhS48YSgiTbkhxKMpdk15Dra5M80F1/PMlsd/76JAeS/Kj7+IFxzCNJWryRQ5BkDfBV4EZgC/DRJFsWLPs4cLKqLgfuA+7pzv8a+GBV/QGwA7h/1HkkSUszjlcEVwFzVXWkql4D9gLbF6zZDuzpnj8EXJskVfXDqvpld/4g8I4ka8cwkyRpkcYRgkuAZweOj3bnhq6pqlPAS8B5C9Z8BHiyql4dw0ySpEWamfQAAEmuZP7toq1vsmYnsBNg8+bNqzSZJL31jeMVwTHg0oHjTd25oWuSzADnAi92x5uAbwEfq6qnT/dNqmp3VfWqqrdx48YxjC1JgvGE4AngiiSXJXk7cAuwb8GafczfDAa4GfheVVWSdcB3gF1V9e9jmEWStEQjh6B7z/9W4GHgJ8CDVXUwyV1JPtQt+zpwXpI54Dbg9V8xvRW4HPibJE91jwtGnUmStHipqknPsGS9Xq/6/f6kx5CkqZLkQFX1Fp73XxZLUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuMMgSQ1zhBIUuPGEoIk25IcSjKXZNeQ62uTPNBdfzzJ7MC1z3bnDyW5YRzzSJIWb+QQJFkDfBW4EdgCfDTJlgXLPg6crKrLgfuAe7rP3QLcAlwJbAP+oft6kqRVMo5XBFcBc1V1pKpeA/YC2xes2Q7s6Z4/BFybJN35vVX1alU9A8x1X0+StErGEYJLgGcHjo9254auqapTwEvAeYv8XACS7EzST9I/fvz4GMaWJMEU3Syuqt1V1auq3saNGyc9jiS9ZYwjBMeASweON3Xnhq5JMgOcC7y4yM+VJK2gcYTgCeCKJJcleTvzN3/3LVizD9jRPb8Z+F5VVXf+lu63ii4DrgB+MIaZJEmLNDPqF6iqU0luBR4G1gDfqKqDSe4C+lW1D/g6cH+SOeAE87GgW/cg8F/AKeCTVfU/o84kSVq8zP/FfLr0er3q9/uTHkOSpkqSA1XVW3h+am4WS5JWhiGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMaNFIIkG5LsT3K4+7j+NOt2dGsOJ9nRnXtnku8k+WmSg0nuHmUWSdLyjPqKYBfwaFVdATzaHb9Bkg3AHcDVwFXAHQPB+Luq+j3gfcAfJrlxxHkkSUs0agi2A3u653uAm4asuQHYX1UnquoksB/YVlWvVNX3AarqNeBJYNOI80iSlmjUEFxYVc91z58HLhyy5hLg2YHjo925/5NkHfBB5l9VSJJW0cyZFiR5BLhoyKXbBw+qqpLUUgdIMgN8E/hyVR15k3U7gZ0AmzdvXuq3kSSdxhlDUFXXne5akheSXFxVzyW5GPjVkGXHgGsGjjcBjw0c7wYOV9WXzjDH7m4tv
V5vycGRJA036ltD+4Ad3fMdwLeHrHkY2JpkfXeTeGt3jiRfAM4F/mrEOSRJyzRqCO4Grk9yGLiuOyZJL8nXAKrqBPB54InucVdVnUiyifm3l7YATyZ5KsknRpxHkrREqZq+d1l6vV71+/1JjyFJUyXJgarqLTzvvyyWpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMYZAklqnCGQpMaNFIIkG5LsT3K4+7j+NOt2dGsOJ9kx5Pq+JD8eZRZJ0vKM+opgF/BoVV0BPNodv0GSDcAdwNXAVcAdg8FI8mHg5RHnkCQt06gh2A7s6Z7vAW4asuYGYH9Vnaiqk8B+YBtAknOA24AvjDiHJGmZRg3BhVX1XPf8eeDCIWsuAZ4dOD7anQP4PPBF4JUzfaMkO5P0k/SPHz8+wsiSpEEzZ1qQ5BHgoiGXbh88qKpKUov9xkneC7ynqj6VZPZM66tqN7AboNfrLfr7SJLe3BlDUFXXne5akheSXFxVzyW5GPjVkGXHgGsGjjcBjwHvB3pJft7NcUGSx6rqGiRJq2bUt4b2Aa//FtAO4NtD1jwMbE2yvrtJvBV4uKr+sareVVWzwB8BPzMCkrT6Rg3B3cD1SQ4D13XHJOkl+RpAVZ1g/l7AE93jru6cJOkskKrpe7u91+tVv9+f9BiSNFWSHKiq3sLz/stiSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxhkCSWqcIZCkxqWqJj3DkiU5Dvxi0nMs0fnAryc9xCpzz21wz9Pjd6tq48KTUxmCaZSkX1W9Sc+xmtxzG9zz9POtIUlqnCGQpMYZgtWze9IDTIB7boN7nnLeI5CkxvmKQJIaZwgkqXGGYIySbEiyP8nh7uP606zb0a05nGTHkOv7kvx45Sce3Sh7TvLOJN9J8tMkB5PcvbrTL02SbUkOJZlLsmvI9bVJHuiuP55kduDaZ7vzh5LcsJpzj2K5e05yfZIDSX7UffzAas++HKP8jLvrm5O8nOTTqzXzWFSVjzE9gHuBXd3zXcA9Q9ZsAI50H9d3z9cPXP8w8M/Ajye9n5XeM/BO4E+6NW8H/g24cdJ7Os0+1wBPA+/uZv1PYMuCNX8J/FP3/Bbgge75lm79WuCy7uusmfSeVnjP7wPe1T3/feDYpPezkvsduP4Q8C/Apye9n6U8fEUwXtuBPd3zPcBNQ9bcAOyvqhNVdRLYD2wDSHIOcBvwhVWYdVyWveeqeqWqvg9QVa8BTwKbVmHm5bgKmKuqI92se5nf+6DB/xYPAdcmSXd+b1W9WlXPAHPd1zvbLXvPVfXDqvpld/4g8I4ka1dl6uUb5WdMkpuAZ5jf71QxBON1YVU91z1/HrhwyJpLgGcHjo925wA+D3wReGXFJhy/UfcMQJJ1wAeBR1diyDE44x4G11TVKeAl4LxFfu7ZaJQ9D/oI8GRVvbpCc47Lsvfb/SXuM8DnVmHOsZuZ9ADTJskjwEVDLt0+eFBVlWTRv5ub5L3Ae6rqUwvfd5y0ldrzwNefAb4JfLmqjixvSp2NklwJ3ANsnfQsK+xO4L6qerl7gTBVDMESVdV1p7uW5IUkF1fVc0kuBn41ZNkx4JqB403AY8D7gV6SnzP/c7kgyWNVdQ0TtoJ7ft1u4HBVfWkM466UY8ClA8ebunPD1hzt4nYu8OIiP/dsNMqeSbIJ+Bbwsap6euXHHdko+70auDnJvcA64LdJflNVX1n5scdg0jcp3koP4G95443Te4es2cD8+4jru8czwIYFa2aZnpvFI
+2Z+fsh/wq8bdJ7OcM+Z5i/yX0Z/38j8coFaz7JG28kPtg9v5I33iw+wnTcLB5lz+u69R+e9D5WY78L1tzJlN0snvgAb6UH8++NPgocBh4Z+MOuB3xtYN1fMH/DcA748yFfZ5pCsOw9M/83rgJ+AjzVPT4x6T29yV7/FPgZ879Zcnt37i7gQ93z32H+N0bmgB8A7x743Nu7zzvEWfqbUePcM/DXwH8P/FyfAi6Y9H5W8mc88DWmLgT+LyYkqXH+1pAkNc4QSFLjDIEkNc4QSFLjDIEkNc4QSFLjDIEkNe5/AecL/ch2b2HBAAAAAElFTkSuQmCC\n","text/plain":["<Figure size 432x288 with 1 Axes>"]},"metadata":{"tags":[],"needs_background":"light"}}]},{"cell_type":"markdown","metadata":{"id":"tsEaK9L4RoZd"},"source":["# Task 4 - Show $J(s)$ and $pi(s)$ after 10, 25 and final iteration. "]},{"cell_type":"code","metadata":{"id":"z0jPnRLvR-DF"},"source":["# Use any visualization code you want to show the value and policies"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"NaE7G1FOSLhD"},"source":["# Task 5 - Consider a new gridworld (GridWorld-2) as shown Figure 2 (GridWorld-2 differ from GridWorld-1 only in the position of the โ€œGoalโ€ state). Compare and contrast the behavior of J and greedy policy ฯ€ for GridWorld-1 and GridWorld-2\n"]},{"cell_type":"markdown","metadata":{"id":"Cyik1NmZYC5s"},"source":["Modify this cell and add you answer"]},{"cell_type":"markdown","metadata":{"id":"8pNWylwjjCT1"},"source":["# Submit to AIcrowd ๐Ÿš€"]},{"cell_type":"code","metadata":{"id":"AcCElHZnjDbI","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1614732272020,"user_tz":-330,"elapsed":58803,"user":{"displayName":"Ayush Shivani","photoUrl":"","userId":"00065643036033516558"}},"outputId":"e0a0a258-27fe-404d-cfbb-d78dfb2d87d4"},"source":["!DATASET_PATH=$AICROWD_DATASET_PATH aicrowd notebook submit -c rl-vi -a assets"],"execution_count":null,"outputs":[{"output_type":"stream","text":["WARNING: No assets directory at assets... 
Creating one...\n","Using notebook: /content/IITM_RL_VI_v1.ipynb for submission...\n","\u001b[1;34mMounting Google Drive ๐Ÿ’พ\u001b[0m\n","Your Google Drive will be mounted to access the colab notebook\n","Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.activity.readonly&response_type=code\n","\n","Enter your authorization code:\n","4/1AY0e-g4F0zgdg82Y_lOR210cOIPyWtCOKWE3l7pRpdkzUm7Ana8FiFdZ0Pg\n","Mounted at /content/drive\n","Scrubbing API keys from the notebook...\n","Collecting notebook...\n","Validating the submission...\n","Executing install.ipynb...\n","[NbConvertApp] Converting notebook /content/submission/install.ipynb to notebook\n","[NbConvertApp] Executing notebook with kernel: python3\n","[NbConvertApp] Writing 1491 bytes to /content/submission/install.nbconvert.ipynb\n","Executing predict.ipynb...\n","[NbConvertApp] Converting notebook /content/submission/predict.ipynb to notebook\n","[NbConvertApp] Executing notebook with kernel: python3\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR 
| unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] ERROR | unhandled iopub msg: colab_request\n","[NbConvertApp] Writing 25658 bytes to /content/submission/predict.nbconvert.ipynb\n","\u001b[2K\u001b[1;34msubmission.zip\u001b[0m \u001b[90mโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”\u001b[0m \u001b[35m100.0%\u001b[0m โ€ข \u001b[32m30.7/29.0 KB\u001b[0m โ€ข \u001b[31m2.5 MB/s\u001b[0m โ€ข \u001b[36m0:00:00\u001b[0m\n","\u001b[?25h โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ \n"," โ”‚ \u001b[1mSuccessfully submitted!\u001b[0m โ”‚ \n"," โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ \n","\u001b[3m Important links \u001b[0m\n","โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”\n","โ”‚ This submission โ”‚ https://www.aicrowd.com/challenges/rliitm-1/submissions/124371 โ”‚\n","โ”‚ โ”‚ โ”‚\n","โ”‚ All submissions โ”‚ https://www.aicrowd.com/challenges/rliitm-1/submissions?my_submissions=true โ”‚\n","โ”‚ โ”‚ โ”‚\n","โ”‚ Leaderboard โ”‚ https://www.aicrowd.com/challenges/rliitm-1/leaderboards โ”‚\n","โ”‚ โ”‚ โ”‚\n","โ”‚ Discussion forum โ”‚ https://discourse.aicrowd.com/c/rliitm-1 โ”‚\n","โ”‚ โ”‚ โ”‚\n","โ”‚ Challenge page โ”‚ https://www.aicrowd.com/challenges/rliitm-1 
โ”‚\n","โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"AQIQi6-bMH8Q"},"source":[""],"execution_count":null,"outputs":[]}]}