@johnleung8888
Created January 17, 2023 18:11
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "456971b7c32e2bf5364ff3e844755588",
"grade": false,
"grade_id": "cell-2379d0e980554734",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"# Assignment: Dyna-Q and Dyna-Q+"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "76de530741f980cceea89c1cbca751b3",
"grade": false,
"grade_id": "cell-e4a73a1d4819583b",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Welcome to this programming assignment! In this notebook, you will:\n",
"1. implement the Dyna-Q and Dyna-Q+ algorithms. \n",
"2. compare their performance on an environment which changes to become 'better' than it was before, that is, the task becomes easier. \n",
"\n",
"We will give you the environment and infrastructure to run the experiment and visualize the performance. The assignment will be graded automatically by comparing the behavior of your agent to our implementations of the algorithms. The random seed will be set explicitly to avoid different behaviors due to randomness. \n",
"\n",
"Please go through the cells in order. "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "b5700a0fc8aa27a9871262534a74584d",
"grade": false,
"grade_id": "cell-fc7a8bce812462f8",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"## The Shortcut Maze Environment\n",
"\n",
"In this maze environment, the goal is to reach the goal state (G) as fast as possible from the starting state (S). There are four actions – up, down, right, left – which take the agent deterministically from a state to the corresponding neighboring states, except when movement is blocked by a wall (denoted by grey) or the edge of the maze, in which case the agent remains where it is. The reward is +1 on reaching the goal state, 0 otherwise. On reaching the goal state G, the agent returns to the start state S to begin a new episode. This is a discounted, episodic task with $\\gamma = 0.95$.\n",
"\n",
"<img src=\"./images/shortcut_env.png\" alt=\"environment\" width=\"400\"/>\n",
"\n",
"Later in the assignment, we will use a variant of this maze in which a 'shortcut' opens up after a certain number of timesteps. We will test if the Dyna-Q and Dyna-Q+ agents are able to find the newly-opened shorter route to the goal state."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "b5d6eca06a34b6a6e873658478461b95",
"grade": false,
"grade_id": "cell-003d45ed0386900a",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"## Packages\n",
"\n",
"We import the libraries required for this assignment. Primarily, we shall be using the following:\n",
"1. numpy: the fundamental package for scientific computing with Python.\n",
"2. matplotlib: the library for plotting graphs in Python.\n",
"3. RL-Glue: the library for reinforcement learning experiments.\n",
"\n",
"**Please do not import other libraries** as this will break the autograder."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "120eb20b7f1dddd120d76b2aa7919153",
"grade": false,
"grade_id": "cell-bee88a7e78d66006",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import jdc\n",
"import os\n",
"from tqdm import tqdm\n",
"\n",
"from rl_glue import RLGlue\n",
"from agent import BaseAgent\n",
"from maze_env import ShortcutMazeEnvironment"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ee4fd0b140763673eeaa4eb9568f651c",
"grade": false,
"grade_id": "cell-028a2dd8d19ea3a7",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"plt.rcParams.update({'font.size': 15})\n",
"plt.rcParams.update({'figure.figsize': [8,5]})"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8af78c99916d2bef7b8950c06c91ca1b",
"grade": false,
"grade_id": "cell-05b0c5c488d26a90",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"## Section 1: Dyna-Q"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "a70fc156a2c433210a5340707627ab14",
"grade": false,
"grade_id": "cell-87547eb7b48d2d80",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Let's start with a quick recap of the tabular Dyna-Q algorithm.\n",
"\n",
"<div style=\"width:80%\"><img src=\"./images/DynaQ.png\" alt=\"DynaQ_pseudocode\"></div>\n",
"\n",
"Dyna-Q involves four basic steps:\n",
"1. Action selection: given an observation, select an action to be performed (here, using the $\\epsilon$-greedy method).\n",
"2. Direct RL: using the observed next state and reward, update the action values (here, using one-step tabular Q-learning).\n",
"3. Model learning: using the observed next state and reward, update the model (here, updating a table as the environment is assumed to be deterministic).\n",
"4. Planning: update the action values by generating $n$ simulated experiences using certain starting states and actions (here, using the random-sample one-step tabular Q-planning method). This is also known as the 'Indirect RL' step. The process of choosing the state and action to simulate an experience with is known as 'search control'.\n",
"\n",
"Steps 1 and 2 are parts of the [tabular Q-learning algorithm](http://www.incompleteideas.net/book/RLbook2018.pdf#page=153) and are denoted by line numbers (a)–(d) in the pseudocode above. Step 3 is performed in line (e), and Step 4 in the block of lines (f).\n",
"\n",
"We highly recommend revising the Dyna videos in the course and the material in the RL textbook (in particular, [Section 8.2](http://www.incompleteideas.net/book/RLbook2018.pdf#page=183))."
]
},
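{
"cell_type": "markdown",
"metadata": {},
"source": [
"The direct RL, model learning, and planning steps above can be sketched on a toy problem. This is a minimal illustration rather than the assignment's reference solution: the 3-state, 2-action setup, the names `q_update` and `dyna_q_step`, and the use of `-1` as a terminal marker are assumptions made for this example, and action selection is omitted for brevity.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Hypothetical toy setup: 3 states, 2 actions; -1 marks the terminal state.\n",
"q = np.zeros((3, 2))\n",
"model = {}\n",
"alpha, gamma = 0.1, 0.95\n",
"rng = np.random.RandomState(0)\n",
"\n",
"def q_update(s, a, r, s_next):\n",
"    # One-step Q-learning target; no bootstrap term for terminal transitions.\n",
"    target = r if s_next == -1 else r + gamma * np.max(q[s_next])\n",
"    q[s, a] += alpha * (target - q[s, a])\n",
"\n",
"def dyna_q_step(s, a, r, s_next, n_planning):\n",
"    q_update(s, a, r, s_next)                 # direct RL\n",
"    model.setdefault(s, {})[a] = (s_next, r)  # model learning\n",
"    for _ in range(n_planning):               # planning\n",
"        ps = rng.choice(list(model.keys()))\n",
"        pa = rng.choice(list(model[ps].keys()))\n",
"        ns, pr = model[ps][pa]\n",
"        q_update(ps, pa, pr, ns)\n",
"\n",
"# One real terminal transition followed by three planning updates.\n",
"dyna_q_step(2, 0, 1.0, -1, n_planning=3)\n",
"```\n",
"\n",
"With a single stored transition, each of the three planning updates replays it, so the one real experience is reused several times."
]
},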
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "65b87624638d81a162640d0c59868798",
"grade": false,
"grade_id": "cell-feffd3d6e8b4ac8b",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Alright, let's begin coding.\n",
"\n",
"As you already know by now, you will develop an agent which interacts with the given environment via RL-Glue. More specifically, you will implement the usual methods `agent_start`, `agent_step`, and `agent_end` in your `DynaQAgent` class, along with a couple of helper methods specific to Dyna-Q, namely `update_model` and `planning_step`. We will provide detailed comments in each method describing what your code should do. "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "30cdeb28f5cf7ee8bfe4844ab7b9624b",
"grade": false,
"grade_id": "cell-d0135622e9f741c2",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Let's break this down in pieces and do it one-by-one.\n",
"\n",
"First of all, check out the `agent_init` method below. As in earlier assignments, some of the attributes are initialized with the data passed inside `agent_info`. In particular, pay attention to the attributes which are new to `DynaQAgent`, since you shall be using them later. "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "fcc0e80f7f9aee52e7128caa88d2c7ba",
"grade": false,
"grade_id": "cell-5d0e8c43378d5e30",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"class DynaQAgent(BaseAgent):\n",
"\n",
"    def agent_init(self, agent_info):\n",
"        \"\"\"Setup for the agent called when the experiment first starts.\n",
"\n",
"        Args:\n",
"            agent_info (dict), the parameters used to initialize the agent. The dictionary contains:\n",
"            {\n",
"                num_states (int): The number of states,\n",
"                num_actions (int): The number of actions,\n",
"                epsilon (float): The parameter for epsilon-greedy exploration,\n",
"                step_size (float): The step-size,\n",
"                discount (float): The discount factor,\n",
"                planning_steps (int): The number of planning steps per environmental interaction\n",
"\n",
"                random_seed (int): the seed for the RNG used in epsilon-greedy\n",
"                planning_random_seed (int): the seed for the RNG used in the planner\n",
"            }\n",
"        \"\"\"\n",
"\n",
"        # First, we get the relevant information from agent_info \n",
"        # NOTE: we use np.random.RandomState(seed) to set the two different RNGs\n",
"        # for the planner and the rest of the code\n",
"        try:\n",
"            self.num_states = agent_info[\"num_states\"]\n",
"            self.num_actions = agent_info[\"num_actions\"]\n",
"        except KeyError:\n",
"            print(\"You need to pass both 'num_states' and 'num_actions' \\\n",
"                   in agent_info to initialize the action-value table\")\n",
"        self.gamma = agent_info.get(\"discount\", 0.95)\n",
"        self.step_size = agent_info.get(\"step_size\", 0.1)\n",
"        self.epsilon = agent_info.get(\"epsilon\", 0.1)\n",
"        self.planning_steps = agent_info.get(\"planning_steps\", 10)\n",
"\n",
"        self.rand_generator = np.random.RandomState(agent_info.get('random_seed', 42))\n",
"        self.planning_rand_generator = np.random.RandomState(agent_info.get('planning_random_seed', 42))\n",
"\n",
"        # Next, we initialize the attributes required by the agent, e.g., q_values, model, etc.\n",
"        # A simple way to implement the model is to have a dictionary of dictionaries, \n",
"        # mapping each state to a dictionary which maps actions to (next state, reward) tuples.\n",
"        self.q_values = np.zeros((self.num_states, self.num_actions))\n",
"        self.actions = list(range(self.num_actions))\n",
"        self.past_action = -1\n",
"        self.past_state = -1\n",
"        self.model = {} # model is a dictionary of dictionaries, which maps states to actions to \n",
"                        # (next_state, reward) tuples"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0aabcf332aa74c3e7db51eb0b47ab744",
"grade": false,
"grade_id": "cell-ee23a83113d8ed05",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Now let's create the `update_model` method, which performs the 'Model Update' step in the pseudocode. It takes a `(s, a, s', r)` tuple and stores the next state and reward corresponding to a state-action pair.\n",
"\n",
"Remember, because the environment is deterministic, an easy way to implement the model is to have a dictionary of encountered states, each mapping to a dictionary of actions taken in those states, which in turn maps to a tuple of next state and reward. In this way, the model can be easily accessed by `model[s][a]`, which would return the `(s', r)` tuple."
]
},
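{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small illustration of the nested-dictionary model described above, here is a standalone sketch. The helper name `record_transition` and the toy transitions are assumptions for illustration only; they are not part of the assignment's `DynaQAgent`.\n",
"\n",
"```python\n",
"# Hypothetical standalone sketch of a deterministic tabular model.\n",
"model = {}\n",
"\n",
"def record_transition(model, s, a, s_next, r):\n",
"    # setdefault creates the inner action dictionary the first time s is seen\n",
"    model.setdefault(s, {})[a] = (s_next, r)\n",
"\n",
"record_transition(model, 0, 2, 0, 1)\n",
"record_transition(model, 0, 3, 1, 2)\n",
"\n",
"# Looking up a stored state-action pair returns the (next state, reward) tuple\n",
"next_state, reward = model[0][3]\n",
"```\n",
"\n",
"Because the environment is deterministic, storing only the most recent outcome for each state-action pair is sufficient; a stochastic environment would need counts or a distribution instead."
]
},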
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "d6dd59f9c730360c26df3035b85ea17a",
"grade": false,
"grade_id": "cell-59c91c0887f0eaea",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQAgent\n",
"\n",
"# -----------\n",
"# Graded Cell\n",
"# -----------\n",
"\n",
"def update_model(self, past_state, past_action, state, reward):\n",
"    \"\"\"updates the model \n",
"    \n",
"    Args:\n",
"        past_state (int): s\n",
"        past_action (int): a\n",
"        state (int): s'\n",
"        reward (int): r\n",
"    Returns:\n",
"        Nothing\n",
"    \"\"\"\n",
"    # Update the model with the (s,a,s',r) tuple (1~4 lines)\n",
"    \n",
"    # ----------------\n",
"    # your code here\n",
"    # Create the inner action dictionary the first time this state is seen\n",
"    if past_state not in self.model:\n",
"        self.model[past_state] = {}\n",
"    self.model[past_state][past_action] = (state, reward)\n",
"    \n",
"    # ----------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "802b3f2ab731bdccc0adcfc6d4950229",
"grade": false,
"grade_id": "cell-f625328c7bd73d13",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Test `update_model()`"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ab016ddc9bcf9816b2a62407532dede7",
"grade": true,
"grade_id": "cell-d4fa9f9e0a14ccfa",
"locked": true,
"points": 10,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# -----------\n",
"# Tested Cell\n",
"# -----------\n",
"# The contents of the cell will be tested by the autograder.\n",
"# If they do not pass here, they will not pass there.\n",
"\n",
"actions = []\n",
"agent_info = {\"num_actions\": 4, \n",
" \"num_states\": 3, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\": 0.1, \n",
" \"discount\": 1.0, \n",
" \"random_seed\": 0,\n",
" \"planning_random_seed\": 0}\n",
"\n",
"agent = DynaQAgent()\n",
"agent.agent_init(agent_info)\n",
"\n",
"# (past_state, past_action, state, reward)\n",
"agent.update_model(0,2,0,1)\n",
"agent.update_model(2,0,1,1)\n",
"agent.update_model(0,3,1,2)\n",
"\n",
"expected_model = {\n",
" # action 2 in state 0 leads back to state 0 with a reward of 1\n",
" # or taking action 3 leads to state 1 with reward of 2\n",
" 0: {\n",
" 2: (0, 1),\n",
" 3: (1, 2),\n",
" },\n",
" # taking action 0 in state 2 leads to state 1 with a reward of 1\n",
" 2: {\n",
" 0: (1, 1),\n",
" },\n",
"}\n",
"\n",
"assert agent.model == expected_model\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4ad7e7911407af12a3ad8dea6a0e83fa",
"grade": false,
"grade_id": "cell-a398d6775a6d809a",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Next, you will implement the planning step, the crux of the Dyna-Q algorithm. You shall be calling this `planning_step` method at every timestep of every trajectory."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "2c48cb05d902ca761858cc4c81846350",
"grade": false,
"grade_id": "cell-1a90876a079f6ea2",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQAgent\n",
"\n",
"# -----------\n",
"# Graded Cell\n",
"# -----------\n",
"\n",
"def planning_step(self):\n",
"    \"\"\"performs planning, i.e. indirect RL.\n",
"\n",
"    Args:\n",
"        None\n",
"    Returns:\n",
"        Nothing\n",
"    \"\"\"\n",
"    \n",
"    # The indirect RL step:\n",
"    # - Choose a state and action from the set of experiences that are stored in the model. (~2 lines)\n",
"    # - Query the model with this state-action pair for the predicted next state and reward.(~1 line)\n",
"    # - Update the action values with this simulated experience. (2~4 lines)\n",
"    # - Repeat for the required number of planning steps.\n",
"    #\n",
"    # Note that the update equation is different for terminal and non-terminal transitions. \n",
"    # To differentiate between a terminal and a non-terminal next state, assume that the model stores\n",
"    # the terminal state as a dummy state like -1\n",
"    #\n",
"    # Important: remember you have a random number generator 'planning_rand_generator' as \n",
"    # a part of the class which you need to use as self.planning_rand_generator.choice()\n",
"    # For the sake of reproducibility and grading, *do not* use anything else like \n",
"    # np.random.choice() for performing search control.\n",
"\n",
"    # ----------------\n",
"    # your code here\n",
"    for _ in range(self.planning_steps):\n",
"        # Search control: sample a visited state, then an action taken in it\n",
"        past_state = self.planning_rand_generator.choice(list(self.model.keys()))\n",
"        past_action = self.planning_rand_generator.choice(list(self.model[past_state].keys()))\n",
"        next_state, reward = self.model[past_state][past_action]\n",
"        if next_state != -1:\n",
"            target = reward + self.gamma * np.max(self.q_values[next_state])\n",
"        else:\n",
"            # Terminal transition: no bootstrapping from the next state\n",
"            target = reward\n",
"        self.q_values[past_state, past_action] += self.step_size * (target - self.q_values[past_state, past_action])\n",
"    # ----------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "edbac5139f888befba4b2696d25fed12",
"grade": false,
"grade_id": "cell-35c7dcb9a38dd319",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Test `planning_step()` "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f8e02d9152bf919f6755239ef071f37c",
"grade": true,
"grade_id": "cell-8ae4b7a941ad7767",
"locked": true,
"points": 20,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# -----------\n",
"# Tested Cell\n",
"# -----------\n",
"# The contents of the cell will be tested by the autograder.\n",
"# If they do not pass here, they will not pass there.\n",
"\n",
"np.random.seed(0)\n",
"\n",
"actions = []\n",
"agent_info = {\"num_actions\": 4, \n",
" \"num_states\": 3, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\": 0.1, \n",
" \"discount\": 1.0, \n",
" \"planning_steps\": 4,\n",
" \"random_seed\": 0,\n",
" \"planning_random_seed\": 5}\n",
"\n",
"agent = DynaQAgent()\n",
"agent.agent_init(agent_info)\n",
"\n",
"agent.update_model(0,2,1,1)\n",
"agent.update_model(2,0,1,1)\n",
"agent.update_model(0,3,0,1)\n",
"agent.update_model(0,1,-1,1)\n",
"\n",
"expected_model = {\n",
" 0: {\n",
" 2: (1, 1),\n",
" 3: (0, 1),\n",
" 1: (-1, 1),\n",
" },\n",
" 2: {\n",
" 0: (1, 1),\n",
" },\n",
"}\n",
"\n",
"assert agent.model == expected_model\n",
"\n",
"agent.planning_step()\n",
"\n",
"expected_values = np.array([\n",
" [0, 0.1, 0, 0.2],\n",
" [0, 0, 0, 0],\n",
" [0.1, 0, 0, 0],\n",
"])\n",
"assert np.all(np.isclose(agent.q_values, expected_values))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "a3534e47ea52ac6c4180d714a0e01e37",
"grade": false,
"grade_id": "cell-02566293dd5feb36",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Now before you move on to implement the rest of the agent methods, here are the helper functions that you've used in the previous assessments for choosing an action using an $\\epsilon$-greedy policy."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "7d55430e58877032febb23ecb4ba8efd",
"grade": false,
"grade_id": "cell-cc975f6b2f1a6661",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQAgent\n",
"\n",
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"def argmax(self, q_values):\n",
" \"\"\"argmax with random tie-breaking\n",
" Args:\n",
" q_values (Numpy array): the array of action values\n",
" Returns:\n",
" action (int): an action with the highest value\n",
" \"\"\"\n",
" top = float(\"-inf\")\n",
" ties = []\n",
"\n",
" for i in range(len(q_values)):\n",
" if q_values[i] > top:\n",
" top = q_values[i]\n",
" ties = []\n",
"\n",
" if q_values[i] == top:\n",
" ties.append(i)\n",
"\n",
" return self.rand_generator.choice(ties)\n",
"\n",
"def choose_action_egreedy(self, state):\n",
" \"\"\"returns an action using an epsilon-greedy policy w.r.t. the current action-value function.\n",
"\n",
" Important: assume you have a random number generator 'rand_generator' as a part of the class\n",
" which you can use as self.rand_generator.choice() or self.rand_generator.rand()\n",
"\n",
" Args:\n",
" state (List): coordinates of the agent (two elements)\n",
" Returns:\n",
" The action taken w.r.t. the aforementioned epsilon-greedy policy\n",
" \"\"\"\n",
"\n",
" if self.rand_generator.rand() < self.epsilon:\n",
" action = self.rand_generator.choice(self.actions)\n",
" else:\n",
" values = self.q_values[state]\n",
" action = self.argmax(values)\n",
"\n",
" return action"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e4704ddcf5cfaad469470f8397c9397d",
"grade": false,
"grade_id": "cell-50858ea1e5f5db91",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Next, you will implement the rest of the agent-related methods, namely `agent_start`, `agent_step`, and `agent_end`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ae45bcd826ba619bf18f2513c80b4079",
"grade": false,
"grade_id": "cell-34d9e8a161d6e5b4",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQAgent\n",
"\n",
"# -----------\n",
"# Graded Cell\n",
"# -----------\n",
"\n",
"def agent_start(self, state):\n",
"    \"\"\"The first method called when the experiment starts, \n",
"    called after the environment starts.\n",
"    Args:\n",
"        state (Numpy array): the state from the\n",
"            environment's env_start function.\n",
"    Returns:\n",
"        (int) the first action the agent takes.\n",
"    \"\"\"\n",
"    \n",
"    # given the state, select the action using self.choose_action_egreedy(), \n",
"    # and save current state and action (~2 lines)\n",
"    ### self.past_state = ?\n",
"    ### self.past_action = ?\n",
"\n",
"    # ----------------\n",
"    # your code here\n",
"    self.past_state = state\n",
"    self.past_action = self.choose_action_egreedy(state)\n",
"    # ----------------\n",
"    \n",
"    return self.past_action\n",
"\n",
"def agent_step(self, reward, state):\n",
"    \"\"\"A step taken by the agent.\n",
"\n",
"    Args:\n",
"        reward (float): the reward received for taking the last action taken\n",
"        state (Numpy array): the state from the\n",
"            environment's step based on where the agent ended up after the\n",
"            last step\n",
"    Returns:\n",
"        (int) The action the agent takes given this state.\n",
"    \"\"\"\n",
"    \n",
"    # - Direct-RL step (~1-3 lines)\n",
"    # - Model Update step (~1 line)\n",
"    # - `planning_step` (~1 line)\n",
"    # - Action Selection step (~1 line)\n",
"    # Save the current state and action before returning the action to be performed. (~2 lines)\n",
"\n",
"    # ----------------\n",
"    # your code here\n",
"    # Direct RL: one-step Q-learning update for the transition just observed\n",
"    target = reward + self.gamma * np.max(self.q_values[state])\n",
"    self.q_values[self.past_state, self.past_action] += self.step_size * (target - self.q_values[self.past_state, self.past_action])\n",
"    # Model update, then planning with the updated model\n",
"    self.update_model(self.past_state, self.past_action, state, reward)\n",
"    self.planning_step()\n",
"    # Action selection uses the action values as they stand after planning\n",
"    self.past_state = state\n",
"    self.past_action = self.choose_action_egreedy(state)\n",
"    # ----------------\n",
"    \n",
"    return self.past_action\n",
"\n",
"def agent_end(self, reward):\n",
"    \"\"\"Called when the agent terminates.\n",
"\n",
"    Args:\n",
"        reward (float): the reward the agent received for entering the\n",
"            terminal state.\n",
"    \"\"\"\n",
"    \n",
"    # - Direct RL update with this final transition (1~2 lines)\n",
"    # - Model Update step with this final transition (~1 line)\n",
"    # - One final `planning_step` (~1 line)\n",
"    #\n",
"    # Note: the final transition needs to be handled carefully. Since there is no next state, \n",
"    # you will have to pass a dummy state (like -1), which you will be using in the planning_step() to \n",
"    # differentiate between updates with usual terminal and non-terminal transitions.\n",
"\n",
"    # ----------------\n",
"    # your code here\n",
"    # Terminal transition: the target is the reward alone (no next-state value)\n",
"    self.q_values[self.past_state, self.past_action] += self.step_size * (reward - self.q_values[self.past_state, self.past_action])\n",
"    self.update_model(self.past_state, self.past_action, -1, reward)\n",
"    self.planning_step()\n",
"    # ----------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "684b56621417ff95a833db909acbc2b9",
"grade": false,
"grade_id": "cell-13ed73c6c6df5630",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Test `agent_start()`, `agent_step()`, and `agent_end()`"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "8ce595f374dc31897a6698cae3652bef",
"grade": true,
"grade_id": "cell-02b41cfa4e281a4f",
"locked": true,
"points": 20,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# -----------\n",
"# Tested Cell\n",
"# -----------\n",
"# The contents of the cell will be tested by the autograder.\n",
"# If they do not pass here, they will not pass there.\n",
"\n",
"np.random.seed(0)\n",
"\n",
"agent_info = {\"num_actions\": 4, \n",
" \"num_states\": 3, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\": 0.1, \n",
" \"discount\": 1.0, \n",
" \"random_seed\": 0,\n",
" \"planning_steps\": 2,\n",
" \"planning_random_seed\": 0}\n",
"\n",
"agent = DynaQAgent()\n",
"agent.agent_init(agent_info)\n",
"\n",
"# ----------------\n",
"# test agent start\n",
"# ----------------\n",
"\n",
"action = agent.agent_start(0)\n",
"\n",
"assert action == 1\n",
"assert agent.model == {}\n",
"assert np.all(agent.q_values == 0)\n",
"\n",
"# ---------------\n",
"# test agent step\n",
"# ---------------\n",
"\n",
"action = agent.agent_step(1, 2)\n",
"assert action == 3\n",
"\n",
"action = agent.agent_step(0, 1)\n",
"assert action == 1\n",
"\n",
"expected_model = {\n",
" 0: {\n",
" 1: (2, 1),\n",
" },\n",
" 2: {\n",
" 3: (1, 0),\n",
" },\n",
"}\n",
"assert agent.model == expected_model\n",
"\n",
"expected_values = np.array([\n",
" [0, 0.3439, 0, 0],\n",
" [0, 0, 0, 0],\n",
" [0, 0, 0, 0],\n",
"])\n",
"assert np.allclose(agent.q_values, expected_values)\n",
"\n",
"# --------------\n",
"# test agent end\n",
"# --------------\n",
"\n",
"agent.agent_end(1)\n",
"\n",
"expected_model = {\n",
" 0: {\n",
" 1: (2, 1),\n",
" },\n",
" 2: {\n",
" 3: (1, 0),\n",
" },\n",
" 1: {\n",
" 1: (-1, 1),\n",
" },\n",
"}\n",
"assert agent.model == expected_model\n",
"\n",
"expected_values = np.array([\n",
" [0, 0.41051, 0, 0],\n",
" [0, 0.1, 0, 0],\n",
" [0, 0, 0, 0.01],\n",
"])\n",
"assert np.allclose(agent.q_values, expected_values)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "ebc65986e4b7d2a58cbaa4fc22508593",
"grade": false,
"grade_id": "cell-58a0061ef19de5af",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Experiment: Dyna-Q agent in the maze environment\n",
"\n",
"Alright. Now we have all the components of the `DynaQAgent` ready. Let's try it out on the maze environment! \n",
"\n",
"The next cell runs an experiment on this maze environment to test your implementation. The initial action values are $0$, the step-size parameter is $0.125$, and the exploration parameter is $\\epsilon=0.1$. After the experiment, the sum of rewards in each episode should match the correct result.\n",
"\n",
"We will try planning steps of $0,5,50$ and compare their performance in terms of the average number of steps taken to reach the goal state in the aforementioned maze environment. For scientific rigor, we will run each experiment $30$ times. In each experiment, we set the initial random-number-generator (RNG) seeds for a fair comparison across algorithms."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "6f1ce118374c859b81ca1a743bc1bd9b",
"grade": false,
"grade_id": "cell-744f017993777ec8",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"def run_experiment(env, agent, env_parameters, agent_parameters, exp_parameters):\n",
"\n",
" # Experiment settings\n",
" num_runs = exp_parameters['num_runs']\n",
" num_episodes = exp_parameters['num_episodes']\n",
" planning_steps_all = agent_parameters['planning_steps']\n",
"\n",
" env_info = env_parameters \n",
" agent_info = {\"num_states\" : agent_parameters[\"num_states\"], # We pass the agent the information it needs. \n",
" \"num_actions\" : agent_parameters[\"num_actions\"],\n",
" \"epsilon\": agent_parameters[\"epsilon\"], \n",
" \"discount\": env_parameters[\"discount\"],\n",
" \"step_size\" : agent_parameters[\"step_size\"]}\n",
"\n",
" all_averages = np.zeros((len(planning_steps_all), num_runs, num_episodes)) # for collecting metrics \n",
" log_data = {'planning_steps_all' : planning_steps_all} # that shall be plotted later\n",
"\n",
" for idx, planning_steps in enumerate(planning_steps_all):\n",
"\n",
" print('Planning steps : ', planning_steps)\n",
" os.system('sleep 0.5') # to prevent tqdm printing out-of-order before the above print()\n",
" agent_info[\"planning_steps\"] = planning_steps \n",
"\n",
" for i in tqdm(range(num_runs)):\n",
"\n",
" agent_info['random_seed'] = i\n",
" agent_info['planning_random_seed'] = i\n",
"\n",
" rl_glue = RLGlue(env, agent) # Creates a new RLGlue experiment with the env and agent we chose above\n",
" rl_glue.rl_init(agent_info, env_info) # We pass RLGlue what it needs to initialize the agent and environment\n",
"\n",
" for j in range(num_episodes):\n",
"\n",
" rl_glue.rl_start() # We start an episode. Here we aren't using rl_glue.rl_episode()\n",
" # like the other assessments because we'll be requiring some \n",
" is_terminal = False # data from within the episodes in some of the experiments here \n",
" num_steps = 0\n",
" while not is_terminal:\n",
" reward, _, action, is_terminal = rl_glue.rl_step() # The environment and agent take a step \n",
" num_steps += 1 # and return the reward and action taken.\n",
"\n",
" all_averages[idx][i][j] = num_steps\n",
"\n",
" log_data['all_averages'] = all_averages\n",
" \n",
" return log_data\n",
" \n",
"\n",
"def plot_steps_per_episode(data):\n",
" all_averages = data['all_averages']\n",
" planning_steps_all = data['planning_steps_all']\n",
"\n",
" for i, planning_steps in enumerate(planning_steps_all):\n",
" plt.plot(np.mean(all_averages[i], axis=0), label='Planning steps = '+str(planning_steps))\n",
"\n",
" plt.legend(loc='upper right')\n",
" plt.xlabel('Episodes')\n",
" plt.ylabel('Steps\\nper\\nepisode', rotation=0, labelpad=40)\n",
" plt.axhline(y=16, linestyle='--', color='grey', alpha=0.4)\n",
" plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f4b740a35fbe720e8ecc73ade69dd3cd",
"grade": false,
"grade_id": "cell-b7c90063cc0888e0",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Planning steps : 0\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [00:07<00:00, 4.02it/s]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Planning steps : 5\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [00:09<00:00, 3.31it/s]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Planning steps : 50\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [00:56<00:00, 1.89s/it]\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"# Experiment parameters\n",
"experiment_parameters = {\n",
" \"num_runs\" : 30, # The number of times we run the experiment\n",
" \"num_episodes\" : 40, # The number of episodes per experiment\n",
"}\n",
"\n",
"# Environment parameters\n",
"environment_parameters = { \n",
" \"discount\": 0.95,\n",
"}\n",
"\n",
"# Agent parameters\n",
"agent_parameters = { \n",
" \"num_states\" : 54,\n",
" \"num_actions\" : 4, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\" : 0.125,\n",
" \"planning_steps\" : [0, 5, 50] # The list of planning_steps we want to try\n",
"}\n",
"\n",
"current_env = ShortcutMazeEnvironment # The environment\n",
"current_agent = DynaQAgent # The agent\n",
"\n",
"dataq = run_experiment(current_env, current_agent, environment_parameters, agent_parameters, experiment_parameters)\n",
"plot_steps_per_episode(dataq) "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "81c7635453f9c560e71d536f7e7be762",
"grade": false,
"grade_id": "cell-a44baca574f0e70c",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"What do you notice?\n",
"\n",
"As the number of planning steps increases, the number of episodes taken to reach the goal decreases rapidly. Remember that the RNG seed was set the same for all the three values of planning steps, resulting in the same number of steps taken to reach the goal in the first episode. Thereafter, the performance improves. The slowest improvement is when there are $n=0$ planning steps, i.e., for the non-planning Q-learning agent, even though the step size parameter was optimized for it. Note that the grey dotted line shows the minimum number of steps required to reach the goal state under the optimal greedy policy.\n",
"\n",
"---\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "92986c0d6a6e9acfaf3cbab5ebafbf49",
"grade": false,
"grade_id": "cell-753d3ebd700359e6",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Experiment(s): Dyna-Q agent in the _changing_ maze environment "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "dd09e132177a8cc9b4a061de27754ad4",
"grade": false,
"grade_id": "cell-aa3974b49e4eda2f",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Great! Now let us see how Dyna-Q performs on the version of the maze in which a shorter path opens up after 3000 steps. The rest of the transition and reward dynamics remain the same. \n",
"\n",
"<img src=\"./images/shortcut_env_after.png\" alt=\"environment\" width=\"800\"/>\n",
"\n",
"Before you proceed, take a moment to think about what you expect to see. Will Dyna-Q find the new, shorter path to the goal? If so, why? If not, why not?"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e89fe28e52a88aeed2388ac7afad4ab3",
"grade": false,
"grade_id": "cell-422bb22d0465830f",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"def run_experiment_with_state_visitations(env, agent, env_parameters, agent_parameters, exp_parameters, result_file_name):\n",
"\n",
" # Experiment settings\n",
" num_runs = exp_parameters['num_runs']\n",
" num_max_steps = exp_parameters['num_max_steps']\n",
" planning_steps_all = agent_parameters['planning_steps']\n",
"\n",
" env_info = {\"change_at_n\" : env_parameters[\"change_at_n\"]} \n",
" agent_info = {\"num_states\" : agent_parameters[\"num_states\"], \n",
" \"num_actions\" : agent_parameters[\"num_actions\"],\n",
" \"epsilon\": agent_parameters[\"epsilon\"], \n",
" \"discount\": env_parameters[\"discount\"],\n",
" \"step_size\" : agent_parameters[\"step_size\"]}\n",
"\n",
" state_visits_before_change = np.zeros((len(planning_steps_all), num_runs, 54)) # For saving the number of\n",
" state_visits_after_change = np.zeros((len(planning_steps_all), num_runs, 54)) # state-visitations \n",
" cum_reward_all = np.zeros((len(planning_steps_all), num_runs, num_max_steps)) # For saving the cumulative reward\n",
" log_data = {'planning_steps_all' : planning_steps_all}\n",
"\n",
" for idx, planning_steps in enumerate(planning_steps_all):\n",
"\n",
" print('Planning steps : ', planning_steps)\n",
" os.system('sleep 1') # to prevent tqdm printing out-of-order before the above print()\n",
" agent_info[\"planning_steps\"] = planning_steps # We pass the agent the information it needs. \n",
"\n",
" for run in tqdm(range(num_runs)):\n",
"\n",
" agent_info['random_seed'] = run\n",
" agent_info['planning_random_seed'] = run\n",
"\n",
" rl_glue = RLGlue(env, agent) # Creates a new RLGlue experiment with the env and agent we chose above\n",
" rl_glue.rl_init(agent_info, env_info) # We pass RLGlue what it needs to initialize the agent and environment\n",
"\n",
" num_steps = 0\n",
" cum_reward = 0\n",
"\n",
" while num_steps < num_max_steps-1 :\n",
"\n",
" state, _ = rl_glue.rl_start() # We start the experiment. We'll be collecting the \n",
" is_terminal = False # state-visitation counts to visiualize the learned policy\n",
" if num_steps < env_parameters[\"change_at_n\"]: \n",
" state_visits_before_change[idx][run][state] += 1\n",
" else:\n",
" state_visits_after_change[idx][run][state] += 1\n",
"\n",
" while not is_terminal and num_steps < num_max_steps-1 :\n",
" reward, state, action, is_terminal = rl_glue.rl_step() \n",
" num_steps += 1\n",
" cum_reward += reward\n",
" cum_reward_all[idx][run][num_steps] = cum_reward\n",
" if num_steps < env_parameters[\"change_at_n\"]:\n",
" state_visits_before_change[idx][run][state] += 1\n",
" else:\n",
" state_visits_after_change[idx][run][state] += 1\n",
"\n",
" log_data['state_visits_before'] = state_visits_before_change\n",
" log_data['state_visits_after'] = state_visits_after_change\n",
" log_data['cum_reward_all'] = cum_reward_all\n",
" \n",
" return log_data\n",
"\n",
"def plot_cumulative_reward(data_all, item_key, y_key, y_axis_label, legend_prefix, title):\n",
" data_y_all = data_all[y_key]\n",
" items = data_all[item_key]\n",
"\n",
" for i, item in enumerate(items):\n",
" plt.plot(np.mean(data_y_all[i], axis=0), label=legend_prefix+str(item))\n",
"\n",
" plt.axvline(x=3000, linestyle='--', color='grey', alpha=0.4)\n",
" plt.xlabel('Timesteps')\n",
" plt.ylabel(y_axis_label, rotation=0, labelpad=60)\n",
" plt.legend(loc='upper left')\n",
" plt.title(title)\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "ed82204e60d5cda36d818ca9bf653710",
"grade": false,
"grade_id": "cell-142b14ac90c9bff7",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Did you notice that the environment changes after a fixed number of _steps_ and not episodes? \n",
"\n",
"This is because the environment is separate from the agent, and the environment changes irrespective of the length of each episode (i.e., the number of environmental interactions per episode) that the agent perceives. And hence we are now plotting the data per step or interaction of the agent and the environment, in order to comfortably see the differences in the behaviours of the agents before and after the environment changes. "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4f802c06e5b1eb84585c6876ac3f2dd3",
"grade": false,
"grade_id": "cell-0b246e0fe5abb018",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Okay, now we will first plot the cumulative reward obtained by the agent per interaction with the environment, averaged over 10 runs of the experiment on this changing world. "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "20b0026f54442a7ba37d7096128e03ed",
"grade": false,
"grade_id": "cell-9f7872900ce6b40f",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Planning steps : 5\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 10/10 [00:10<00:00, 1.01s/it]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Planning steps : 10\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 10/10 [00:17<00:00, 1.73s/it]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Planning steps : 50\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 10/10 [01:15<00:00, 7.58s/it]\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"# Experiment parameters\n",
"experiment_parameters = {\n",
" \"num_runs\" : 10, # The number of times we run the experiment\n",
" \"num_max_steps\" : 6000, # The number of steps per experiment\n",
"}\n",
"\n",
"# Environment parameters\n",
"environment_parameters = { \n",
" \"discount\": 0.95,\n",
" \"change_at_n\": 3000\n",
"}\n",
"\n",
"# Agent parameters\n",
"agent_parameters = { \n",
" \"num_states\" : 54,\n",
" \"num_actions\" : 4, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\" : 0.125,\n",
" \"planning_steps\" : [5, 10, 50] # The list of planning_steps we want to try\n",
"}\n",
"\n",
"current_env = ShortcutMazeEnvironment # The environment\n",
"current_agent = DynaQAgent # The agent\n",
"\n",
"dataq = run_experiment_with_state_visitations(current_env, current_agent, environment_parameters, agent_parameters, experiment_parameters, \"Dyna-Q_shortcut_steps\") \n",
"plot_cumulative_reward(dataq, 'planning_steps_all', 'cum_reward_all', 'Cumulative\\nreward', 'Planning steps = ', 'Dyna-Q : Varying planning_steps')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "ce1264bf93c93926107e736687bfe3ab",
"grade": false,
"grade_id": "cell-ae67d282ebad19ad",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"We observe that the slope of the curves is almost constant. If the agent had discovered the shortcut and begun using it, we would expect to see an increase in the slope of the curves towards the later stages of training. This is because the agent can get to the goal state faster and get the positive reward. Note that the timestep at which the shortcut opens up is marked by the grey dotted line.\n",
"\n",
"Note that this trend is constant across the increasing number of planning steps.\n",
"\n",
"Now let's check the heatmap of the state visitations of the agent with `planning_steps=10` during training, before and after the shortcut opens up after 3000 timesteps."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "bfe46c5772be65c97fa8ba81d947f985",
"grade": false,
"grade_id": "cell-c21d98bc4f7296d6",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"def plot_state_visitations(data, plot_titles, idx):\n",
" data_keys = [\"state_visits_before\", \"state_visits_after\"]\n",
" positions = [211,212]\n",
" titles = plot_titles\n",
" wall_ends = [None,-1]\n",
"\n",
" for i in range(2):\n",
"\n",
" state_visits = data[data_keys[i]][idx]\n",
" average_state_visits = np.mean(state_visits, axis=0)\n",
" grid_state_visits = np.rot90(average_state_visits.reshape((6,9)).T)\n",
" grid_state_visits[2,1:wall_ends[i]] = np.nan # walls\n",
" #print(average_state_visits.reshape((6,9)))\n",
" plt.subplot(positions[i])\n",
" plt.pcolormesh(grid_state_visits, edgecolors='gray', linewidth=1, cmap='viridis')\n",
" plt.text(3+0.5, 0+0.5, 'S', horizontalalignment='center', verticalalignment='center')\n",
" plt.text(8+0.5, 5+0.5, 'G', horizontalalignment='center', verticalalignment='center')\n",
" plt.title(titles[i])\n",
" plt.axis('off')\n",
" cm = plt.get_cmap()\n",
" cm.set_bad('gray')\n",
"\n",
" plt.subplots_adjust(bottom=0.0, right=0.7, top=1.0)\n",
" cax = plt.axes([1., 0.0, 0.075, 1.])\n",
" cbar = plt.colorbar(cax=cax)\n",
" plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ee68fcbd81419dd6d30abaaa38f5a48d",
"grade": false,
"grade_id": "cell-aa17be852a4fa1e1",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x360 with 3 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Do not modify this cell!\n",
"\n",
"plot_state_visitations(dataq, ['Dyna-Q : State visitations before the env changes', 'Dyna-Q : State visitations after the env changes'], 1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0979f12aeeebfa64035c9f27fc407d97",
"grade": false,
"grade_id": "cell-50778038da2d7233",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"What do you observe?\n",
"\n",
"The state visitation map looks almost the same before and after the shortcut opens. This means that the Dyna-Q agent hasn't quite discovered and started exploiting the new shortcut.\n",
"\n",
"Now let's try increasing the exploration parameter $\\epsilon$ to see if it helps the Dyna-Q agent discover the shortcut. "
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c2dcbc40b05319c4b4efc75ae0128e4d",
"grade": false,
"grade_id": "cell-27a96a3ebc8bd13a",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"def run_experiment_only_cumulative_reward(env, agent, env_parameters, agent_parameters, exp_parameters):\n",
"\n",
" # Experiment settings\n",
" num_runs = exp_parameters['num_runs']\n",
" num_max_steps = exp_parameters['num_max_steps']\n",
" epsilons = agent_parameters['epsilons']\n",
"\n",
" env_info = {\"change_at_n\" : env_parameters[\"change_at_n\"]} \n",
" agent_info = {\"num_states\" : agent_parameters[\"num_states\"], \n",
" \"num_actions\" : agent_parameters[\"num_actions\"],\n",
" \"planning_steps\": agent_parameters[\"planning_steps\"], \n",
" \"discount\": env_parameters[\"discount\"],\n",
" \"step_size\" : agent_parameters[\"step_size\"]}\n",
"\n",
" log_data = {'epsilons' : epsilons} \n",
" cum_reward_all = np.zeros((len(epsilons), num_runs, num_max_steps))\n",
"\n",
" for eps_idx, epsilon in enumerate(epsilons):\n",
"\n",
" print('Agent : Dyna-Q, epsilon : %f' % epsilon)\n",
" os.system('sleep 1') # to prevent tqdm printing out-of-order before the above print()\n",
" agent_info[\"epsilon\"] = epsilon\n",
"\n",
" for run in tqdm(range(num_runs)):\n",
"\n",
" agent_info['random_seed'] = run\n",
" agent_info['planning_random_seed'] = run\n",
"\n",
" rl_glue = RLGlue(env, agent) # Creates a new RLGlue experiment with the env and agent we chose above\n",
" rl_glue.rl_init(agent_info, env_info) # We pass RLGlue what it needs to initialize the agent and environment\n",
"\n",
" num_steps = 0\n",
" cum_reward = 0\n",
"\n",
" while num_steps < num_max_steps-1 :\n",
"\n",
" rl_glue.rl_start() # We start the experiment\n",
" is_terminal = False\n",
"\n",
" while not is_terminal and num_steps < num_max_steps-1 :\n",
" reward, _, action, is_terminal = rl_glue.rl_step() # The environment and agent take a step and return\n",
" # the reward, and action taken.\n",
" num_steps += 1\n",
" cum_reward += reward\n",
" cum_reward_all[eps_idx][run][num_steps] = cum_reward\n",
"\n",
" log_data['cum_reward_all'] = cum_reward_all\n",
" return log_data"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "35b1244013e6641a28af6ee1c5e19020",
"grade": false,
"grade_id": "cell-7e4c0e42c445b2dc",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Agent : Dyna-Q, epsilon : 0.100000\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [00:51<00:00, 1.72s/it]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Agent : Dyna-Q, epsilon : 0.200000\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [00:52<00:00, 1.74s/it]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Agent : Dyna-Q, epsilon : 0.400000\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [00:50<00:00, 1.69s/it]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Agent : Dyna-Q, epsilon : 0.800000\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [00:51<00:00, 1.71s/it]\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"# Experiment parameters\n",
"experiment_parameters = {\n",
" \"num_runs\" : 30, # The number of times we run the experiment\n",
" \"num_max_steps\" : 6000, # The number of steps per experiment\n",
"}\n",
"\n",
"# Environment parameters\n",
"environment_parameters = { \n",
" \"discount\": 0.95,\n",
" \"change_at_n\": 3000\n",
"}\n",
"\n",
"# Agent parameters\n",
"agent_parameters = { \n",
" \"num_states\" : 54,\n",
" \"num_actions\" : 4, \n",
" \"step_size\" : 0.125,\n",
" \"planning_steps\" : 10,\n",
" \"epsilons\": [0.1, 0.2, 0.4, 0.8] # The list of epsilons we want to try\n",
"}\n",
"\n",
"current_env = ShortcutMazeEnvironment # The environment\n",
"current_agent = DynaQAgent # The agent\n",
"\n",
"data = run_experiment_only_cumulative_reward(current_env, current_agent, environment_parameters, agent_parameters, experiment_parameters)\n",
"plot_cumulative_reward(data, 'epsilons', 'cum_reward_all', 'Cumulative\\nreward', r'$\\epsilon$ = ', r'Dyna-Q : Varying $\\epsilon$')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "3e41acbeb9782671cdca735c33cf9b16",
"grade": false,
"grade_id": "cell-8159dc6c61e345f9",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"What do you observe?\n",
"\n",
"Increasing the exploration via the $\\epsilon$-greedy strategy does not seem to be helping. In fact, the agent's cumulative reward decreases because it is spending more and more time trying out the exploratory actions.\n",
"\n",
"Can we do better...? "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "88675c8ce603f560311089a74104f394",
"grade": false,
"grade_id": "cell-62df4f966a370995",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"## Section 2: Dyna-Q+"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "15faa0c27e0b1427655f666914540c23",
"grade": false,
"grade_id": "cell-7961458a916a28a8",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"The motivation behind Dyna-Q+ is to give a bonus reward for actions that haven't been tried for a long time, since there is a greater chance that the dynamics for that actions might have changed.\n",
"\n",
"In particular, if the modeled reward for a transition is $r$, and the transition has not been tried in $\\tau(s,a)$ time steps, then planning updates are done as if that transition produced a reward of $r + \\kappa \\sqrt{ \\tau(s,a)}$, for some small $\\kappa$. \n",
"\n",
"Let's implement that!\n",
"\n",
"Based on your `DynaQAgent`, create a new class `DynaQPlusAgent` to implement the aforementioned exploration heuristic. Additionally :\n",
"1. actions that had never been tried before from a state should now be allowed to be considered in the planning step,\n",
"2. and the initial model for such actions is that they lead back to the same state with a reward of zero.\n",
"\n",
"At this point, you might want to refer to the video lectures and [Section 8.3](http://www.incompleteideas.net/book/RLbook2018.pdf#page=188) of the RL textbook for a refresher on Dyna-Q+."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "fc1df956ada702fea2fdd43be25d2144",
"grade": false,
"grade_id": "cell-5cb32fc5b37ad166",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"As usual, let's break this down in pieces and do it one-by-one.\n",
"\n",
"First of all, check out the `agent_init` method below. In particular, pay attention to the attributes which are new to `DynaQPlusAgent`– state-visitation counts $\\tau$ and the scaling parameter $\\kappa$ – because you shall be using them later. "
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f941a227e6e8174f497769e87d5968b5",
"grade": false,
"grade_id": "cell-539ab8af016fc473",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"class DynaQPlusAgent(BaseAgent):\n",
" \n",
" def agent_init(self, agent_info):\n",
" \"\"\"Setup for the agent called when the experiment first starts.\n",
"\n",
" Args:\n",
" agent_init_info (dict), the parameters used to initialize the agent. The dictionary contains:\n",
" {\n",
" num_states (int): The number of states,\n",
" num_actions (int): The number of actions,\n",
" epsilon (float): The parameter for epsilon-greedy exploration,\n",
" step_size (float): The step-size,\n",
" discount (float): The discount factor,\n",
" planning_steps (int): The number of planning steps per environmental interaction\n",
" kappa (float): The scaling factor for the reward bonus\n",
"\n",
" random_seed (int): the seed for the RNG used in epsilon-greedy\n",
" planning_random_seed (int): the seed for the RNG used in the planner\n",
" }\n",
" \"\"\"\n",
"\n",
" # First, we get the relevant information from agent_info \n",
" # Note: we use np.random.RandomState(seed) to set the two different RNGs\n",
" # for the planner and the rest of the code\n",
" try:\n",
" self.num_states = agent_info[\"num_states\"]\n",
" self.num_actions = agent_info[\"num_actions\"]\n",
" except:\n",
" print(\"You need to pass both 'num_states' and 'num_actions' \\\n",
" in agent_info to initialize the action-value table\")\n",
" self.gamma = agent_info.get(\"discount\", 0.95)\n",
" self.step_size = agent_info.get(\"step_size\", 0.1)\n",
" self.epsilon = agent_info.get(\"epsilon\", 0.1)\n",
" self.planning_steps = agent_info.get(\"planning_steps\", 10)\n",
" self.kappa = agent_info.get(\"kappa\", 0.001)\n",
"\n",
" self.rand_generator = np.random.RandomState(agent_info.get('random_seed', 42))\n",
" self.planning_rand_generator = np.random.RandomState(agent_info.get('planning_random_seed', 42))\n",
"\n",
" # Next, we initialize the attributes required by the agent, e.g., q_values, model, tau, etc.\n",
" # The visitation-counts can be stored as a table as well, like the action values \n",
" self.q_values = np.zeros((self.num_states, self.num_actions))\n",
" self.tau = np.zeros((self.num_states, self.num_actions))\n",
" self.actions = list(range(self.num_actions))\n",
" self.past_action = -1\n",
" self.past_state = -1\n",
" self.model = {}"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "1a7b620740e82640f572213177bee2ef",
"grade": false,
"grade_id": "cell-1cad0227d9ff16d5",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Now first up, implement the `update_model` method. Note that this is different from Dyna-Q in the aforementioned way.\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ff36e4ae144e4409bd1ea34b1918000f",
"grade": false,
"grade_id": "cell-d4452e4cd395456a",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQPlusAgent\n",
"\n",
"# -----------\n",
"# Graded Cell\n",
"# -----------\n",
"\n",
"def update_model(self, past_state, past_action, state, reward):\n",
" \"\"\"updates the model \n",
"\n",
" Args:\n",
" past_state (int): s\n",
" past_action (int): a\n",
" state (int): s'\n",
" reward (int): r\n",
" Returns:\n",
" Nothing\n",
" \"\"\"\n",
"\n",
" # Recall that when adding a state-action to the model, if the agent is visiting the state\n",
" # for the first time, then the remaining actions need to be added to the model as well\n",
" # with zero reward and a transition into itself.\n",
" #\n",
" # Note: do *not* update the visitation-counts here. We will do that in `agent_step`.\n",
" #\n",
" # (3 lines)\n",
"\n",
" if past_state not in self.model:\n",
" self.model[past_state] = {past_action : (state, reward)}\n",
" # ----------------\n",
" # your code here\n",
" for i in range(self.num_actions):\n",
" self.model[past_state][i]=(past_state,0)\n",
" self.model[past_state][past_action] = (state, reward)\n",
" # ----------------\n",
" else:\n",
" self.model[past_state][past_action] = (state, reward)\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "a9c44b9a6b276c0e08312dec0d413076",
"grade": false,
"grade_id": "cell-a44ec8b7ac701e0c",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Test `update_model()`"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "fc850bdd9ff71c46e5e9b7246c7625d4",
"grade": true,
"grade_id": "cell-8cdef71644d2952f",
"locked": true,
"points": 5,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# -----------\n",
"# Tested Cell\n",
"# -----------\n",
"# The contents of the cell will be tested by the autograder.\n",
"# If they do not pass here, they will not pass there.\n",
"\n",
"actions = []\n",
"agent_info = {\"num_actions\": 4, \n",
" \"num_states\": 3, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\": 0.1, \n",
" \"discount\": 1.0, \n",
" \"random_seed\": 0,\n",
" \"planning_random_seed\": 0}\n",
"\n",
"agent = DynaQPlusAgent()\n",
"agent.agent_init(agent_info)\n",
"\n",
"agent.update_model(0,2,0,1)\n",
"agent.update_model(2,0,1,1)\n",
"agent.update_model(0,3,1,2)\n",
"agent.tau[0][0] += 1\n",
"\n",
"expected_model = {\n",
" 0: {\n",
" 0: (0, 0),\n",
" 1: (0, 0),\n",
" 2: (0, 1),\n",
" 3: (1, 2),\n",
" },\n",
" 2: {\n",
" 0: (1, 1),\n",
" 1: (2, 0),\n",
" 2: (2, 0),\n",
" 3: (2, 0),\n",
" },\n",
"}\n",
"assert agent.model == expected_model"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "9c1771a9ba649fde3e588bae3022e161",
"grade": false,
"grade_id": "cell-885fe1cd5447e0b0",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Next, you will implement the `planning_step()` method. This will be very similar to the one you implemented in `DynaQAgent`, but here you will be adding the exploration bonus to the reward in the simulated transition."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "6ef80ec707602f554d0a56412d066855",
"grade": false,
"grade_id": "cell-b3605364bf724124",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQPlusAgent\n",
"\n",
"# -----------\n",
"# Graded Cell\n",
"# -----------\n",
"\n",
"def planning_step(self):\n",
" \"\"\"performs planning, i.e. indirect RL.\n",
"\n",
" Args:\n",
" None\n",
" Returns:\n",
" Nothing\n",
" \"\"\"\n",
" \n",
" # The indirect RL step:\n",
" # - Choose a state and action from the set of experiences that are stored in the model. (~2 lines)\n",
" # - Query the model with this state-action pair for the predicted next state and reward.(~1 line)\n",
" # - **Add the bonus to the reward** (~1 line)\n",
" # - Update the action values with this simulated experience. (2~4 lines)\n",
" # - Repeat for the required number of planning steps.\n",
" #\n",
" # Note that the update equation is different for terminal and non-terminal transitions. \n",
" # To differentiate between a terminal and a non-terminal next state, assume that the model stores\n",
" # the terminal state as a dummy state like -1\n",
" #\n",
" # Important: remember you have a random number generator 'planning_rand_generator' as \n",
" # a part of the class which you need to use as self.planning_rand_generator.choice()\n",
" # For the sake of reproducibility and grading, *do not* use anything else like \n",
" # np.random.choice() for performing search control.\n",
"\n",
" # ----------------\n",
" # your code here\n",
" counter = 0\n",
" while counter < self.planning_steps:\n",
" past_state = self.planning_rand_generator.choice(list(self.model.keys()))\n",
" past_action = self.planning_rand_generator.choice(list(self.model[past_state].keys()))\n",
" state, reward = self.model[past_state][past_action]\n",
" reward += self.kappa*(self.tau[past_state][past_action])**0.5\n",
" if state != -1: \n",
" self.q_values[past_state][past_action] += self.step_size*(reward + self.gamma*np.max(self.q_values[state]) - self.q_values[past_state][past_action])\n",
" else:\n",
" self.q_values[past_state][past_action] += self.step_size*(reward - self.q_values[past_state][past_action])\n",
" counter += 1\n",
" # ----------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4d4d85edc08c8307d5a7072c79c30aad",
"grade": false,
"grade_id": "cell-0df5e5a11dce577b",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Test `planning_step()`"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "506a78d3a89c1a04c8f59e6a69515623",
"grade": true,
"grade_id": "cell-1bae4d3c34b953a2",
"locked": true,
"points": 10,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Do not modify this cell!\n",
"\n",
"## Test code for planning_step() ##\n",
"\n",
"actions = []\n",
"agent_info = {\"num_actions\": 4, \n",
" \"num_states\": 3, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\": 0.1, \n",
" \"discount\": 1.0, \n",
" \"kappa\": 0.001,\n",
" \"planning_steps\": 4,\n",
" \"random_seed\": 0,\n",
" \"planning_random_seed\": 1}\n",
"\n",
"agent = DynaQPlusAgent()\n",
"agent.agent_init(agent_info)\n",
"\n",
"agent.update_model(0,1,-1,1)\n",
"agent.tau += 1\n",
"agent.tau[0][1] = 0\n",
"\n",
"agent.update_model(0,2,1,1)\n",
"agent.tau += 1\n",
"agent.tau[0][2] = 0\n",
"\n",
"agent.update_model(2,0,1,1)\n",
"agent.tau += 1\n",
"agent.tau[2][0] = 0\n",
"\n",
"agent.planning_step()\n",
"\n",
"expected_model = {\n",
" 0: {\n",
" 1: (-1, 1), \n",
" 0: (0, 0), \n",
" 2: (1, 1), \n",
" 3: (0, 0),\n",
" }, \n",
" 2: {\n",
" 0: (1, 1), \n",
" 1: (2, 0), \n",
" 2: (2, 0), \n",
" 3: (2, 0),\n",
" },\n",
"}\n",
"assert agent.model == expected_model\n",
"\n",
"expected_values = np.array([\n",
" [0, 0.10014142, 0, 0],\n",
" [0, 0, 0, 0],\n",
" [0, 0.00036373, 0, 0.00017321],\n",
"])\n",
"assert np.allclose(agent.q_values, expected_values)"
]
},
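{
"cell_type": "markdown",
"metadata": {},
"source": [
"The expected value `0.10014142` for `Q(0, 1)` in the test above can be checked by hand. The block below is a minimal sketch, not part of the agent: it assumes the pair `(0, 1)` is sampled exactly once during the four planning steps (which is consistent with the expected values), and its variable names are illustrative only.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"kappa = 0.001     # exploration-bonus weight, from agent_info\n",
"step_size = 0.1   # step size, from agent_info\n",
"tau = 2           # (0, 1) was reset to 0, then agent.tau += 1 ran twice more\n",
"reward = 1        # modelled reward for (0, 1); the modelled next state is -1\n",
"\n",
"bonus = kappa * math.sqrt(tau)          # Dyna-Q+ bonus: kappa * sqrt(tau)\n",
"q = 0.0                                 # action values start at zero\n",
"q += step_size * (reward + bonus - q)   # terminal update, no bootstrapped term\n",
"print(round(q, 8))\n",
"```\n",
"\n",
"The result agrees with `expected_values[0][1]` to eight decimal places. Because the modelled next state is the dummy terminal state `-1`, the update uses only the bonus-augmented reward."
]
},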
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0463f44477f43a3e5ac587a664caf3e9",
"grade": false,
"grade_id": "cell-49b8bb85128d50f3",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Again, before you move on to implement the rest of the agent methods, here are the couple of helper functions that you've used in the previous assessments for choosing an action using an $\\epsilon$-greedy policy."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "81bcd74d211cf70c7259d7e035ed6393",
"grade": false,
"grade_id": "cell-0550ca807b59d14c",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQPlusAgent\n",
"\n",
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"def argmax(self, q_values):\n",
" \"\"\"argmax with random tie-breaking\n",
" Args:\n",
" q_values (Numpy array): the array of action values\n",
" Returns:\n",
" action (int): an action with the highest value\n",
" \"\"\"\n",
" top = float(\"-inf\")\n",
" ties = []\n",
"\n",
" for i in range(len(q_values)):\n",
" if q_values[i] > top:\n",
" top = q_values[i]\n",
" ties = []\n",
"\n",
" if q_values[i] == top:\n",
" ties.append(i)\n",
"\n",
" return self.rand_generator.choice(ties)\n",
"\n",
"def choose_action_egreedy(self, state):\n",
" \"\"\"returns an action using an epsilon-greedy policy w.r.t. the current action-value function.\n",
"\n",
" Important: assume you have a random number generator 'rand_generator' as a part of the class\n",
" which you can use as self.rand_generator.choice() or self.rand_generator.rand()\n",
"\n",
" Args:\n",
" state (List): coordinates of the agent (two elements)\n",
" Returns:\n",
" The action taken w.r.t. the aforementioned epsilon-greedy policy\n",
" \"\"\"\n",
"\n",
" if self.rand_generator.rand() < self.epsilon:\n",
" action = self.rand_generator.choice(self.actions)\n",
" else:\n",
" values = self.q_values[state]\n",
" action = self.argmax(values)\n",
"\n",
" return action"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "cfc05c6dac5be58f8070c05bcab23dc4",
"grade": false,
"grade_id": "cell-ff89fce4c62dd24b",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Now implement the rest of the agent-related methods, namely `agent_start`, `agent_step`, and `agent_end`. Again, these will be very similar to the ones in the `DynaQAgent`, but you will have to think of a way to update the counts since the last visit."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9ea6edbc6526bfb8d57d8d6a03514ba1",
"grade": false,
"grade_id": "cell-675ebe1d175f5730",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"%%add_to DynaQPlusAgent\n",
"\n",
"# -----------\n",
"# Graded Cell\n",
"# -----------\n",
" \n",
"def agent_start(self, state):\n",
" \"\"\"The first method called when the experiment starts, called after\n",
" the environment starts.\n",
" Args:\n",
" state (Numpy array): the state from the\n",
" environment's env_start function.\n",
" Returns:\n",
" (int) The first action the agent takes.\n",
" \"\"\"\n",
" \n",
" # given the state, select the action using self.choose_action_egreedy(), \n",
" # and save current state and action (~2 lines)\n",
" ### self.past_state = ?\n",
" ### self.past_action = ?\n",
" # Note that the last-visit counts are not updated here.\n",
" \n",
" # ----------------\n",
" # your code here\n",
" self.past_state = state\n",
" self.past_action = self.choose_action_egreedy(state)\n",
" # ----------------\n",
" \n",
" return self.past_action\n",
"\n",
"def agent_step(self, reward, state):\n",
" \"\"\"A step taken by the agent.\n",
" Args:\n",
" reward (float): the reward received for taking the last action taken\n",
" state (Numpy array): the state from the\n",
" environment's step based on where the agent ended up after the\n",
" last step\n",
" Returns:\n",
" (int) The action the agent is taking.\n",
" \"\"\" \n",
" \n",
" # Update the last-visited counts (~2 lines)\n",
" # - Direct-RL step (1~3 lines)\n",
" # - Model Update step (~1 line)\n",
" # - `planning_step` (~1 line)\n",
" # - Action Selection step (~1 line)\n",
" # Save the current state and action before returning the action to be performed. (~2 lines)\n",
" \n",
" # ----------------\n",
" # your code here\n",
" self.tau += 1\n",
" self.tau[self.past_state, self.past_action] = 0\n",
" action = self.choose_action_egreedy(state)\n",
" self.q_values[self.past_state][self.past_action] += self.step_size*(reward + self.gamma*np.max(self.q_values[state]) - self.q_values[self.past_state][self.past_action])\n",
" self.update_model(self.past_state, self.past_action, state, reward)\n",
" self.planning_step()\n",
" self.past_state = state\n",
" self.past_action = action\n",
" \n",
" # ----------------\n",
" \n",
" return self.past_action\n",
"\n",
"def agent_end(self, reward):\n",
" \"\"\"Called when the agent terminates.\n",
" Args:\n",
" reward (float): the reward the agent received for entering the\n",
" terminal state.\n",
" \"\"\"\n",
" # Again, add the same components you added in agent_step to augment Dyna-Q into Dyna-Q+\n",
" \n",
" # ----------------\n",
" # your code here\n",
" self.tau += 1\n",
" self.tau[self.past_state, self.past_action] = 0\n",
" self.q_values[self.past_state, self.past_action] += self.step_size*(reward + 0 - self.q_values[self.past_state, self.past_action])\n",
" self.update_model(self.past_state, self.past_action, -1, reward)\n",
" self.planning_step()\n",
" # ----------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "375c9af20c23fbafe952776276d580dd",
"grade": false,
"grade_id": "cell-05300ec8845616b2",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Test `agent_start()`, `agent_step()`, and `agent_end()`"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "44a3a0b6fcb2e7f37c933bd18ff378f8",
"grade": true,
"grade_id": "cell-9cf838836ad39efb",
"locked": true,
"points": 15,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# -----------\n",
"# Tested Cell\n",
"# -----------\n",
"# The contents of the cell will be tested by the autograder.\n",
"# If they do not pass here, they will not pass there.\n",
"\n",
"agent_info = {\"num_actions\": 4, \n",
" \"num_states\": 3, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\": 0.1, \n",
" \"discount\": 1.0,\n",
" \"kappa\": 0.001,\n",
" \"random_seed\": 0,\n",
" \"planning_steps\": 4,\n",
" \"planning_random_seed\": 0}\n",
"\n",
"agent = DynaQPlusAgent()\n",
"agent.agent_init(agent_info)\n",
"\n",
"action = agent.agent_start(0) # state\n",
"assert action == 1\n",
"\n",
"assert np.allclose(agent.tau, 0)\n",
"assert np.allclose(agent.q_values, 0)\n",
"assert agent.model == {}\n",
"\n",
"# ---------------\n",
"# test agent step\n",
"# ---------------\n",
"\n",
"action = agent.agent_step(1, 2)\n",
"assert action == 3\n",
"\n",
"action = agent.agent_step(0, 1)\n",
"assert action == 1\n",
"\n",
"expected_tau = np.array([\n",
" [2, 1, 2, 2],\n",
" [2, 2, 2, 2],\n",
" [2, 2, 2, 0],\n",
"])\n",
"assert np.all(agent.tau == expected_tau)\n",
"\n",
"expected_values = np.array([\n",
" [0.0191, 0.271, 0.0, 0.0191],\n",
" [0, 0, 0, 0],\n",
" [0, 0.000183847763, 0.000424264069, 0],\n",
"])\n",
"assert np.allclose(agent.q_values, expected_values)\n",
"\n",
"expected_model = {\n",
" 0: {\n",
" 1: (2, 1), \n",
" 0: (0, 0), \n",
" 2: (0, 0), \n",
" 3: (0, 0),\n",
" }, \n",
" 2: {\n",
" 3: (1, 0), \n",
" 0: (2, 0), \n",
" 1: (2, 0), \n",
" 2: (2, 0),\n",
" },\n",
"}\n",
"assert agent.model == expected_model\n",
"\n",
"# --------------\n",
"# test agent end\n",
"# --------------\n",
"agent.agent_end(1)\n",
"\n",
"expected_tau = np.array([\n",
" [3, 2, 3, 3],\n",
" [3, 0, 3, 3],\n",
" [3, 3, 3, 1],\n",
"])\n",
"assert np.all(agent.tau == expected_tau)\n",
"\n",
"expected_values = np.array([\n",
" [0.0191, 0.344083848, 0, 0.0444632051],\n",
" [0.0191732051, 0.19, 0, 0],\n",
" [0, 0.000183847763, 0.000424264069, 0],\n",
"])\n",
"assert np.allclose(agent.q_values, expected_values)\n",
"\n",
"expected_model = {0: {1: (2, 1), 0: (0, 0), 2: (0, 0), 3: (0, 0)}, 2: {3: (1, 0), 0: (2, 0), 1: (2, 0), 2: (2, 0)}, 1: {1: (-1, 1), 0: (1, 0), 2: (1, 0), 3: (1, 0)}}\n",
"assert agent.model == expected_model"
]
},
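{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `expected_tau` arrays in the test above follow from the bookkeeping rule alone: on every step all counts grow by one, and the pair that was just tried is reset to zero. The block below is a minimal sketch using plain Python lists (the agent itself keeps a NumPy array and uses `self.tau += 1`); the list of pairs is read off the actions asserted in the test, and the names `tau` and `snapshots` are illustrative only.\n",
"\n",
"```python\n",
"num_states, num_actions = 3, 4   # as in agent_info\n",
"tau = [[0] * num_actions for _ in range(num_states)]\n",
"snapshots = []\n",
"\n",
"# (state, action) pairs processed by the two agent_step calls and by agent_end\n",
"for state, action in [(0, 1), (2, 3), (1, 1)]:\n",
"    tau = [[count + 1 for count in row] for row in tau]  # every pair ages by one step\n",
"    tau[state][action] = 0                               # the pair just tried is reset\n",
"    snapshots.append([row[:] for row in tau])\n",
"\n",
"for table in snapshots:\n",
"    print(table)\n",
"```\n",
"\n",
"The second snapshot matches the first `expected_tau` (after two calls to `agent_step`), and the third matches the second `expected_tau` (after `agent_end`). Note that `agent_start` leaves the counts untouched, which is why the test asserts that `tau` is all zeros right after it."
]
},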
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "79c71f3b2858306fde14049a0383667f",
"grade": false,
"grade_id": "cell-0e614343c0d86b2d",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"### Experiment: Dyna-Q+ agent in the _changing_ environment\n",
"\n",
"Okay, now we're ready to test our Dyna-Q+ agent on the Shortcut Maze. As usual, we will average the results over 30 independent runs of the experiment."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "7b694d2c1d02154058ad127123594b44",
"grade": false,
"grade_id": "cell-22a658123d08fafa",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Planning steps : 50\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 30/30 [03:58<00:00, 7.95s/it]\n"
]
}
],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"# Experiment parameters\n",
"experiment_parameters = {\n",
" \"num_runs\" : 30, # The number of times we run the experiment\n",
" \"num_max_steps\" : 6000, # The number of steps per experiment\n",
"}\n",
"\n",
"# Environment parameters\n",
"environment_parameters = { \n",
" \"discount\": 0.95,\n",
" \"change_at_n\": 3000\n",
"}\n",
"\n",
"# Agent parameters\n",
"agent_parameters = { \n",
" \"num_states\" : 54,\n",
" \"num_actions\" : 4, \n",
" \"epsilon\": 0.1, \n",
" \"step_size\" : 0.5,\n",
" \"planning_steps\" : [50] \n",
"}\n",
"\n",
"current_env = ShortcutMazeEnvironment # The environment\n",
"current_agent = DynaQPlusAgent # The agent\n",
"\n",
"data_qplus = run_experiment_with_state_visitations(current_env, current_agent, environment_parameters, agent_parameters, experiment_parameters, \"Dyna-Q+\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "3c8507e67b844c085afe5bd111f176cc",
"grade": false,
"grade_id": "cell-5d80afb4585b0357",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Let's compare the Dyna-Q and Dyna-Q+ agents with `planning_steps=50` each."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "56f9182c13c40b6647f53e95d2a89302",
"grade": false,
"grade_id": "cell-b17bc044f6e4e020",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"def plot_cumulative_reward_comparison(data1, data2):\n",
"\n",
" cum_reward_q = data1['cum_reward_all'][2]\n",
" cum_reward_qPlus = data2['cum_reward_all'][0]\n",
"\n",
" plt.plot(np.mean(cum_reward_qPlus, axis=0), label='Dyna-Q+')\n",
" plt.plot(np.mean(cum_reward_q, axis=0), label='Dyna-Q')\n",
"\n",
" plt.axvline(x=3000, linestyle='--', color='grey', alpha=0.4)\n",
" plt.xlabel('Timesteps')\n",
" plt.ylabel('Cumulative\\nreward', rotation=0, labelpad=60)\n",
" plt.legend(loc='upper left')\n",
" plt.title('Average performance of Dyna-Q and Dyna-Q+ agents in the Shortcut Maze\\n')\n",
" plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "74b2b53a88c98b3a41f4ccdf24c585bf",
"grade": false,
"grade_id": "cell-bff6a7315a81ba36",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAswAAAFwCAYAAACsMS2JAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOzdd3gVVfrA8e8bQoeE3qLSV0QBRaq6GOxYEFkQCyUIIijuuuy6iuIPLGvB1ZUVRRERQRBFUWDBAksRlSIBC6JIF0LvJQRIOL8/zrlhuLlJbgqZ3PB+nidPkpkzM2famXfOnDkjxhiUUkoppZRSoUX5nQGllFJKKaUKMw2YlVJKKaWUyoIGzEoppZRSSmVBA2allFJKKaWyoAGzUkoppZRSWdCAWSmllFJKqSxowKxyRESqiMh4EdkqIkZE5vudJwUiUkZE/iMiv4tImohs9DtPKudEpI47r4b5nRelvEQk3h2bCQW4zEJ9PojIMJe/On7nRZ15vgfMIlJRRFLcQdfd7/yobL0EdAPeAHoA//Q3O8p5BHgQ+ABIAB7KKrE73wI/J0XkkIisF5FPRKS3iJQugDyfMSJyq4jMFJGdInLc3eB9JCJ/9Dtv+UVEEoL24wkR2SMiy0XkDRG53O885oWIxIjIE259DolIsoisEpHhIlLN7/wVVi6wHSYiFfzOSzhEpILLb7zfeQEQkWIi0kNEvhaR7S4+2SIi80TkKREpWQjy2KkgbyJE5GK3j+rkYJrAzYQRkS6ZpLnNk2ZYPmX3jIn2OwPA3UAJYAPQB3jP3+yobFwLfGGMecrvjKjTXAv8ZIx5OAfTfI+9AQIoA5wHXAeMBR4XkT8ZY37I32yeWSJSDHgHezO3ChgBbANqAz2Br0TkGWPME/7lMt/9B/gOWwESC1wEdAbuE5FJQG9jzHEf85djIvIH4AvsfpsKvA2cANpgbwZ7i8jNxpgl/uWy0IoHhgLjgP35ON+vgNLY/ZCfKmDzCzA/n+edG5OA24FvsOXjPmzZ2Ap4DHu+HfMtd1YnoBcwrICWdzF2H80HNuZw2hSgN/BRiHH3uPGl8pC3AlMYAuY+wDxgGvCKiNQ3xqzzIyMiIkBZY8xhP5ZfWIlIeWPMIfdvDWDvGV6GyrkawO85nCbJGBN8gzpERLoCE4HPRORCY8y+fMlhwRiGDZbHA32MMamBESLyAracGSIia40x7/qTxXy30Bhz2sVIRB7C3vjcBRwEBviRsdwQkTLADCAOuMUYM9MzerSIvA7MAaaLSBNjzM4s5lUHWxnT2xgz7oxl+ixgjDmJDW6KLBG5FBssf2KM6RxifHXgQIFn7NTyI/E6+Qlwu4jUMsZsDQwUkRrADcCH2HKq8DPG+PYDNAcMtuanMvau7RnP+GJAErA8k+nvc9N38gwrib0L/Bl7cu/HFr6XBE0b76ZNAB7A1kYdA4a58a2wd+i/AcnAIewd522Z5OVKYBFwFNiOrdm60C1jWFBawV7AEj3znge0D3O7Jbj5XoMNEDa5vP8I3JHJNC2wB+5ul3Y18DgQHZRuPvYOsh72jnBvYB3c7+CfBM+0fYHlbhscAL4ErgiRF+O27dXA18BhYL4bt9HloRn2ongY2An8C3uDV8r9neT271fABUHzLw88AyzxrO9a4HmgTBbHQW933Bxz2/QfmWzLS4ApwA6XdjPwPlA/KN01bhvsd3n9Eeifg/MjGtvUYpWbfo/bh01CHAvBP8OymbcB/pvF+GdcmiGedTZ4zs+g9LOwgVlZ9/84lz4WGOX2YQr2HGodNG0U9lj8CnvuHMcG/6OAyjnYXtXcsbcJKJVFmsPu+CkRxjzvd/swyeVrG/YpWJ0sjuu2wALgiDv+xgDlQqS/wm2Po+5YGomtHc52/wXt+y6ZjC8BrANSA/nF1o4ZoGGI9DVd2rdzs05AI+B17Dl0CFu2JQL3hrsP3XwedMt9IZv9YoAXs5
lXHYLKqRzmpRa2lvF7bE1jCvZ8fAQolsnyPnbnwgHsDVpdXLkWIn1YZQSnysVGwEy3fQ9gy+gannTjyKI8ACoB/3bHRaBMSQQeDmNbxAdvS3JRfmYyz+CfjUH7bxhwM/ZJSgr2PHyRoOuXm6YhMMGlOe623Yu4simb/NzhljcwzONjmEt/PvAssMWt/w/AjSHSZ1umh1jvbm4fHXX7d34m28y7X2pgz/X1Lj87gdnAtcHHVHb7mcyv/ePC3DY3uv3wSND4f7jhHfAco0HneLZlL5kf8wYwQWlrYq8rv7t5bgVGA9XC2t+5KUTy6wd4DXvxClxkp2KDjyhPmuFuxS8KMf03wC6guPu/ODbwPIYt0AcAj2ILh2SgRYiD4nu3zP8D+gEd3PjngMXYwOFeN59f3DR3BeXjCrfM7djHFn9xeVuWyYHwHpCGbW86EPgbNtBMBTqGsd0S3HwTgV9d3h5xf2e4OLgD9hi2MBuMvdEY5/IwJSjtfOwFcTMw2W3DoUBToLub/1fu7+5APTfdC27cEuCvbntuwT6+uzFoGQZYiS30X3bb917PSbwWe4K/CvR3x4Vxy/gv8D/sRfVJ7AV8ddAx08jti9fcvngAexd7EtucJFThsBhbEzXE7ZPFmezrm9223Is9Nvu5df0GuNWTrp9b3rfAw9iT/xPCuMh75vGBS/+lW99/Yi+sh3E3gNgbm+7Y8+AXz35pms28DVkHzHVcmkWeYcvcPi0WlDYOe+yODlGILcbesA50x9EB7EWivCdtKbdeb2PPhf7u7+PAT4QR2Lr59CaLoN6TbqJL1y6Mea7HPqJ9xB2nI9wxt5WgYJ5T5cke7E3dfdgbKePdNi5ta+wFcxf2wvKQ21bLyaeA2aUZ5tLc5/4P3MQ/FyLto27cZblcp/7Y8/oF9/ffOHUeDQ5nH7r5LHDTNMgiTRl3fKzPZl6B4zgh3OUHTX8Dtlx4GVuOPIRtKmKAN4PSVsaWm8exwcoAd+xsdPt5flD6sMsIN4812HJtlNu+o9z0X3rSteVUefkQQeUBtuw8gb05uxdbPo4CZoaxLeKDtyU5LD9DzLO6y6dx+Q7kt1PQ/luKvSY85bbr5274Y0HzuxRbxmzCHvv3unU95rZz8Wzy09LNdwFQMYxtMsyz/gvduvwDWz4cJ2Nwl22ZHrTe32OvNf9069IN2/zuKze+OxmvxXWwQWYq9inT/e74+hjPTSjhB8xNgTfdsH96ltc2zG0TqKz7NWj8L26ftyB0nBRW2Ys95rsH/fzFbf8dnnTnue2yC1t5di+2rDqIPbdis93fuSlE8uMHe5Hci+cuBbjVbbgOnmGBAn540PT13fD/eIb91Q27PihtDPaOYr5nWOCg2EuIuwtC3I1iC+nVwKqg4UuxF796nmHFsUHUaQcCcJsb1i9oHtHYgGQDINlsuwQ3j03enYytzdvk1qm0Zztvx55gwbXJge0V7xk2n6xrEg1Bd5bYu+uT2NriEp7htbCFwUY8QRan7v6uCTH/jW5c16DhiW4Z07zbB/hz8D7H1qxlKBiBp13aViGOg61AhaB9vYvTA8bAsJ1AXIj5R7nfNd3xMClEmhHYG5X6weOC0l3r8vVB0Po2xRaEC0NstwyFXxbzN2QRMLs0B4E9nv/7uemCb4AeD7Fdx7lhrwel7YongHPDJHC8BqXt49LeHuY6veTSd84m3d8IsxaJ0OXA1W76fwQNN+4YbRM0fCY2SCnnGfYttkD/Q9Bxu5T8DZg7uzQvBS17KxnLg9/IWLblZJ1CbasobJlygGyCFc80e4CDYaT7yeUvQ+29J00d8hYwlyZEeYytwUwDanqGBSp37g5KGxg+3zMsR2UEp8rF24PSvuaGN/IMG+aG1QlKGxvqnMzBtogP3pbkoPwMYx9lOOY944541wdbZqwEtgWl/wFbcVQ+aHjgupvtcQBM9yxzNrbS7BaCnk4Gbev/cno5HQi8n/MMC7tM96z3CYKeoLrx4wiqPfWMm0
WIOMiN81YsbSSMgNkNSyAoVghjOwa2TQu3/QzuZhy4zP1/M5kHzGGXvUFpSmDjnaN4gnps7LATOCcofQu3/TMcf8E/fvaS0RmoCHjbEc7ErtA9gQHGmJ+xwdLdIuLNb0/32zt9d+zJkui6P6siIlWwG3A2cEWIt//HmxBt4IwxRwJ/uy67KmMLgbnABSIS48ZVx54c04wx6z3Tn8AWfMG6Y2tWPw3KYwVsTVwd7COlcIwyxqS3p3J/v4HdrvFu8LXYu/h3gApBy5zl0lwXYt7/CjMPYG90BHtTk/5ykbHtlcZhX9y5JGiaH4wxczKZX5IxZkrQsK/dMl417ih3Frrf6dvMGHPcbX9EJNr1xFIF28QDbO1esHeMMfs980jG1hp498X1QBVs8JEUPANj2/gBdME2DXrbu71dHmZgg4irM1n3gNvc739619cY8yO2cL5CRKpmM4+8Ooi92QyYhK0J6RMY4Nr998a+cLg0xDz+HfT/XPfbu7+MMeaom18x99Z8FU/aUPsrlEBes2tjGBhfPrsZBsoBEYkSkViXrx/cPELla5ExZnHQsLnYG+I6bl7VsLUi04wxv3mWdZyM2yuvDrrf3v04GhuwdQgMEJF22H3ydoh5ZLtOkKHMLOXKzErY2rQY7JOfcMQQXjvRDPsxxPlW0Y0qFzxORIpntwBjzNHA+SciJUSkkpvvF9jzuIUn+S3Yx8bvB80mVFmamzJiqzHmw6BhgXOkQXbrgg0gjgGtc9LbQZjCKT/z4lNjzEbP/A32aXINESkHICJNsMHnJKBk0Db9GhsAh7rWBfsTtvZ3JfY6+jg2iN4uIn/LZJoRQeX0d9jrvHf9c1OmzzTG/BJGngEQkUrYpyKfG2O+CB7vuUYVtFnYirve7v/e2HPls8wmyEXZG/A29ql/gjFmkZtHLDY4nw6kBB0bG7FPtbM9Nvx86a8P9g50i4h4T/bZQFcRqWKM2e2GjccGn4H2XmADz5+NMYmeaS/A1gjsymK5VbCPzQJ+C5XIXdSewQaDobowqoC9GNV1/68OkSbUsAuwBfyOLPJYPbN8BQl1Iq1yv+t5lgf20UxWy/Pa5S38whDYBj+HGLfSk59lnuFZrd+GEMP2ZTIuMLyyd6CI3I99bHkhGbtPrEhG60MM2xM030DhtyJEWq/ANs/shgAybvNgdbE1e6H28UrscVmXrI/1vIrhVMCFMeawiLwPJIhINXejGY992pNZN3anbVdjzB4bY2fYX7dja34vwT6d8aroSVcV+25DQJoxJrANAnmNDWO9IOtzMLC8q7BNblqT8U3unBxHcGqdA+fmryHSrgoxLC8C63rQM+wD4BVsGTzDDeuDrfEeH2Ie4awTLnAZhn1p6twQ01R06UpgA2mvw+bUy9bBN2qZicGeI7s9wzI7H151P17tyaZXBhGJxjZV6YkNSiUoifcYqAssDQ5KjDE7RSS4PM1NGRHWfsiMMea4exl0BLBBRFZhA+5PjTH/y276bIRTfp6J+eOWcZhT2/RJ9xNKduVuoLJrJDDSVbBdim3W+CDwLxHZaowJvikKlb+9nL7+uSnTw4kDvALHaHbXqAJljEkTkQlAfxF5DNu0ZJQbHnKaXJS9iMhQbGz4f8aYDzyjzsfGAX3wVPgECbUPT+NLwCwidbGFlZD5AdEdW6iDvWP8F7bQ+lJsX6r1sG1bTps19jHdoCwWH1ygJofIn2AD8ws41WXTAexjst7YNzoDQVjovZ05cXnI6q3QlVmM8zIhhgXnJ/D/w9j2UKFsDfo/wzbJRk63QXbLSMvFuPQ8iMgg7OP5L7H7L9CeLA5b4x3qyUpWywxeRqjtHipdT+xddCjZnZy52ab5xtVAlce+yOo1Gtv2qwd2G/fB1lpNCDUfY0w4+6szNohbim17thn7uLoYtq2id399h31iEbCJU7WcgfOmObZtXGaau99rs0iDiLTEHkNrsUHTBmwtncG278/pcSRBv8M5f/OqqfudfvNujDkqIu
9hu52rgT0XuwDTPTcfXuGsE9hy+mbsMfIVNmBIxQYbf+XU9roMWzvo9SSnushaCbQTkQbGmJD7yPWkcT6wKfA0ybk2KGl17DsjL3KqsiUgnC4TX+ZU/+b/xD4BPYE9hl4g998yyE0ZEe5+yJQx5g0RmQbchH1RvQswUEQ+MMbcEc48MhFO+ZkXOTmvXsKWG6HkqMcf9+Tra+BrEZmHPYb6kPEpQrblHPl/ncxqedldo7JKc6biwrHYOGQi9tqSaSVebspeEbkbW4ZMMMY8HTza/X6P01sleB3NbgX8qmHujV2BewndT+Qz2IPyFQBjzG4RmQXc5moxemLv1IK7xFoDVAXm5vHRQ1NsLw1PGWOGekeISN+gtIFC7fwQ8wk1bA3wB2CxyXv3dY2xjxi8AnfZgXytcb+PZNEEIq8C3QBe6Pk7oHFQfgpCD+xjlg7e40BEbsjjfANBxyXYJyGZCWzz3XnY5uuwTUAuwL457xXYpqFq4vNL4Dj3dumFMWaZiKwA+ojI29imVZ8aY/LS1WAPbIDc3j3KBUBEQj3Cvxv7FCnAW8jNdPPpLiJPG2My9JXqaqhvxQbaX2eTr7uwQXsHY0z6thaRsmRSwxGmwDlyQYhxoYbliqvJ7YG9mAc/nh2NfYmtJ7YyoAyhm2OEu6wK2GB5gjGmf9C4a4KS/0DGwNZbPnwMtMMeg49mssie2KZ2p10Dgs83T9ODVbk8F3sAXwUHk0FPRQM2Ag1EJCqo3KmGfSLplR9lRGayDJaMMduwL8WPEdtv+QTgThF5yTUlKGjhBHfhCGzTtDN0rQs0S4rL5fT5WaZnts3WuHHBTSBD2UvGJz1w6glYOMsLmzHmVxFZhD33vzXGhHoCH5CjsldErsCWX19z6trltRa7DiXycmwUeBtmse2QE7BtHscYYz4K/sHevV3k7jIC3sUW6t2xLw7NNp4+/Zzx2O5UQtYwu/bG4QjcLZ52RygiF3GqHRIAxpgd2KYGt4pIPU/a4tjasmDjsdv9uTzmEWCAa5sTmDYW2wxhP/YtX7AXyp3Ao659U/DySotItm05sxF4SeJhb7tAEamJvTnaRME+Ikpz+fHWYgYerebFl9jHv39z63YaOfVs6UNsreuTEuKLea49VnZfi/rU/R7smW/gGOwIfJ1JbWCeie2HOfCm92shkryFLfRfxQavY/K4yMD+Si+P3DoPCU5ojPnGGDPH8/ONZ9xObO1SbWCUCwa861UaGxyUwz6yy+6mOmQ5gO22Mtdlp8vnYmyZ8QdP/kpga2LzzK3rOOzF701jzKagPPyIrdG/B1s58TsZa2BzIrMysyZBFzBjzL6gfTjH+/4H9nj6DfhrqJtcEWmOLT+3Efr4zE9pZFynsoTeTzOwbcPvDBr+9xBp86OMyEygIua08l7suzhlvMPcE6AfQ6UvQCHzmwsrsE8n+nuvxQFi32fJchki0jCTmyGwHwuB3Debys8y/bCb9rT1cRUXnwEdQtyoeq9RYM+xRiIS5xlfEnsjHXJ55H0fPYp9mjQ4m3Rhl70iUh+7bbdge1fJ8JEmY8webDvqziLSJni8WNm+E+RHDfN12PZtWdVmfIytWu+DfQQLtvZoD/YxWAyhq9VHYO9eXnTtX+Zi28Odh32BIgXbFCQ7v2Db4/7DFTCrsbXC92FPyOZB6f+OrXH8Vmyn+gew7fhKuPHeBv4ficg72MdgzbGN/XcD52BfBGpA6Du8UHYDS0RkLPbA6u3WtW+gps4Yc0REemIPqNUu7VpsjUcjbA3hbeThC0vGmNUi8iI2yPpKRD7APnLphw1O7s7i0fyZ8BH2gvqZiEzFHi93kccvVBljkkWkj5v/ShEZg92WVbE1By9jX+TaIiIDsBf+X1zbrU0uXRNswduYLL6YZIyZLSIfYvsFrSgi/8XeDD6APY7/nJd1ceLk1OfoS3PqS3+t3Hp1zqQt+0TsI+7ubh3y2v7xI+yLNnNFZDy2DXMn7A1yTg3FBsy9gZZi21
x7v/RXG/s2dKi2usE+wQZGs0RkNLZZz7XYJ1C7s5owDIOw59w3IvIa9ib3DnJXJv9RREphywDvl/6qYmtgM2tfPppTNztP5uWpnDHmkIh8ia3dP8qppjP3YWvNwm7L6s6zjtjH6jNF5GPstkrFHps9sI/WO7oKizPpI2zTlQ+w7Y2rY28y9oRI+wK2nHlHRFph26hfAVyOPV6814E8lxFZCNSEviAiE7HlxUrssbVARD5x/+/D3vgOwO6jhSHmdca59xrWAneIyDrsuwVHjDEzspk0eD5GRHpgr/s/umvdz9hypAH2nBiMvZHMTDPgAxFZgD3mtgBlse1ob8e+yJerr9zmc5m+GNt93+siEuixZomrjR2I7QnnMxF5F9tpQmm3Dhs51ZR1pMvLHBF5Axuv9CB0M5DvsE/1HxeRitgXKDeYHH5p0xjzFba5VnZyUvZOwpYvo7A3CsHLDDyFGoCtgf7KXWdWYIPvetinjuPJ7suJJhfdy+TlB/vBB0NQR90h0q3GXkRKe4a96qY9QIhuqFyaaOyB9x12px7BPqaYCFznSRdPFt3MYAv7Kdj2xsnY2pjbyLzLnquwB3EK9oQfgT1AQ3aBgj0wF2ID+hTsgTwV6BbGNkxw870Ge7f2O7a2YiWZ9HuJvYi+x6lOwHdgT6ongEqedPNxncZnMh9DJh2WY5vYrHDrcxB7E/HHHM5jI6G7uslsu9chqEsa7KOcwdigL9CJ/nDsxSE4babHAZl03YO9aH/KqY+i/I49aesFpbsce+Lv5FQn6fOwL7eF/LBGiGP5EewNXKDv508Jce5ktt2y2Y/en8PYi+an2IAg5Pnlmf5tN90TmYwPue0y2//u2Al05r8NG8xVyupYySZ/nbA1Crs4VYN9Ek8ZkIP5JHLqgx2TsTcWGbZ3Znklky6ZsM0OvnXrvBNbW5qbD5cEflKxQdAKbG85l2UzfVlOvZtRO9x9ldk6YV+oHuOO8xTs+yT3Zrb+YaxfDLZ8WuGOz8B6rsTThVk286hDmN2JZTJ9GezN4Sa3TmuwtWRXh5ov9qWtqdjA6iCnPlyyG5gVYv5hlRGhjjc3PD6TfPwD28zlROB4wgYV/8a+y7If25xpLbbpY80wtkWGZWW2fDduHJmUASHStsJ2w3rEzW9j0P7LcD6Q+TWhtjv+N7ptGvg4y3PAudnkoxr2ZvYzN/1Rz35/k6C+wTPLQ2b7jDDL9KzW242Pwr7XtYVT5Zt3v8S5bRD4QMcO7BOkq4Pm0wsbax3Hlv//wMYyoY6pXtgy+jhhlMuebdMim3SZdSsXVtnLqS4XQ/4EzbMK9nz+jVMftvsJG681zu44FTcTdQaIyJ+wNRR3GmMm5+N8E7DdxLU3xszPr/kqFS73JKUf9kKxxe/8ZEdEumFvmpdg+yfN6/sDEc89ft0GfGeMud7v/GTHNauagr2Q/s0Y87LPWQqL2O71dmObxvTPLr1SqnDysx/mIsO1fykVNKw49k41lTw0d1CqsHFt5btja8wKfbAMYGwXQ/cAbYDpodqNnoXuxr5A86bfGQmHMSYV2x3VLOAl16ShUMnkuAo8As/qRWGlVCHnZz/MRUlJYJNrL7Ya++irG7a9zQvGmO1+Zk6p/OBeTrkE+2iuHJm8uFpYGdtuOZy2y0WaiNyCa8uNfcQ6zdcM5YCxL/Tc5Hc+svCZiGzCvgheDNt042Zs05tPs5pQKVW4acCcP05gX0q8FfuWtGAD5weMMa/7mTGl8lEX7Et1ScD9xn1FSUWcV7GfrU/EviBckC/kFnUzsC+XdsK+aLUF23PLk7qdlYps2oZZKaWUUkqpLGgbZqWUUkoppbKgAbNSSimllFJZ0IBZKaWUUkqpLGjArJRSSimlVBY0YFZKKaWUUioLGjArpZRSSimVBQ2YlVJKKaWUyoIGzEoppZRSSmVBA2allFJKKaWyoAGzUkoppZRSWdCAWSmllFJKqSxowKyUUkoppVQWNGBWSimllFIqCxowK6
WUUkoplQUNmJVSSimllMqCBsxKKaWUUkplQQNmpZRSSimlsqABs1JKKaWUUlnQgFkppZRSSqksaMCslFJKKaVUFjRgVkoppZRSKgsaMCullFJKKZWFaL8zoIqGKlWqmDp16vidDaXOKikpKQCUKlXK55wodfZJTEzcbYyp6nc+VMHQgDmCiUgXYBBwPlAW2ARMAIYbY467NBuB2kGT7jDG1AiaV2PgVaAtsB8YAzxpjEkLJy916tRh2bJluV8ZpVSOJSUlARAXF+dzTpQ6+4jIJr/zoAqOBsyRrTIwD3gRG+S2AoYBNYCBnnSTsMFwwHHvTESkIjAHWAXcCtQHXsI22RlyZrKulMorDZSVUqpgaMAcwYwxbwYNmiciMcADIvKgMca44duMMYuzmFV/oDTQ2RhzEJjt5jNMRIa7YUoppZRSZyV96a/o2QOUyOE0HYAvggLjydgg+sr8yphSKn+tW7eOdevW+Z0NpZQq8jRgLgJEpJiIlBGRK4A/A6M8tcsA94jIcRE5ICIfiUhwm+ZGwK/eAcaY34FkN04pVQilpqaSmprqdzaUUqrI0yYZRcMRoKT7ezzwsGfcNGAxsAW4ABgKLBSRJsaYAy5NRWwb6GD73DillFJKqbOWBsxFw2VAGexLf/8HjATuBzDG/MWTbqGIfAt8D/QGXvGM89ZIB0gmw+1IkX5AP4DzzjsvD9lXSimllCq8NGAuAowxy92fX4vIbuBdEXnJGJOhcaMxZqWIrAaaewbvAyqEmHUsoWueA/MaDYwGaNGiRaaBNcDBgwfZuXMnJ06cyHplVMQqXrw41apVIyYmxu+sKKWUUvlKA+aiJxA81wWyehvIG+D+SlBbZRE5F9u382ltm3Pj4MGD7Nixg7i4OEqXLo2I5HWWqpAxxnD06NH0foE1aC4Yup2VUqpg6Et/Rc/l7veGUCNF5CLsh04SPYM/A64XkfKeYd2Ao8CCvGZo586dxMXFUaZMGQ2WiygRoUyZMsTFxbFz506/s3PWqFmzJjVr1vQ7G0opVeRpDXMEE5HPsR8c+RlIwwbLfwM+MMasE5GbgO7Af4Gt2FrkIcDvwDjPrN7A9q4xVUReAOphP4Dycn70wXzixAlKly6d19moCFC6dNuPsYgAACAASURBVGltdqOUKlSMMXz6fRJNz6lA/arl/M6OilAaMEe274AEoA6QCqwHBmMDYIDNQDXsy30VsH00fw485g2EjTH7RORq7MuCM7Dtlv+NDZrzhdYsnx10PxesNWvWANCwYUOfc6JU4ZSadpIXv1zNmwvW069dPR678QK/s6QilDbJiGDGmCeMMRcZY8oZYyoYY5obY141xpxw4380xlxtjKlqjClujKlhjEkwxmwNMa9VxpirjDGljTE13bzTCn6tCq9hw4YhIogIUVFRVKxYkZYtW/L444+zfft2v7OXqW+++YabbrqJSpUqUbp0aZo1a8aIESNIS9PdG+lOnjzJyZMn/c6GUoXS/uTj9B73HW8uWM/drc/jkRv0swIq9zRgVioHYmNjWbRoEd9++y2TJ0+mc+fOTJgwgSZNmpCYmJj9DArYpEmTuPJK+7HGsWPHMmvWLDp16sTgwYPp2rWrBltKqSJpzY5D3PDKQhat28NznZvwTKeLKBalT8BU7mmTDKVyIDo6mjZt2qT/f/311zNgwADatWtHt27dWL16NcWKFfMxh6ckJSXRr18/unTpwuTJk9OHt2/fnjZt2nDjjTfyxhtvcP/994ecfuPGjdStW5cNGzZQp06dAsq1UkrlzY9b9tNz7FKKF4vi4wGX0ezcUL2mKpUzWsOsVB5VqFCB4cOHs27dOmbPnk3Lli3p3bt3hnS9evWieXPb/fX8+fMREebPn0/Xrl0pV64c9erV4/XXXz9tmkWLFtGxY0dq1apF2bJlufjii5k4cWJY+RozZgwpKSk8++yzGcZ16NCB+Ph4XnnllRBTKqVUZJqybDO3v7mIciWj+bi/Bssq/2jArF
Q+aN++PdHR0SxevJi+ffsyZcoUDh8+nD7+8OHDfPzxxxkC6XvvvZdmzZrxySefEB8fzwMPPMDSpUvTx2/atInLL7+cMWPGMGPGDP70pz/Ru3dv3n///Wzz9NVXX9G0aVPq1asXcnynTp1Ys2YNW7dmaNKuIkSFChWoUEEDAqVOpJ3k0Y9/5OGPfuSScysy9f7LOK9yGb+zpYoQbZKhVD4oWbIkVapUYceOHQwaNIhBgwYxZcqU9AD5ww8/5MSJE9x1112nTXfnnXcyZMgQAOLj45kxYwZTp06lVatWANxxxx3paY0xtGvXji1btvDWW29x5513ZpmnpKQkLrgg8zfCa9eunZ6uVq1aGGNOexEw8HdaWhqpqanpw6OjtdgoLKpXr+53FpTy3YHkEwx8fzkL1+xmQHx9/n7d+dpeWeU7vfIpXzw542dWbc1zF8+50rhWDENvuTDf52uM/XhiTEwMXbp0Ydy4cekB87hx4+jYsSOVK1c+bZrrrrsu/e/ixYvTsGFDtmzZkj5s3759DB06lGnTppGUlJQexMbFxaWn8QazkPOANtAV3IIFC2jfvn2G8Q0aNDjtf23TrJQqLLYdOEqvsUvZuDuZf952EXe3ru13llQRpQGzUvkgJSWFPXv2pNf49enTh/j4eNats18nX7hwIbNmzcowXfDj9BIlSpCSkpL+f0JCAosXL+aJJ56gcePGxMTEMGrUKKZNm5aepnjx4qfNIxC4x8XFsWnTpkzzHBgX+FLcpZdeynfffZc+ftu2bXTs2JHp06ef9jW5WrVqZbElVEFavXo1AOeff77POVGq4K1MOsB9ExI5cPQE4+5pyWX1q/idJVWEacCsfHEmanj9NG/ePFJTU2nbti0A7dq1o2HDhrz77rsYY6hVq9ZptcnhSElJYebMmYwcOZL+/funDw/uCs4b5HpdeeWVPP3002zYsIG6detmGD99+nRq166dHgCXL1+eFi1apI/fuHEjAE2aNNEaZaVUoTJv9U7+/P4KypeMZnK/NlwUF+t3llQRpwGzUnm0f/9+HnnkERo0aMA111yTPvyee+5J7/WiZ8+eOe5u7tixY6SlpVGyZMn0YYcOHWL69OmnfVHPG+R69enTh+HDhzNkyJAMPWt8+eWXzJ07l6FDh+rX+ZRSEWXq8i08/NGP/KF6ecb0akFchdJ+Z0mdBTRgVioHUlNTWbx4MWCD18TEREaNGkVycjKff/75aUFxr169GDJkCKmpqSQkJOR4WbGxsbRs2ZKnnnqKmJgYoqKieP7554mNjeXgwezbf8fFxTF69Gh69OjBwYMH6du3L7GxsSxYsIDhw4fTvHlzBg8enON8KaWUH06eNAz/YjVvLFhH23qVGdOrBWVLahijCoYeaUrlwIEDB2jbti0iQkxMDA0aNKB79+48+OCD1KhR47S0NWrUoHXr1kDu25hOmjSJfv360bNnTypXrszAgQNJTk5m5MiRYU1/1113Ubt2bZ599lkSEhLYv38/AJ07d2b8+PGn1V4rpVRhdTz1JIM+/J7//riNzs3jePa2JpQqXjg+EqXODhJ4QUipvGjRooVZtmxZyHG//PJLlt2bFVV79+4lLi6OkSNH0qdPH7+zA9ga8uuuu45169axZMmSDEF+fjhb97cfdu3aBUDVqlV9zolSZ86+I8cZMDGRxev38sgNjeh/Zb1C0ZRMRBKNMaHbxKkiRz9colQ+O3ToEEuWLGHgwIGUL18+2/6SC1J0dDRTpkwhOjqaW265heTkZL+zpPKgatWqGiyrIu2XbQe57fVvWP77fv7drRkD4usXimBZnX20SYZS+SwxMZH27dtTu3Ztxo8fT5kyhetrU5UrV07v7k5FtkCPKVFRWvehip55v+7kgUnLKVcymol9W9OyTiW/s6TOYhowK5XP4uPj0aZOqiCsWbMG0H6YVdFijGHsNxt5ZuYqLqwVw9u9WlI9ppTf2VJnOQ2YlVJKKVUonEg7ydDpPzNpye9c1a
gaL3VtRsWyJfzOllIaMCullFLKf3uPHOeBictZtH4P/drV49EbGhEVpe2VVeGgAbNSSimlfPXLtoP0fXcZuw4f46WuzfjTpef4nSWlTqMBs1JKKaV8879fdvDQ5O8pWzKaj/tfRpNz9DPXqvDRgFkppSJUlSpV/M6CUrlmjGHctxt5coZ9uW90T/3MtSq8NGBWSqkIVblyZb+zoFSuGGN4YtpK3lv8O/HnV2XEHZcQW7q439lSKlMaMCulVIRKTU0F7AdplIoUx1LTGPzxT0xdkUS/dvV45IZGFNOX+1Qhp73dKxWmYcOGISKICFFRUVSsWJGWLVvy+OOPs337dr+zl6lvvvmGm266iUqVKlG6dGmaNWvGiBEjSEtL8ztrKo/WrVunH6FREeVA8gl6jV3K1BVJDLr2DwzuoMGyigwaMEcwEekiIt+KyB4RSRGR1SIyRERKeNKIiDwmIptF5KiIfCUiF4eYV2MR+Z+IJIvIVhF5SkSKFewaFX6xsbEsWrSIb7/9lsmTJ9O5c2cmTJhAkyZNSExM9Dt7GUyaNIkrr7wSgLFjxzJr1iw6derE4MGD6dq1a/qX4pRS6kzbvDeZbqMXsXyT/cz1n69uqJ+5VhFDn+NFtsrAPOBFYD/QChgG1AAGujSPAk8ADwO/AoOAOSJykTFmO4CIVATmAKuAW4H6wEvYG6ohBbQuESE6Opo2bdqk/3/99dczYMAA2rVrR7du3Vi9ejXFihWO+4ykpCT69etHly5dmDx5cvrw9u3b06ZNG2688UbeeOMN7r//fh9zqZQ6G0xdvoUhn66kmAhjE1pyRUN9YVVFFq1hjmDGmDeNMY8bYz4xxswzxrwAvAx0dzXLpbAB83PGmJHGmDlAV8BwKqAG6A+UBjobY2YbY94AngQGiUhMwa5V5KlQoQLDhw9n3bp1zJ49m5YtW9K7d+8M6Xr16kXz5s0BmD9/PiLC/Pnz6dq1K+XKlaNevXq8/vrrp02zaNEiOnbsSK1atShbtiwXX3wxEydODCtfY8aMISUlhWeffTbDuA4dOhAfH88rr7ySizVWSqnwHE89yXOf/cKgD3/golqxzPrLHzVYVhFJA+aiZw8QaJJxGRADfBgYaYw5AswAOnim6QB8YYw56Bk2GRtEX3lGc1tEtG/fnujoaBYvXkzfvn2ZMmUKhw8fTh9/+PBhPv744wyB9L333kuzZs345JNPiI+P54EHHmDp0qXp4zdt2sTll1/OmDFjmDFjBn/605/o3bs377//frZ5+uqrr2jatCn16tULOb5Tp06sWbOGrVu35nKtlVIqc3sOH+PuMYt5c8F67m59HhP6tuLcSmX8zpZSuaJNMooA19a4JNAc+DMwyhhjRKQRkAasCZrkF6Cb5/9GwFxvAmPM7yKS7MbNOFN5LypKlixJlSpV2LFjB4MGDWLQoEFMmTIlPUD+8MMPOXHiBHfddddp0915550MGWJbvcTHxzNjxgymTp1Kq1atALjjjjvS0xpjaNeuHVu2bOGtt97izjvvzDJPSUlJXHDBBZmOr127dnq6WrVq5Xylle+qVq3qdxaUCmnD7iPcM+47tu4/yn/uvISOzbSMUZFNA+ai4Qg2YAYYj22vDFAROGyMCe4OYR9QRkRKGGOOu3T7Q8x3nxuX/z57FLb/dEZmna0aTaDD8/k+W2MMADExMXTp0oVx48alB8zjxo2jY8eOGfrNve6669L/Ll68OA0bNmTLli3pw/bt28fQoUOZNm0aSUlJ6T1bxMXFpacJdC0WkNMuxvSlm8hVqVIlv7OgVAbfrt3Nfe8lUixKmHRvay6trcepinwaMBcNlwFlsC/9/R8wEgi8yWVCpJcQ4zJLF2q4HSnSD+gHcN555+Usx0VMSkoKe/bsoXr16gD06dOH+Pj49C6/Fi5cyKxZszJMV6FChdP+L1GiBCkpKen/JyQksHjxYp544gkaN25MTEwMo0aNYtq0aelpihc/vbP/QOAeFxfHpk2bMs
1zYFzNmjVzsqqqEDl+/Dhgjxul/GaMYdLS33lyxirqVC7D271aahMMVWRowFwEGGOWuz+/FpHdwLsi8hK2hri8iBQLqmWuACQbY064//e5YcFiCV3zHFjuaGA0QIsWLTINrEM6AzW8fpo3bx6pqam0bdsWgHbt2tGwYUPeffddjDHUqlXrtNrkcKSkpDBz5kxGjhxJ//7904cHdwX33XffhZz+yiuv5Omnn2bDhg3UrVs3w/jp06dTu3ZtbY4RwTZs2ADA+eef73NO1Nnu6PE0npi2ko8St/DHhlUYccclVCqrN3Kq6NCAuegJBM91sd3IFQMaAKs9aRq5cQG/umHpRORcoGxQOhXC/v37eeSRR2jQoAHXXHNN+vB77rknvdeLnj175ri7uWPHjpGWlkbJkiXThx06dIjp06ef1oyiRYsWIafv06cPw4cPZ8iQIRl61vjyyy+ZO3cuQ4cO1SYZSqk82XEwhfsmJPL95v38+aoG/PXaP2i5ooocDZiLnsvd7w1AEnAQ25XcMwAiUga4BVcz7HwGPCwi5Y0xh9ywbsBRYEFBZDpSpKamsnjxYsAGr4mJiYwaNYrk5GQ+//zz04LiXr16MWTIEFJTU0lISMjxsmJjY2nZsiVPPfUUMTExREVF8fzzzxMbG8vBgweznT4uLo7Ro0fTo0cPDh48SN++fYmNjWXBggUMHz6c5s2bM3jw4BznSymlAlZvP0SPt5dw+Fgqb3Rvzg0XaRMvVTRpwBzBRORz7AdHfsb2hnE58DfgA2PMOpfmeeAJEdnHqQ+XRAGvemb1BrZ3jaki8gJQD/sBlJeDupo76x04cIC2bdsiIsTExNCgQQO6d+/Ogw8+SI0aNU5LW6NGDVq3bg3k/pH5pEmT6NevHz179qRy5coMHDiQ5ORkRo4cGdb0d911F7Vr1+bZZ58lISGB/fttC5vOnTszfvz402qvlVIqJ75Zu5v7JiRSvJgwpX9bLqwV63eWlDpjJPCCkIo8IvI0cBtQB0gF1gPvAG8E2ieLfS72GDAA+2XAZcCfjTErgubVGPuyYFtsu+UxwLAQPWyE1KJFC7Ns2bKQ43755Zcsuzcrqvbu3UtcXBwjR46kT58+fmcHsDXk1113HevWrWPJkiUZgvz8cLbubz+sXm1bWmkbZlXQPkrcwqMf/0jdKmUZ36cVNWNL+52lAiciicaY0G3iVJGjAbPKFxown3Lo0CFWrVrFiBEjmDNnDhs3bqRMmcLzpviePXto1aoVlSpVYsGCBfmet7Ntf/sp8MQguLcVpc4UYwyvzl3Ly7N/4/IGlXntruZUKHN2vtynAfPZRZtkKJXPEhMTad++PbVr12b8+PGFKlgGqFy5cnp3dyqyaaCsClLaScOQT3/i/aWb6XRxLV7o0pSS0Tl7mVmpSKUBs1L5LD4+Hn1yowpCoM/uUqVK+ZwTVdQdOZZK//cSWbhmN/3a1WNwh0baE4Y6q2jArJRSESrw8Rltw6zOpM17k+k1dikb9xzhmU4X0b1Nbb+zpFSB04BZKaWUUiEtXLOLgZNWYIxhYt82tK1f2e8sKeULDZhVgTDG6OO7s4A2RVGq6Jj+w1YemryChtXK80aPS6lbpazfWVLKNxowqzOuePHiHD16tNC9/Kby39GjRylevLjf2VBK5YExhlEL1jH889W0rFORd3q3olxJDRfU2S3K7wyooq9atWokJSWRnJysNZBFlDGG5ORkkpKSqFatmt/ZUUrlku0JYyXDP19Nx2a1mNCndeQHyz9/AiNbwtf/9jsnKoJF+FmgIkFMTAwAW7du5cSJEz7nRp0pxYsXp3r16un7W515NWvqZ4hV/kk5kcZDk7/n85+3c9+V9Xjk+kZERUVwU7q9G2DW32HtHChXAyrV9ztHKoJpwKwKRExMjAZSSuUzPadUftl35Dj3jl9G4u/7eOLmxtxzeZ3Ife/EGFjyBsx/Ho4fhnYPw5WPQDFtLqZyTwNmpZSKUMnJyQD6foDKk/
W7DnPfhEQ27U1m5J3NualpBD+52PgNfDEYtv0ANZpC13FQWWuWVd5pwKyUUhFq8+bNgPbDrHJvyfo93D9xOQZ4J6Ellzeo4neWcufwLpjxF1g9E2LPhZtehhb3QKTWkqtCRwNmpZRS6iyTdtIwYs5vvDpvLedVKsM7CS2pV7Wc39nKOWNg2ViY909I3gOt7oOrhkApba6k8pcGzEoppdRZ5PCxVB6avII5v+zkpiY1Gd6lKWUjsSeM7T/BlN6wZw1UbgDdP4Zal/idK1VEReAZopRSSqncOHD0BAnvLOXHLQcYctMF9LmibuS93HfsMMx9GpaOhmIl4Kon4Iq/QlQxv3OmijANmJVSSqmzwPpdh+n77jI27U3m390upmOzWn5nKWeMgR/eh9lD4chOaNoNrn0aylf3O2fqLKABs1JKRai4uDi/s6AixGc/bePhj36keDFhUt/WtK5X2e8s5UxSon2pb/tPEHcpdHsPzmvtd67UWUQDZqWUilDlykXgS1qqQKWmneTFL1bz5lfraXZuBV6/uzlxFUr7na3wHTsE856Dxa9BiXJw87+heQJE6YeKVcHSgFkppSLU4cOHAQ2cVWg7D6UwcNIKlm7YS/c25/HEzY0pGR0h7XxPnoQlo2Dhy5C8G5rcDtc9o80vlG80YFZKqQiVlJQEaD/MKqPV2w/R+52l7Es+wb+6NqPLpef4naXwbf0eZv4NkpbZ5hd3TNLmF8p3GjArpZRSRciC33bxwMTllC5RjA/va0uTc2L9zlJ4jh2G+c/BopFQuhLc9qatWdbmF6oQ0IBZKaWUKgKMMbw+fx0vfbmaRjVieDuhBTVjI6C9sjGwehZMGwhH98IFt8DNI6BshL2YqIo0DZiVUkqpCHc89SRDPv2JD5dt4ZZmtXiucxPKRcLHSPasgym9bO8XlRtC57eg4TV+50qpDCLgbFJKKaVUZvYcPsaAictZumEvA+Lr84/rzy/8HyM5mWabX3z1IkSXgvjH4PK/QPFSfudMqZA0YI5gItIV6AFcCsQCq4F/GWPe96TZCNQOmnSHMaZG0LwaA68CbYH9wBjgSWNM2hlbAaVUnpx77rl+Z0H5bP2uw/Qe9x1b9x/lxS5N6doiAo6J3xfDx33hwGb7Ul+Xd6Bi8GVKqcJFA+bINgjYAPwV2A3cCEwSkSrGmFc96SZhg+GA496ZiEhFYA6wCrgVqA+8BEQBQ85Y7pVSeVKmTBm/s6B89M3a3dw/cTknjWFyvzZcWruS31nK2pHdMPv/4IfJUK66famvaTco7LXhSqEBc6S7xRiz2/P/XBGphQ2kvQHyNmPM4izm0x8oDXQ2xhwEZotIDDBMRIa7YUqpQubgQXtqxsTE+JwTVdA+StzCIx//SP2qZXmrZwtqVy7rd5aytuwd+OwRSDsGlybANcOgdEWfM6VU+DRgjmBBwXLACmwtcU50AL4ICownAy8AVwIzcpdDpdSZtG3bNkAD5rNJ2knDK3N+49W5a/ljwyq8fndzypcq7ne2MrfjZ5jSG3avhhpN4fp/Qt12fudKqRzTzg2LnsuwTSu87hGR4yJyQEQ+EpHgxmKNgF+9A4wxvwPJbpxSSimfHT2exoD3Enl17lq6XHoOY3q1KLzBcspBmPsMvHEFHN4O7f4B987VYFlFLK1hLkJE5Gps7fI9nsHTgMXAFuACYCiwUESaGGMOuDQVsS/6BdvnximllPLR4WOpdB+zhB+27GfITRfQ54q6hbcnjJVT4fPBNlBudDPc+CLE1PI7V0rliQbMRYSI1MG+3DfNGDMuMNwY8xdPsoUi8i3wPdAbeMUzzoSabSbDA8vsB/QDOO+883KZc6WUUln5eesBHpi4nN/3JjPijkvo2KyQBp/7NsKsh2HNl/ZLfV3HQeNO+lKfKhI0YC4CRKQS8BnwO9A9q7TGmJUishpo7hm8D6gQInksoWueA/MaDYwGaNGiRaaBtVJKqdz57KdtPPzRj5QqHsXEvm1oW78Qfv
0u9Rh8+QR8NwaKl4YrH4Ur/qp9KqsiRQPmCCciZYD/AiWAm4wxR8Kc1Bvg/kpQW2URORcoS1DbZqVU4VG7tvZdW1SlnTQ8/9kvvLVwAxefW4GRd13CORULYTeCG7+2zS+2/wjn3wjXPwuV6vqdK6XynQbMEUxEooEpQEPgcmPMzjCmuQg4H3jTM/gz4GERKW+MOeSGdQOOAgvyN9dKqfxSqpTW4BVFB5JP8NAHK5i3ehe92tbm8ZsaUyK6kL2jfyIFpj8IP30I5WpA13fhwk5+50qpM0YD5sj2OvZjJX8BKolIG8+4FcA12CYa/wW2YmuRh2CbbozzpH0D+DMwVUReAOoBw4CXtQ9mpQqv/ftti6kKFUK1qFKR6JdtB7lvQiLbDhzl6VsvpEfbOn5n6XQn0yBxnG2CceIINL0DbvoXlCzvd86UOqM0YI5s17nfI0KMqwtsBqphX+6rAOwBPgce8wbCxph9roeNkdg+l/cD/8YGzUqpQmrHjh2ABsxFxZxVO3jw/RXElI5mcr+2XFq7kHVStPV7+OwfsHkJxLWA+MHQ8Bq/c6VUgdCAOYIZY+qEkezqMOe1CrgqTxlSSimVK+8t3sQT01ZyUa1Y3u7Vgmoxhai5zbFD8NW/YPEoiC4Jt/wHLukBUYWsmYhSZ5AGzEoppZRPTqSd5LlZvzL2mw1c1agar93VnNIlivmdrVMSx8HCl2D/77ZP5VtGQNkqfudKqQKnAbNSSinlg31HjnP/xOUsWr+HhMvq8PhNF1C8WCGptd2/Gf77EKydA9UuhDs/gPNv8DtXSvlGA2allFKqgG07cJQ+45axattBXuzSlK4tzvU7S9bJNPhmBHz9in2pL34wtHsYogpRrbdSPtCAWSmlIlTdutrfbSRatG4P/d9L5FhqGu8ktKR9o2p+Z8na9K3tU3nb93BuG7j1NajSwO9cKVUoaMCslFIRqkSJEn5nQeXQh99t5rFPfqJOlbK81bMFdauU9TtLcGQPfDEYfpoCpSvCbW9Cszv8zpVShYoGzEopFaH27t0LQKVKlXzOicpO2knDMzNX8c43G7m8QWVGdb+UmFLFfc5UKix/F2YOAgSa97Rf6itZzt98KVUIacCslFIRateuXYAGzIXd0eNp/H3KD8z8aRu9L6/D4zdeQLTfL/dt/Bq+eNw2v6h1CXQYDue28jdPShViGjArpZRSZ8iOgyncO34ZP245wMPXn88D7X1uE3x0P/zvSVg2FkqUh5v/DZf0hGIaDiiVFT1DlFJKqTNgzY5D9By7lANHT/BWzxZc27i6f5kxxgbJs4fC8UPQ5Ha44TntU1mpMGnArJRSSuWzQE8YJaKj+Kj/ZTSuFeNfZvZtgkm3w65foXJD6Dwd4pr7lx+lIpAGzEoppVQ+evvrDfxz5irOrVSGCfe05rzKZfzJyIkU+PZVmP8sRBW3fSpf/hAUL0Sf3VYqQmjArJRSEap+/fp+Z0F5JB9PZcgnK5m6IolrG1fnpdub+dcTRtJymNoP9qyBhtfb3i+0T2Wlck0DZqWUilDR0VqEFxab9yZz7/hlrN5xiD9f3ZC/XN2QYlFS8Bk5dghm/AVWToVy1aHbRGh0E4gPeVGqCNHSVimlItSePXsAqFy5ss85Obst3bCXfhOWcfKkYVzvVlz5h6r+ZGTlVNtV3KGtcGkCXDPMfohEKZVnGjArpVSE2r17N6ABs58+XLaZwVN/4pyKpXm7V0saVPPhox87f4Uvh8Da2VD9IujyNtS+rODzoVQRpgGzUkoplQuvz1/L8M9Xc1n9yrzRw4cv96WdgAUvwKLXALEv9f3x79qnslJngJ5VSimlVA4cS01j2PSfeX/pZm5uWpOXbm9GyehiBZcBY2DTtzDjz7BnLdSLtx8gqVSv4PKg1FkmR9/mFJHOIjJXRPaLyDER+U1EnhGRQtPzuYgME5HdOZymhJvu4qDhdUTEiMjN+ZtLpZRSkehgygkSxn7H+0s3MyC+Pv+545KCDZ
b3boAJt8G4G223cV3fhR6farCs1BkWdg2ziLwEPAS8A/wbOAg0BvoDFwK3nYkMFpASwFBgtjKzIAAAIABJREFUI/C9Z/g2oC3wqw95UkopVYis23WYe8Z9R9K+o7x8ezM6Nz+n4BZ+Mg0Wj7JNME4kw5WPQOv+UKZSweVBqbNYWAGziNwCDAL6GGPGekYtEJHRwHVnInN+M8YcAxb7nQ+llAqlYcOGfmfhrLFk/R76jl+GAO/e04rLGxTgg9UNC2HW3+2X+ur80Ta/qKL7XqmCFG6TjL8Cy4OCZQCMMWnGmM9EJN41X7jIO15E5ovIR57/x4nIMhG5SURWiUiyiMwUkUoi0kBE5onIEZemqWe6kM0jAvPLLOMiUlZERorIaresDSLymoh4v1N6yP1+xy3DuOWdtkwReVdEloZYxkAROSoi5dz/USLyqIis9TRd6ZXlFlZKqRyKiooiKipHLetULkxcsom7xiyhctkSTB94RcEFy/s2wft3wrs3w/Fk6DIWEv6rwbJSPsi2hllEigOXAS/l43LPA54ChgBlgFeB0UAd4C1gOPAcMFlELjTGmDwsqwxQDHgc2AWc6/6eAlzv0lwFzAWeAWa6YduAmkHzmgzMEpF6xpj1nuG3AzONMYfd/68Cvdw6LgeuBcaKyB5jzH/zsC5KKZVu165dAFSt6lO/v0XcyZOGF774lTcXrKfdH6ry2l2XUL4gesJIPQ5LRsHsoYCBlvfCtU9CibJnftlKqZDCaZJRGSgJ/J6Py60EtDXGrANwNckPA72MMePdMMEGr42AX3K7IGPMLmBA4H8RiQY2AF+LyHnGmN+B79zodcaYxZ60wbObDezBBsjPuzRxwBVuGCLSwC2vtzHmXTfdHBGpiW0nrQGzUipf7N27F9CA+Uw4cPQEf5/yA7NX7eDu1ufxZMcLiS5WALX56xfAtAfgwGao0QRuHgHnXHrml6uUylJOzv681PIG2xgIlp217vfcEMPi8rowEekhIitE5DBwAvjajfpDTuZjjEkFpgLdPIO7Akc4VTN9NXAS+EREogM/wP+Ai0Uk316nFpGuIjJdRJJE5LCIJIrInUFpREQeE5HNrtnIV8G9gbh0jUXkf67ZylYReSo/86qUUpEicdM+bnn1a+b9upP/u7kxz3S66MwHywe3wcd9YXxH279y13HQ/2sNlpUqJMKpYd4DHMM2o8gv+4P+Px5ieGBYqbwsSERuA8YDo4DHgL3Yphaf5HLek4F7ReQPxpjfsMHzdGPMUTe+CrYJyIFMpq8JbMnFckMZhK0t/yuwG7gRmCQiVYwxr7o0jwJPYGvwf3XTzBGRi4wx2wFEpCIwB1gF3ArUxzbBicI2m1FKqbPCJyu2MHjqT1QoXYIP7mvDpbXPcC8UJ0/Cd2/B7P+D1GPQ6j6If1R7v1CqkMk2YDbGnBCRb7DtfbMKnlLc7xJBwythg7m8ymr+WekKLDHG3B8YICJX5iEf84HtQDcRGQ+0xra3DtgLpAKXY2uag+3Mw7KD3WKM8W7buSJSCxsUvyoipbAB83PGmJEAIrII233eQE7tz/5AaaCzMeYgMNu9FDlMRIa7YUopVaT9539reHn2b1xauyKjujenWvk81ddkLykRZv0DkpZB5QbQbSJUa3Rml6mUypVwnzG9ArQI1dOD6xHiBk7Vml7gGXcucH6ec2ntxDan8M6/HLaf5KyUxtaQe90d9H/YtdnGmJPAR9ia5dux/VF/7kkyF1vDHGuMWRbi53jGueZOULAcsAKo5v6+DIgBPvRMcwSYAXTwTNMB+CIoMJ6M3XZ5ublQSqlC71hqGo998hMvz/6Nm5rU5L0+rc9ssHxkD8wZBm9dBTt/gRv/BQOXabCsVCH2/+3dd3hUVf7H8fc3CQkECBA6SOi9CUSliIjYFV1dRNFdV91dy67iqqu/tffehbUgrnVta0cFpEhHpSMCIfQekhBI7+f3x53IEEMIkGSSzOf1PPMkuXPune8cJfnk5N
xzyrQOs3Nukpk9B7xhZkOAL4F0vBvyrsebk3yRmS0CHjazTLwwXjQF4pg55wrN7EvgFjPbgjd94zYgq/QzmQb828zuBn7Em7Ywoti1c81sEzDazFbhjWavLOWaH+GN0N4CfO4fgp1zcWb2Kt4KH08Bi/GCeE+gi3PuL2V+00dnMN7UCvD++xQA8cXarOHgedjdOHj+OM65rb7/jt3wAraIVDFdu5bXeETwyszN59p3ljBvfRLXD+vIP8/sUrHzlZe+461+kbUXup4L5zwFDdtU3OuJSLko805/zrnbzGwBXlB8H2/0cTPwFfCMr9nlwETgPbwR5zvwQmV5uRFv+bmXgRTgUbyA2KuUc14DOgA34wXXab46i29Icj3e+5iOtypI+1KuOR/YhrdE3YclPP93YB3wV7yl5VLxQuwbpVzzmJnZCLw5yNf4DjUC0p1zBcWapgCRZhbuC/uN+O288qJ2jSqqXhGRQEpMy+Hqt35i9c5UnrmkL6MGVODOfXvWwje3wpb50KwnXPQadD4Dfrsak4hUQXZsSxxLVWFm7fBG0Bc45y7yHbsb+KdzrlGxtn/F+8Uj3DdHPc/X7sVi7XYAbznn7j7Ea14LXAsQExMzYMuWLeX7pkSkVAkJCQA0b948wJVUPxsT07nqzUXsSctm/Jj+nN6jgvowJx2m/AtWfAiu0FtP+aQbILTM41VSRZnZEudcbKDrkMqhf7E1gJlFA5Px1sr+g99TKUB9MwstNsrcEMh0zuX5tWtYwqUbUPLIMwDOuQl4wZvY2Fj95iVSyfbt8/55KjAfmfnrk7jx/aWYGR/8dSD9YiroD2nrvoNJN0PaTug+Es54GKJL++OliFRVCszVnJlF4m2GEg6c57upr8havBsQOwFxfse7+Z7zb3fQ3Sa+GzbrFmsnIlJtOed4fno842fG06FpPSZeGUu7JhWwe15GEky+A1Z9Cg1j4I9fQMfh5f86IlJpFJirMd+GKP8DOgNDnHPFl6xbgDd/+hK8bb+LAvZIfCPDPpOB282svnMuzXfsUrwbKmdX3DsQEakc2XkF3Pa/FXyzchcXHt+Kxy7qTd2Icv4RmJcNC8fDnKehsAAG3wTD74Zadcr3dUSk0ikwV28v4636cTMQbWYD/Z5b5pzLNrMngHvNLIUDG5eEAOP82r4KjAU+M7Mn8W6SfAB4Tmswi0h1tykpg7//dymrd6Vy+1ld+dupHbHyvtlu42z44gZI3QHth8GZj0DLPuX7GiISMArM1duZvo8vlvBce7xVTJ7AC8h3Ao3xlrk7wzmXUNTQOZfiW2FjPN4ScvuA5/FCs4hUUSEhFbxdcw0wY00C//hoOWEhxsQrY8v/5r6cNJj+oLdbX73mMPod6H6BVr8QqWG0SsZhmNkzwCjnXLtA11KVxcbGusWLFwe6DBERAAoKHc9+F8fLszbQs1UUr/5hAG2iI8vvBZyD1V/CtHth31bofyWc+SjUjiq/15AqTatkBBeNMIuISI2yPzOPWz9ezoy1e7g0tg0PXtiT2rVCy+8F9m6Eyf+C+KnQtBtc9Q20O7n8ri8iVU5AArOZhQKh5blN9LEwszrOucPtGCgiUqXs2rULgJYtWwa4kqpjS3IGV0z8kV37s3nowp5cOahd+V08LwuWvQfT7oOCXBhxHwweC6G1yu81RKRKqpQJcGb2lpktNrPfmdkveFtPn2RmF/qOZ5vZbjN7ysxq+c7pYGbOzAb7XecD37E+fscmmdl/fZ/XNbPxZhZnZplmtsnM/m1mUcXqcWZ2q5m9YGaJwM++4w3N7H0zyzCzXb6NP0REqqTU1FRSU3VfbpHl2/Zxwfj5pOfk896fTyq/sOwcrJ8OLw+Cb/8JrfrBTUtg6G0KyyJBojJHmNsBT+FtFZ2Ad1Pam3hbV98FdAQexwvx/3TObfTtNDcUb3k0fJ9n+z6uNO825yG+8wEi8dYdvhtIxNu6+m68pdfOKlbP7cAc4I8c+MXhTeBU4B/AbuCfvr
ryj/3ti4hIRfli2Q7u+HQlzepH8NbVJ9CpWf3yuXDqTvj8etg0G+q3hMs+gK7n6KY+kSBTmYG5MXC6c265L+huBt5xzv2tqIGZ5QD/NrPHnXPJwFy8cPykmXUAWuIF7KHAv4HeQCNfO5xzicANftcLAzYB88wsxjm31a+e3c65S/3a9gR+B1zmnPvId+x7vN3zNIQjIlIFOed4fto6Xpq5noEdonn5igFE1w0/9gsXFsKiiTDrccjNgEE3wvC7ILwCNjoRkSqvMtck2uGcW+77vAsQA3xsZmFFD2AmUBvo5Ws3FxhiZiHAKcBKvGXPhvqePwXYC6wuehEz+6OZLTOzdCAPmOf3mv6+Kfb1Cb6PXxUdcM6lA9OO5s2KiEjFys0v5JaPlvPSzPX87vhWvHPNSeUTlhPj4K1zYfLt0KQL/GU6nPWowrJIEKvMEeYEv8+b+D5+e4i2bXwf5wAN8QL0ULwAPR9o4RtxHgrMc7618czsIuAd4BW8aRp78UalP8cL4oeqB6AFkFbCzX/Fd88TEakSwsKCd6GjpPQcbnp/GQs3JnPbGV248bROx74ZSV62N6L846uAwcgXod+VoPWuRYJeZX639V/wea/v47XAshLabvJ9/MXXdijeaPKdzrlUM1vpOzYUeM7vvEuAH4tN8xhWhnrAm7Ncv4QVM5od+i2JiAROx44dA11CQPywMZmxHyxjf1Yez43uy8X9jzu2CzoHy9+H6fdDRqK38cg5T0GUVh8REU+ghifigB1AO+fc64dq5JxzZjYfGA10whtxxvfxGrzR47l+p9QBcopd5ooy1rTI9/ECoGgOcz3gDDSHWUQk4JxzvD53I09MXku7xnV5+5oT6d7yGDcK2b3KC8rrp0PT7nDhy9DlzMOfJyJBJSCB2TlXaGa3Ae/6lnybDOQCHfBuvBvlnMv0NZ8DPA3EOeeKpkfMBcYCmcBSv0tPw7tp8G7gR+BcYEQZa/rFzL4CXvHVtAtvJY3M0s8UEQmMHTt2ANC6desAV1LxsvMK+L9PV/Ll8p2c27sFT43qS72IY/gRlpEMU/4FP38MIWEw4n4YeAPUqlN+RYtIjRGwCXDOuY/MLBVvrvE1QAGwEfgaLzwXKRpBnlPCsR+dc3l+x1/DC903481ZngZcDvxQxrKuwpv//AKQjrcSxyJgVBnPFxGpNOnp6YEuoVIkp+dw/XtLWLQ5hX+c3pmxp3UmJOQo5ys7B1sXwqd/gdQd0GsUnHY3RHco36JFpEYx3/1yIsckNjbWLV68ONBliASVuLg4ALp27RrgSirOqh37ue7dJSSl5/Ds6L6c36fV0V8scZ238sXGWdCwLYz6DxwXW261SnAxsyXOOf0PFCSC9xZrERGpspxzfLZ0B/d8sYoGdWrxyfWD6X1cg6O7WGEBzH3WWwHDFcKJ18Gp/4LI6PItWkRqLAVmERGpUtKy87jloxVMX5PACe0aMW5Mf1o0KL4yaBnFTfGC8q7l0OYkuGAcNK25I/IiUjEUmEVEqqnw8HLYpKOK2ZCYzp/fWsT2lCzuOrcbfz65A6FHM185Iwm+uxdWvA/h9b2g3P/K8i9YRIKCArOISDXVvn37QJdQrmbF7eGmD5YRHhrCf/9yEid1aHzkFykshJ//B9/+E3LSYMjNcPItUKdR+RcsIkFDgVlERAKqoNDx4ox4xs2Mp2vz+rx+ZSxtoiOP/EK7VsLn18Ge1dCkK1z9LbToXf4Fi0jQUWAWEammtm3bBkCbNm0CXMnRS07P4fZPVjJz7R5GDTiOhy/sRZ3w0CO7SEayN6K8+ksIr+etqTzo7xAWUTFFi0jQUWAWEammMjOr775Kzjm+WrGTJyevJTE9hwdG9uBPg9thdgTzlZ2DpW/D9Achay/0/xMMvwvqt6i4wkUkKCkwi4hIpUpKz+GWj5YzNz6Jbi3qM+7yfgxoe4RLvCVvgC/+Btt+gOiOcNn70HZQxRQsIkFPgVlERCrNpBU7eXDSL6Rm5XPf+d
6o8hGtgpGbCVPvhCVvgYXCKbfDKXdAWM1bMUREqg4FZhERqXDpOfnc+8UqPl+2gz7HNeCtq3vTq/URbETiHKz9GqbeDfu2QJdz4MxHoEmniitaRMRHgVlEpJqqXfsoN/OoZCu37+OmD5axbW8mt5zehb8P70hYaEjZL7BrBXx2HSSu8Va/+NMkaH9KxRUsIlKMAnM1Z2adgNuBgUAvYK5z7tRibTYDbYudmuCca1GsXQ9gHDAI2AdMBB50zhVUSPEickzati3+z7pqKSx0TJy3kaemxNGsfgQfXTeIE9odwVzlzL3eltYLx0NEAzjtXm/1i1p1Kq5oEZESKDBXfz2Bc4EfgNIm8b2PF4aL5Po/aWaNgOnAauBCoCPwLBAC3FOO9YpIEEhMy+G2/61gzrpEzu7Zgid+35uGkWWcZ5ybAcvegxkPQ24a9B4N5zwJkUd4Y6CISDlRYK7+JjnnvgQws0+AJodot8s590Mp17keqANc7JxLBaaZWRTwgJk95TsmIlXIli1bgKo30jx7XSK3fbyctOx8Hr2oF5efGFO25eKcg/XT4etbYP82iBkM5z4NLXpVfNEiIqVQYK7mnHOF5XSpc4CpxYLxh8CTwDBgUjm9joiUk+zs7ECXcJDM3HzGzVzPK7M20LlZPf77l4F0bVG/bCdvWQBznoYNM6FeCxj9LnQ7H0KOYK6ziEgFUWAOHteY2VggC5gG3Oac2+L3fDdgpv8JzrmtZpbpe06BWUQOaUNiOte/u4T4Pelc1K81j/yuF3UjyvAjJnMvTH/A24AEYMR9cNINEH4UW2OLiFQQBebg8CXeHOftQHfgfmCumfV2zu33tWmEd6NfcSm+50RESvTl8h3c/fkqwsNCePuaExnWpenhT8pOhR9fhfkvQm469L3cWyaubuOKL1hE5AgpMAcB59zNfl/ONbMFwHLgauAF/6YlnG6HOI6ZXQtcCxATE1M+xYpItbEnLZs7PlnJrLhEBrRtxEtj+tG6YRlWsFjxIXz/mLeecofh3uoXxw2o+IJFRI6SAnMQcs6tMrM4oL/f4RSgYQnNG1DyyDPOuQnABIDY2NgSQ7WIVJzIyMBNW5i/PombP1zG/qw8bjujC9cN60h42GHmG+9e5e3St2kORB0HYz6CLmdBWW4IFBEJIAXm4OYfctfizVX+lZm1Aer6nhORKqZNmzaV/pr5BYW8NHM9L82Ip0PTurz/14F0aX6YG/vyc70b+uY8BRFRMOJ+GDwWQvUjSESqB323CkJm1gvoCrzmd3gycLuZ1XfOpfmOXYp3k+DsSi5RRKqgjYnp3PrxCpZv28fF/Vrz6EW9qRMeeugTnIMVH8Csx2HfVujxO2895fotDn2OiEgVpMBczZlZJN7GJQCtgSgzG+X7+ltgOPAH4GtgJ94o8j3AVuAtv0u9CowFPjOzJ4EOwAPAc1qDWaRq2rRpEwDt27ev0NdxzvHfH7fy6DdrCA8LYdyYfozs2+owxc2FKXdCws/QrIe3TFyPCyq0ThGRiqLAXP01A/5X7FjR1+2Bbb42L+DNUU4GpgB3+Qdh51yKmY0AxuMtIbcPeB4vNItIFZSbm3v4RsdoT2o2d3zq3dg3tHMTnh7VlxYNah/6hH1bYfaT3k59oRHeyhcnXgdhZdzlT0SkClJgruacc5vxVrIozYgyXms1cNqx1iQi1Z9zjimrdnPX5z+TmVvAgxf05I8D2xIScohvNxnJMPMhLygX5kOXc+D85yGqZeUWLiJSARSYRUTkIHtSs/nnJyuZsy6R3q0b8Pylx9OpWb2SG2elwNR7YMX7YCHQ9zIYdBM061ZyexGRakiBWUREAG9U+asVO7n781Xk5Bdw97nduWpIO2qFlrBcXPZ+mPOMt/lIQa63jfXQW6G11lMWkZpHgVlEpJqqV+8Qo75HIW53Gg9O+oUFG5Lpc1wDXrysH+2b1P1tw4wkmPc8LHkbctOgw6kw/B5oc0K51SIiUtUoMIuIVFOtW7c+5m
vk5Bfw4vR4XpuzkfDQEB4Y2YM/DmpHaPG5yjlp8NPrsOAlbxpGu6HeesoKyiISBBSYRUSC1Pdxe3jwq1/YnJzJJQOO446zu9G0fsTBjQryYcmb3lrKmcnQ8TRvK+vW/Uu+qIhIDaTALCJSTW3YsAGAjh07HtF5e1KzeWDSL3z78246Nq3L29ecyLAuTQ9ulJ8LiyZ6I8ppu6DNQLhoAnQaoa2sRSToKDCLiFRT+fn5R9S+oNDx/k9beWryWnLyCxl7Wif+NrwTtWv57dZXWABrvoKZj0JyPDTt7i0P1+VsBWURCVoKzCIiQWDBhiQe/Go1cQlpDOnUmIcv7EWHpn43DToHS96Chf/2BeVucMnb0H0khJSy/bWISBBQYBYRqcF278/miclr+GL5Tlo1qM2/L+/Pub1bYEWjxc5B3GSYfj8krYNG7eH8F+D4yyEsovSLi4gECQVmEZEaKDuvgHcXbuGlGfHkFhRyw6kduXlE54OnX8RPh+8fhZ1LoU4jOPcZiL1GI8oiIsUoMIuIVFNRUVG/OVZQ6Ph0yXZenBHPjn1ZDO3chEd+14u2jf3WVN44G75/DLb9AJGN4Zynof+VUKt2JVYvIlJ9KDCLiFRTLVu2/PXzgkLHpBU7eWlGPBuTMuhzXAOeuaQvgzo2PnDCloUw40HYuhDC63vLw510HUTUD0D1IiLVhwKziEg15pzju9UJPD9tHWt3p9GzVRSvXNGfs3v5zVPevgRmPwHx30FEFJx6Jwz8G9T+7Qi1iIj8lgKziEg19en3i5i0YhezdofSJroOL1x6PCP7tjqwS9+OJTDlLm/qRXh9GPYvGHIzhEcGtnARkWpGgVlEpBrJLyhkbnwSE+ZsZPvmDTSKrMXDF57IZSfGUCs0xGu0+2eY9QSs/RoiGsAZD0HfMVCvWWCLFxGpphSYRUSqgZz8Ar5ctpOXZsazPSWLJvUiuPqENgzv1owe3dt5jXatgIUvw8oPvRHlIf+AwWOhbuNSry0iIqVTYBYRqcKy8wp474ctTJizkT1pOfQ9rgH3nNed4d2asXnDeq9RUjzMeAjWTPKWhBt0I5x8q4KyiEg5UWAWEamCCgsdX63YydNT49ixL4vBHRvz1Kg+DOvS9MDNfPu2wdK3IeErCKsDJ/zZu6GvbpPAFi8iUsMoMIuIVCHOOWbFJfLIN6vZkJhB79YNePL3fTi5s18ITtkM0x+k4S9zvK/7X+lNv2jcMSA1i4jUdArMIiJVxMy1CYybuZ5lW/fRJroOL152POf1bklY0c18WSkw81FY9DoAzbtfAKc/oKAsIlLBFJhFRALIOcfSrft47Ns1LNmSQuuGdXj0ol5cMqAN4WG+oJy9HxZNhNlPQ34WtD0ZzngQjosNbPEiIkFCgVlEJEAWb97Lk1PWsmhzCo3rhnP/yB5ccVLbA0E5Pwd+fBXmvwiZyV5QHnYHdBgGQFxcHABdu3YN1FsQEQkKCswiIpVsxbZ9PDllLQs2JNO0fgQPjOzB6BPaEBnu+5ackw4/TYAFL3nTMFoeD5e85QXmkJCA1i4iEowUmKs5M+sE3A4MBHoBc51zpxZrY8CdwA1AE2ARMNY5t7xYux7AOGAQsA+YCDzonCuo4LchUuM551i4MZn/zNvMjLUJNK4bzt3ndufyk2KoG+H7VlyQB0vegpkPe9MwmveC370CXc8JaO0iIsFOgbn66wmcC/wAhB+izb+Ae/GC9VrgVmC6mfVyzu0GMLNGwHRgNXAh0BF4FggB7qnINyBSkznnmBufxLiZ8SzanEKjyFrcOLwT157Sgfq1a3mNslNh7Tcw9U5vRLlZT7j4deh8JhQtISciIgGjwFz9TXLOfQlgZp/gjSD/ysxq4wXmx51z433HFgKbgRs5EIavB+oAFzvnUoFpZhYFPGBmT/mOiUgZOeeYvz6Zp6auZeX2/bRqUJv7R/ZgzIkx1K4VeqDhqs9g6l2Qtgsad4ZznobeoxSURUSqEAXmas45V3iYJo
OBKOBjv3MyzGwScA4HAvM5wNRiwfhD4ElgGDCp3IoWqeHW7ErlsW/XMDc+iVYNavP4xb25qF/rA0E5LwtWfQrznofk9dCgDYx8CfpeBmERZX6d6OjoCnoHIiLiT4G55usGFADxxY6vAS4t1m6mfwPn3FYzy/Q9p8Aschh70rJ5duo6Pl6yjfoRYdxzXnf+OKgtEWG+oFyQD8vegTnPQOoOiDrOG1GOvRpCax3x6zVt2rSc34GIiJREgbnmawSkl3DjXgoQaWbhzrlcX7t9JZyf4ntORA5hf2YeL89ez9sLNlNQ6PjzkPbceFonGkb6bisoLPBGlGc8BPu3QdPucMlj0O18CD36b8OFhd4fmEK0coaISIVSYA4OroRjVsJzh2pX0nHM7FrgWoCYmJhjqU+kWsrMzefV2Rt5Y+5GMnILuLh/a8ae1pl2Tep6DZyDdVPgu3u8qRdNu8Hod72gXA4hNz7e+8OR1mEWEalYCsw1XwpQ38xCi40yNwQynXN5fu0alnB+A0oeecY5NwGYABAbG1tiqBapibLzCvjgp628MmsDe9JyOK9PS/5+aid6tIryGjgHa7+GBeNh2w8Q1Roueg16j9Y6yiIi1ZACc823FggFOgFxfse7+Z7zb9fN/0QzawPULdZOJGjl5hfy5vxNTJy3icS0HAa0bcTLV/Qntp3fzXdrv4Xp90PSOmjUDs56DAZcDeGRAatbRESOjQJzzbcASAUuAR4BMLNIYCS+0WGfycDtZlbfOZfmO3YpkAXMrrxyRaqe/IJCJq3cyQvT49mSnMnQzk146bJ+DOwQjRUt/7ZjKUz+P9j+k3cz38iXoO8YCDvU8ugiIlJdKDBXc77we67vy9ZAlJmN8n39rXMu08yeAO41sxQObFwSgrerX5FXgbHAZ2b2JNABeAB4TmswS7ByzvHVip28OD2ejUkZdG8ZxZtXncDwbs0ONNq/HWY8DCs/hNoN4dS7YPBNGlEWEalBFJirv2bA/4odK/q6Pd6m8xxgAAAeOklEQVQGJU/gBeQ7gcbAYuAM51xC0QnOuRQzGwGMx1tCbh/wPF5oFgkqzjmm/pLAuJnx/LIzlc7N6vHqH/pzZo8WhIT4RpT374C5z8Kyd72vB98Eg2+GepW31FuTJk0O30hERI6ZOad7teTYxcbGusWLFwe6DJFjUljo+G71bl6etYGV2/fTvkldbhzeiYv6tT4QlPduhDnPwooPwEKgz2g45XaIbh/Y4kWkUpnZEudcbKDrkMqhEWYRCXrZeQV8uXwHb8zbxLqEdNo1juSxi3ozOvY4wkJ9q1qk7oJZj8HSdyCkljc/edjt3o19AZKfnw9AWJi+lYuIVCR9lxWRoJWdV8DbCzbz+txNJKXn0K1FfV649HjO79PyQFDOSoH5L3pLxBXmQe9L4NQ7oXHHwBYPbNiwAdA6zCIiFU2BWUSCTnZeAe8u3MIrszewNyOXIZ0aM254sVUvMpJh/guw6A3Iy4BOp8OI+6Bl38AWLyIilU6BWUSCRl5BIZ8s2c5z09aRmJbD0M5NuO6Ujpzc2e/muexU+PE1WDgOsvdDpzPg5Fug3ZDAFS4iIgGlwCwiNV5BoWPKqt08MWUN2/Zm0S+mIePG9GNgh8YHGuVmwrL3YPYTkJkMXc6Gk2+FmJMCV7iIiFQJCswiUmPlFRTyxbIdvDJ7AxsTM+jSvB5vXnUCp3ZtemDqRWEBLHkT5j4HqTugVT+47AMFZRER+ZUCs4jUOAWFjq9X7uSlGfFsSMygW4v6jL+8H2f3bHHgZr78HPj5E5j3HCSvh+gOMOZDb2S5KExXcU2bVt6azyIiwUyBWURqjNz8Qj5bup1XZ29gc3ImXZvX59U/DOCsns0PjCg7B8vfh5mPQNpOaN4LLhgP/f8Y2OKPQnR0dKBLEBEJCgrMIlLtFQXl1+ZsZFNSBr1bN+DlK/pzdk+/nfkA4qfD1LsgKQ4atIFL34Ou50FISOCKPwa5ubkAhIeHB7gSEZ
GaTYFZRKqtwkLHF8t38Pz0dWzbm0XPVlFMvDKWEd2bHRhRBti2CKbdC1sXQp1oOO9ZGHA1hIQGrvhysGnTJkDrMIuIVDQFZhGpdvILCpm0cifPTF3Hjn1Z9GgZxStXdOes4iPKO5fDjIdgwwyIaADD74GBN0BEvcAVLyIi1Y4Cs4hUGxk5+Xy2bAdvzt/ERt/NfC+N6cfIPi0PHlFOjIPpD0LcN1Ar0ltHefBYiNScXxEROXIKzCJS5SWn5/Dm/M28vWAzaTn59GwVxStX9P/tiPLejTD/JW+ZuNBwGHAVnHYf1G18yGuLiIgcjgKziFRZ+zPzeHn2et5buIXMvALO6N6c64Z1oH9Mo4NHlNMSvDnKP38CFgIn/AWG3gZRrQJXvIiI1BgKzCJS5aRm5/H6nI28vWAzqdn5XNC3FWNHdKJTs/rFGu6EOc/A4v94N/Cd+FcY8g+IahmYwitZ8+bNA12CiEhQUGAWkSojO6+AdxZu5t/fbyA1O4+zerRg7IjO9GgVdXDDrBSYches/Ahw0O8PMORmaNI5EGUHTMOGDQNdgohIUFBgFpGAKyx0fLhoGy9MX8eetByGdm7C/53djV6tGxzccN9Wb0R5+X/BFULv0d4Nfc26BabwAMvOzgagdu3aAa5ERKRmU2AWkYBxzjFz7R6enLKWdQnp9DmuAc+O7svQzsW2fE5LgOn3w5pJkJsO3S/wpl4cNyAwhVcRW7ZsAbQOs4hIRVNgFpGA+HFjMs9+t46fNu+lQ9O6jBvTj/OLLw+Xvsfbwnr5+1CYD93OgxH3Q9MugStcRESCjgKziFSqBeuTeOa7OJZu3UeTehE8eEFPxpwYQ3iY3/bU+7fDvBdg8RtgodDr93DKP4NujrKIiFQNCswiUikWbd7L89PWsWBDMq0b1uHBC3py6QltqF3Lb3vq9ERY8CIsfgty07ygPPQ2aN4zYHWLiIgoMItIhXHOMSc+ifEz41m0OYUm9cK557zujDkxhroRft9+svbBj6/BvOchPws6nwVnPQZNOgWueBERER8FZhGpEHPWJTJ+5np+2ryXZvUjuH9kD0bHtjk4KOdmwA8vw4JxkL0fOp0OI+6Dln0DV3g10rJlcKw3LSISaArMQcDMrgLeLOGpG5xzr/raGHAncAPQBFgEjHXOLa+sOqVm2JKcwT1frGJufBKtG9bhgZE9uPyktgfPUc7LhgUvwaKJkJ4AbYfAmY9A6/6BK7waioqKOnwjERE5ZgrMweU0IMvv641+n/8LuBe4HVgL3ApMN7NezrndlVeiVFfbUzKZMGcjH/y0lfDQEO44uytXD25PnXC/OcoFed6KF1PuhLwMbyT59xOh3VDwXx1DyiQzMxOAyMjIAFciIlKzKTAHl0XOufTiB82sNl5gftw5N953bCGwGbgRuKcyi5TqJS07j3d/2MJLM+IpKHSMGtCGf5zemeZRfptpFOR5m43Mfwn2boDmvWH4XdDt3MAVXgNs27YN0DrMIiIVTYFZAAYDUcDHRQeccxlmNgk4BwVmKUF2XgGfLt3Ok5PXkpqdz7AuTXnkd71oE+032llYCBtmwPQHIGEVNGoHo/4DPS/WiLKIiFQbCszBZYOZNQY2AM85517zHe8GFADxxdqvAS6txPqkGnDOMWXVbh6bvIZte7MY0LYRd5/Xnf4xjQ5uuH4GzHgIdi2HhjFw8URvmbiQkJIvLCIiUkUpMAeHXXjzk38CQoExwKtmFumcex5oBKQ75wqKnZcCRJpZuHMut1Irlirp5+37ue+rVSzbuo/Ozerx+pWxnN692YHd+QoLYeP3MO1+SPgZoo6Dc56G48dARP3AFi8iInKUFJiDgHNuKjDV79BkM4sA7jGzF4ualXCqlfIcZnYtcC1ATExMOVUrVdG6hDSe+24dU37ZTeO64Tx+cW8uGXAcYaF+o8UrP4Y5z0BSHNRuCKc/ACdeB+G6IU1ERKo3Bebg9QkwGmiHN5Jc38xCi40yNwQynXN5JV
3AOTcBmAAQGxtbYqiW6ss5x8y1e3jvhy3MWpdIVO1a3HRaJ649pQP1a9fyGuVmwE8T4OdPfSPKreGMh2DA1VBbS55VtNatWwe6BBGRoKDALA5vGblQoBMQ5/dcN99zEkSy8wr4ZMl23lm4mXUJ6TSPiuDG4Z24clA7mtaP8Bql7oKF4+Gn16EgBxp38qZeDLgKwsIDWX5QqVevXqBLEBEJCgrMwev3QBKwBW+OcypwCfAIgJlFAiPxjSBLzZeek8/rczby3x+3kpSeQ89WUTw3ui8j+7aiVtHUi92rYP4L8PP/wEKgyznQZzR0HwkhoaW/gJS79HRvlUgFZxGRiqXAHATM7FO8G/5W4o0kX+p7jHXOFQLZZvYEcK+ZpXBg45IQYFxgqpbKsikpg7cXbOaTJdtJz8lnRLdm/HloewZ1aHzgZr7N82HW47B5LtSKhOP/ACf/A5p0DmzxQW7Hjh2A1mEWEaloCszBIQ64BmiDdyPfauBK59y7fm2ewAvIdwKNgcXAGc65hEquVSrJuoQ0Xpm1gS+W7yAsxDivd0uuGtKe49s09BoUFsL66V5Q3rEE6jSCof+EgX+Duo0DW7yIiEglUmAOAs65u4C7DtPGAY/6HlJD5RUUMmddIm8t2Mzc+CQiw0O5enB7rh/WgWZFO/Pl58Kyd7z5yYlroW4zGH4PDPq7VrwQEZGgpMAsEgSycgv4aNFW3pi/iW17s4iuG87tZ3VlzIkxRNf13aSXkw4/vAKLXof0BG/76nOeguOvgAjNkRURkeClwCxSgyWn5/Dm/M28/9NW9mbk0j+mIXef250R3ZsfuJEvYTUsexcWTYSCXIgZDCNfhC5na/tqERERFJhFaqSs3AImzt3IhDkbSfPdyHfdsI6c2D7aa+AcbFkAc5+D9dO8Y90vgJOuh3ZDAle4HJE2bdoEugQRkaCgwCxSg2Tk5PPJku28NCOe5IxcBnaI5r7ze9KjlW8TEedg6Tvww8ve/OSIKDj1Tuj3B2hwXGCLlyMWGak55SIilUGBWaQGyMot4NXZG/jvj1tISs/lxHbRvPrHrpzQzjeinJUCs56EdZMhZTM0au9Nu+g1SvOTq7HU1FQAoqK0q6KISEVSYBapxjJy8nnvhy1MmLOR5IzcX6denNCukbeGcupO+PE1b7MRgJZ94eLXvaAcEhLY4uWY7dq1C1BgFhGpaArMItVQZm4+//1hKxPmbiQxLYeTOzXhH6d3JrZdtDftYucymPc8xH0LrhA6nwUn/Bk6n6kb+URERI6QArNINVJQ6Phw0VZemhFPQmoOJ7RrxLgx/RjYoTEU5MG6qTDtPm9+cq1I7ya+E/4C0e0DXbqIiEi1pcAsUg3k5Bfw2dIdjJsRz8792cS2bcSLl/mCcnoiLHrDm3axbys0iIHhd3tBOTI60KWLiIhUewrMIlVYYaHj21W7eHLKWrbtzaJ36wbcN7InZ/VsjqXtgi/+Br98DnmZ0KwHXDwRelwIYeGBLl1ERKTGUGAWqYKcc3wft4fnp8Xz8479dG5Wj9evjGVEl8aExE2Cd9+GjbO8xr1HweCx0KK35icHmbZt2wa6BBGRoKDALFLFrNy+j4cmrWbxlhRaN6zDU6P68LuYXMJXvwnf/sfbtrpuUxj0dxhwFTTpHOiSJUBq164d6BJERIKCArNIFbFmVyovTF/H1F8SaBRZi3vPas+f6v5I2Krn4Ou5XqN2Q+HcZ6Db+VoWTti3bx8ADRs2DHAlIiI1mwKzSIDt3JfFc9PW8dnS7USGh/HoIGNUxDwifngLcvZDveZwyh3Q51Jo0inQ5UoVkpCQACgwi4hUNAVmkQDJzivg39+v57U5G6lFPs9228D5eVOotWyh16DTGd6ycJ1GaG6yiIhIACkwi1SyvIJCJq3YybPfraPO/njeaL6AwTlzCN20Fxq2hWH/B/3+AA1jAl2qiIiIoMAsUqlW70zl/s+W0GHXN0yoPZ+eEathH9DpdOj/J+h6LoTqn6
WIiEhVop/MIpUgJSOXiZ9PpkXcu7wRuoCoWhm4iMbQ4xoYdCM07hjoEkVEROQQFJhFKlBedgYLvn6bOj+/y+22msLQEAo6nQUn/AnrcrbmJssxad9eW56LiFQGBWaR8lZYgNs0l52z/0PjrZMZRi67w1qS3O8WGp/8Z0Iatgl0hVJDhIdrR0cRkcqgwCxSXlI2w9J3yVn2IRHp22nmQplXayD1B17FgOEXY5qbLOVs7969AERHRwe4EhGRmk0/wUWOVl4W7FgKG2ZC3LewZzUAqws78VnoTXQ77XIuHdKNsFBtMCIVIzExEVBgFhGpaArMIkciPRE2zID10yFuCuSmAbC1Vkf+mzeGebUGMnLEUP41sC11I/TPS0REpCbQT3T5lZn1AMYBg/AWO5sIPOicKwhoYYGUuRd2r4RNcw8aRSasDhmtBvFR7hAmbG5GdkgLrhrejvcGtaNRXc0rFRERqUkUmAUAM2sETAdWAxcCHYFngRDgngCWVrkykmDHEtg8D9bPgMQ14ArBQiBmEO60e/kltDvj4hvzXVwytcNCuXJYW24e0ZnIcP1zEhERqYn0E16KXA/UAS52zqUC08wsCnjAzJ7yHas5nIO9GyEpHrb/BHvWwrYfIDPZez4kDFr0gZNvhZhBJEZ145M1OXz80zY2JWUQXTeNa0/pwF9O7kDT+hGBfS8iIiJSoRSYpcg5wNRiwfhD4ElgGDApIFUdC+cgPQGS10PSOkj4BVK2eMf2boTc9ANto1pDzCBoPQBa9sG1HUJcch6z4hKZOWMPi7aswDk4qX00Nw7vxHl9WlK7Vmjg3psI0LGjNrwREakMCsxSpBsw0/+Ac26rmWX6nqu6gbkgH/ZvhX3bIHEtJKyCXSsgYTUU5h1oFxoBTbtC7QbQd4z3efOepNdtw5r0uuxIyWJTUgbL5+xj9a75JKblANCjZRR/O7UjF/RtTdcW9QP0JkV+KyxM38JFRCqDvttKkUZ4N/oVl+J7rtKsmj+JvIx9UFiAuQIo9B6hBZmE5aYRUpBNZMZ2IjO3UydrNxE5yYQW5v56fk6tKPbX78L+1r9jX72OZEQ0Z3tEZzbmRZORW0BOfgFpifkkbcphT1oWu1NX4tyB1+/SvB5DOzchtm00Qzs3oU10ZGW+fZEyS072phA1btw4wJWIiNRsCsziz5VwzA5xHDO7FrgWICYmptyKqDvjbtoXbim1TYJrSFxhKxLowB7Xnw2uFTtcE9YXtiYhuxGkFd9yOpU6tTKIqhNGRFgodSPCaFIvnM7N6xMTHUnv1g1oE12H4xpFaqqFVBtJSUmAArOISEVTYJYiKUDDEo43oOSRZ5xzE4AJALGxsSWG6qMRMvpN1ufmQEgoWCghoaFgYRBai8I6jSGsNhZitABamWEGhu+jgZlhQIjvuRAz6kWEUbtWCGbFg7SIiIhI6RSYpchavLnKvzKzNkBd33OVpm23AZX5ciIiIiKl0p69UmQycJaZ+d/VdimQBcwOTEkiIiIigafALEVeBXKAz8zsdN/85AeA52rcGswiIiIiR0BTMgQA51yKmY0AxuMtIbcPeB4vNItIFdS5c+dAlyAiEhQUmOVXzrnVwGmBrkNEyiYkRH8kFBGpDPpuKyJSTSUmJpKYmBjoMkREajwFZhGRamrv3r3s3bs30GWIiNR4CswiIiIiIqVQYBYRERERKYUCs4iIiIhIKRSYRURERERKYc65QNcgNYCZJQJbyvGSTYCkcrxedaf+OEB9cTD1x8HUHweoLw5W3v3R1jnXtByvJ1WYArNUSWa22DkXG+g6qgr1xwHqi4OpPw6m/jhAfXEw9YccC03JEBEREREphQKziIiIiEgpFJilqpoQ6AKqGPXHAeqLg6k/Dqb+OEB9cTD1hxw1zWEWERERESmFRphFREREREqhwCxVhpn1MLMZZpZpZjvN7CEzCw10XcfCzDqZ2WtmtsLMCsxsVgltzMzuMrNtZpZlZnPM7PgS2h22f8p6rU
Aws0vM7Csz22Fm6Wa2xMzGFGsTFH0BYGajzGyBmSWbWbaZxZnZPWYW7tcmaPrDn5m19v0/4sysnt/xoOgPM7vK996LP673axMUfVHEzMLM7F9mFm9mOWa23cyeL9YmqPpEKplzTg89Av4AGgE7genAGcD1QAbwSKBrO8b3dSGwDfgfsAaYVUKbO4Es4EbgdOBbvLVCWxxp/5TlWgHsi4XA+8Bo4DTgGcABNwVbX/jquw54FLgIGA78n6/e8cHYH8VqfR/Y7fv/o16w9Qdwle+9DwcG+j2aBVtf+NX4ru+9XAcMA/4APHak76Mm9YkelfsIeAF66OHcr9+cUoAov2N3AJn+x6rbAwjx+/wTigVmoDawH7jP71hdINH/G3hZ+qes1wpgXzQp4dj7wKZg64tS+uhRYB9gwdofwFBgL/BP/AJzMPUHBwJzvUM8HzR94avnbCAP6FFKm6DqEz0q/6EpGVJVnANMdc6l+h37EKiDN5pQLTnnCg/TZDAQBXzsd04GMAmvT4qUpX/Keq2AcM6VtMPWMqCZ7/Og6YtSJANFUzKCrj98fxYfBzzEb3dkC7r+KEWw9cU1wEzn3OpS2gRbn0glU2CWqqIbsNb/gHNuK95v/d0CUlHl6AYUAPHFjq/h4Pddlv4p67WqksFA0Q/BoOwLMws1s0gzOxkYC7zinHMEZ39cjze69+8SngvG/thgZvm++e3X+R0Ptr44CVhnZuPNLNU39/gzM2vl1ybY+kQqmQKzVBWN8P4UXVyK77maqhGQ7pwrKHY8BYj0uwGsLP1T1mtVCWY2Am+Od1E4Cta+yPA95gKzgdt9x4OqP8ysMfAwcKtzLq+EJsHUH7uAe4E/AiOBH4FXzewW3/PB1BcALfCmqRwPXAZcDQwAPjcz87UJtj6RShYW6AJE/JS0KLgd4nhNcqj3Xfy5svRPWa8VUGbWDm/+8pfOubf8ngq6vsAbZY8ETgTuA8YDf/M9F0z98Sjwo3Pu21LaBEV/OOemAlP9Dk02swjgHjN7sahZCafWuL7wMd/jQudcMoCZ7cL7BfM0YIavXTD1iVQyBWapKlKAhiUcb0DJowE1RQpQ38xCi41mNAQy/UbaytI/Zb1WQJlZNDAZ2Ip3p3uRoOsLAOfcUt+n88wsCXjbzJ4liPrDzHrizVM9xcyK3kuk72MDMysgiPrjED7BW2GmHcHXFynAxqKw7DMPyAV64AXmYOsTqWSakiFVxVqKzQ0zszZ4dyavLfGMmmEtEAp0Kna8+Dy7svRPWa8VMGYWCXyNd2Pbeb4baYoEVV8cQlF4bk9w9UdnoBbe0oMpvkfRVJ3teDcCBlN/lMYRfH2x5hDHDSi6sTrY+kQqmQKzVBWTgbPMrL7fsUvx1sGcHZiSKsUCIBW4pOiAL1SOxOuTImXpn7JeKyDMLAxvPerOwDnOuT3FmgRNX5RiiO/jJoKrP+bhrTns/3jS99y5wNMEV3+U5Pd4K4dsIfj64mugj5k18Tt2Ct4vWSt8Xwdbn0hlC/S6dnro4dyvi8nvAqbhLRJ/LZBONV/zEu/PyqN8j4XAL35fR/ra3Il3h/bfgRHAN3g/GJsfaf+U5VoB7IsJeKNjYzl4M4aBQEQw9YWvvil4aw2fA5wJPOh7Hx8eyXuoKf1RQv9cRckbl9T4/gA+xdvI5hzgfLxNOxy/3eSnxveFr74ovClcC/FC6+V4G0JNO9L3UVP6RI/KfwS8AD30KHrgzUWbifeb/i68O+ZDA13XMb6ndr4fdCU92vnaGHA33p+es/BWS+h3NP1T1msFqC82qy8Oqu9hYJXvh/U+vOkYNwG1jvQ91IT+KOE9XcVvA3NQ9AfwGBCHF9iygCXAH4+m/ureF341dsLbbS8Db8rOW0CjYO4TPSr3Yb7/MUREREREpASawywiIiIiUgoFZhERERGRUigwi4iIiIiUQoFZRERERKQUCswiIiIiIqVQYBYRERERKYUCs4jIYZiZK8PjVDPbbGbPBLreIm
Z2h5mdGug6RESqO63DLCJyGGY20O/LOnibHjyCt/tXkdVARyDZObe1Ess7JDNLAsY75x4IdC0iItVZWKALEBGp6pxzPxR9bmb1fJ9u8D/us6zyqhIRkcqiKRkiIuWk+JQMM3vLzBab2XlmttrMMs3sGzOLNrNOZva9mWX42vQpdq0QM/uXma03sxwzW2dmfyrW5mQzm2tmqb7HcjO7pKgWoDFwv/+0kSO49iwz+8TMrvW9ryxf7a2LtbvTd51sM0swsylm1qI8+1VEJNA0wiwiUrFigIeAe4BIYBwwAWgHvA48BTwOfGhmPd2BeXLjgD/5zl0KnAH8x8ySnXNfm1kU8DXwpa+NAb2Bhr7zLwK+Bz4BJvqOrS7Ltf1qHwR0BW4FagNPAl8AJwCY2ZXAXcD/Ab/gBfTTgLpH3VsiIlWQArOISMWKBgY55zYA+EaSbwf+5Jx7x3fM8OZDdwPWmFkn4Abgaufc277rTDezlsD9eEG5C9AAuNE5l+Zr813RizrnlplZPrC92JSSsly7SDNgsHNui+/cLcA8MzvbOTcFOBH4zjn3st85nx11T4mIVFGakiEiUrE2F4Vln/W+jzNLOFY03WEEUAh8bmZhRQ9gBnC8mYUCG4B04H0zu9DMGlI2Zbl2kaVFYRnAOTcf2IMXlAGWA+ea2YNmdmKxc0VEagwFZhGRirWv2Ne5JRwvOlbb97EJEArsB/L8Hm/h/WWwpXMuBTgTqAV8DCT65hh3OEw9h722X9s9JZy/x6/Nf/CmZIwGfgQSzOxhBWcRqWk0JUNEpOrZC+QDQ/BGg4vbA+CcWwicbWZ1gNOB54D3gYElnHNE1/ZpVsLzzYBdvtcvBJ4HnjezNsAVwKPADuDVUmoQEalWFJhFRKqemXijwA2cc9MO19g5lwVMMrNewJ1+T+VyYNT6aK7d38xiitaVNrMheIH5pxJq2AY8YWZXAz0OV7OISHWiwCwiUsU45+LM7FW8lTOeAhbjBd+eQBfn3F/M7DzgGrxVK7bizX++joPnRq8FzjOzKXjznePKcm2/8/cAX5vZAxxYJWOp74Y/zOw1vBHrH/CmeAwHOuOtmiEiUmMoMIuIVE1/B9YBf8Vb/i0Vb1m4N3zPrwcc8BjeqG8i3goXd/ld43bg33grcETiBdpZZbh2kYXAdOAFoKnv3GuLPf9XvKBe21fTX51zXxz92xYRqXq0NbaIiPyGmc0CkpxzowJdi4hIoGmVDBERERGRUigwi4iIiIiUQlMyRERERERKoRFmEREREZFSKDCLiIiIiJRCgVlEREREpBQKzCIiIiIipVBgFhEREREphQKziIiIiEgp/h/V0+9w8sy2TAAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 576x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"plot_cumulative_reward_comparison(dataq, data_qplus)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "74108cc11abe9d0edcfd58957ecd5cf1",
"grade": false,
"grade_id": "cell-3b4406fd8796da4e",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"What do you observe? (For reference, your graph should look like [Figure 8.5 in Chapter 8](http://www.incompleteideas.net/book/RLbook2018.pdf#page=189) of the RL textbook)\n",
"\n",
"The slope of the curve increases for the Dyna-Q+ curve shortly after the shortcut opens up after 3000 steps, which indicates that the rate of receiving the positive reward increases. This implies that the Dyna-Q+ agent finds the shorter path to the goal.\n",
"\n",
"To verify this, let us plot the state-visitations of the Dyna-Q+ agent before and after the shortcut opens up."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "02a92b5dfca164799531bfbfc51b2947",
"grade": false,
"grade_id": "cell-30b40e125c10f4a1",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x360 with 3 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# ---------------\n",
"# Discussion Cell\n",
"# ---------------\n",
"\n",
"plot_state_visitations(data_qplus, ['Dyna-Q+ : State visitations before the env changes', 'Dyna-Q+ : State visitations after the env changes'], 0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "93e6b7711fe3bbb622a649369171566d",
"grade": false,
"grade_id": "cell-c2e1a4549783e5d9",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"What do you observe?\n",
"\n",
"Before the shortcut opens up, like Dyna-Q, the Dyna-Q+ agent finds the sole, long path to the goal. But because the Dyna-Q+ agent keeps exploring, it succeeds in discovering the shortcut once it opens up, which leads to the goal faster. So the bonus reward heuristic is effective in helping the agent explore and find changes in the environment without degrading the performance. "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "316c6bb4a3a11821d48d0c4482b546b4",
"grade": false,
"grade_id": "cell-122b7fbe5a69ce76",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"## Wrapping Up\n",
"\n",
"Congratulations! You have:\n",
"\n",
"1. implemented Dyna-Q, a model-based approach to RL;\n",
"2. implemented Dyna-Q+, a variant of Dyna-Q with an exploration bonus that encourages exploration; \n",
"3. conducted scientific experiments to empirically validate the exploration/exploitation dilemma in the planning context on an environment that changes with time."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "af62c782e534d54888e892bb8588ad60",
"grade": false,
"grade_id": "cell-38d472ccebc0dd45",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Some points to ponder about:\n",
"1. At what cost does Dyna-Q+ improve over Dyna-Q?\n",
"2. In general, what is the trade-off of using model-based methods like Dyna-Q over model-free methods like Q-learning?\n"
]
}
],
"metadata": {
"coursera": {
"course_slug": "sample-based-learning-methods",
"graded_item_id": "trR7Z",
"launcher_item_id": "edrCE"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}