@mvcisback
Last active April 15, 2020 04:34
CAV2020 Artifact Evaluation for Learning from Demos
# Start from a core stack version
FROM jupyter/scipy-notebook:dc9744740e12
USER root
RUN apt-get update && apt-get install -y \
zlib1g-dev \
graphviz
USER jovyan
# Install from requirements.txt file
COPY requirements.txt /tmp/
COPY experiment.ipynb /home/jovyan/
RUN pip install wheel
RUN pip install --requirement /tmp/requirements.txt
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CAV 2020 Artifact\n",
"\n",
"- paper: **Learning Task Specifications from Demonstrations via the Principle of Maximum Causal Entropy**\n",
"- authors: Marcell Vazquez-Chanlatte, Sanjit A. Seshia\n",
"\n",
"\n",
"This notebook is designed to be run sequentially; however the majority of the notebook is setup for the final section on actually performing inference. As such, we have provided the following TOC to get an overview of the notebook.\n",
"\n",
"## TOC:\n",
"1. [Dynamical System](#1.-Create-Dynamical-System)\n",
"2. [Sensor and Visualizing the Map](#2.-Create-Sensor-+-Visualize-Map)\n",
"3. [Demonstrations](#3.-Describe-demonstrations)\n",
"4. [Concept Class](#4.-Define-Specification-Circuits)\n",
"5. [Specification Inference](#5.-Run-Maximum-Casual-Entropy-Specification-Inference)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import aiger as A\n",
"import aiger_bv as BV\n",
"import aiger_coins as C\n",
"import aiger_gridworld as GW\n",
"import aiger_ptltl as LTL\n",
"import funcy as fn\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"from bidict import bidict"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1. Create Dynamical System\n",
"\n",
"Here we create a BitVector sequential circuit, `DYN`, using `py-aiger`, the models a gridworld (line 4).\n",
"\n",
"Afterwords, lines 6-8 describe introducing a slip probability of `1/32` (modeled by a biased coin with bias `31/32`). \n",
"\n",
"**Note that states are 1-hot encoded**"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"X = BV.atom(8, 'x', signed=False)\n",
"Y = BV.atom(8, 'y', signed=False)\n",
"\n",
"DYN = GW.gridworld(8, start=(3, 5), compressed_inputs=True)\n",
"\n",
"SLIP = BV.atom(1, 'c', signed=False).repeat(2) & BV.atom(2, 'a', signed=False)\n",
"SLIP = SLIP.with_output('a').aigbv\n",
"DYN2 = C.coin((31, 32), 'c') >> C.circ2mdp(DYN << SLIP)\n",
"\n",
"def encode_state(x, y):\n",
" x, y = [BV.encode_int(8, 1 << (v - 1), signed=False) for v in (x, y)]\n",
" return {'x': tuple(x), 'y': tuple(y)}"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'x': (False, True, False, False, False, False, False, False),\n",
" 'y': (False, False, True, False, False, False, False, False)}"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"encode_state(2, 3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Create Sensor / Feature overlay\n",
"\n",
"Next, we define the mapping from concrete states to sensor values / atomic predicates.\n",
"We use simple coordinate wise bitvector masks to encode the color overlays."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def mask_test(xmask, ymask):\n",
" return ((X & xmask) !=0) & ((Y & ymask) != 0)\n",
"\n",
"\n",
"APS = { # x-axis y-axis\n",
" 'yellow': mask_test(0b1000_0001, 0b1000_0001),\n",
" 'blue': mask_test(0b0001_1000, 0b0011100),\n",
" 'brown': mask_test(0b0011_1100, 0b1000_0001),\n",
" 'red': mask_test(0b1000_0001, 0b0100_1100) \\\n",
" | mask_test(0b0100_0010, 0b1100_1100),\n",
"}\n",
"\n",
"def create_sensor(aps):\n",
" sensor = BV.aig2aigbv(A.empty())\n",
" for name, ap in APS.items():\n",
" sensor |= ap.with_output(name).aigbv\n",
" return sensor\n",
"\n",
"\n",
"SENSOR = create_sensor(APS)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualizing Overlay\n",
"\n",
"This can all seem pretty abstract, so let's visualize the way the sensor sees the board."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ff5454'>&nbsp;&nbsp;&nbsp;&nbsp;</text>&nbsp;<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import HTML as html_print\n",
"\n",
"\n",
"COLOR_ALIAS = {\n",
" 'yellow': '#ffff8c', 'brown': '#ffb081',\n",
" 'red': '#ff5454', 'blue': '#9595ff'\n",
"}\n",
"\n",
"\n",
"def tile(color='black'):\n",
" color = COLOR_ALIAS.get(color, color)\n",
" s = '&nbsp;'*4\n",
" return f\"<text style='border: solid 1px;background-color:{color}'>{s}</text>\"\n",
"\n",
"\n",
"def ap_at_state(x, y, in_ascii=False):\n",
" \"\"\"Use sensor to create colored tile.\"\"\"\n",
" state = encode_state(x, y)\n",
" obs = SENSOR(state)[0] # <---------- \n",
"\n",
" for k in COLOR_ALIAS.keys():\n",
" if obs[k][0]:\n",
" return tile(k)\n",
" return tile('white')\n",
"\n",
"def print_map():\n",
" \"\"\"Scan the board row by row and print colored tiles.\"\"\"\n",
" order = range(1, 9)\n",
" for y in order:\n",
" chars = (ap_at_state(x, y, in_ascii=True) for x in order)\n",
" display(html_print('&nbsp;'.join(chars)))\n",
" \n",
"print_map()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3. Describe demonstrations\n",
"\n",
"We now encode a collection of demonstrations from an agent attempting to:\n",
"\n",
"1. Avoid the red tiles (lava).\n",
"2. Reach the yellow tiles (recharge).\n",
"3. If the agent touches a blue tile (water), then it must dry off (brown tile) before recharging.\n",
"\n",
"**Note** Trace 5 corresponds to a very unlikely demonstration where the agent enters the water, but is unable to dry off due to wind."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"trc 0:&nbsp;&nbsp;&nbsp;→<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"trc 1:&nbsp;&nbsp;&nbsp;↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"trc 2:&nbsp;&nbsp;&nbsp;←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"trc 3:&nbsp;&nbsp;&nbsp;↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"trc 4:&nbsp;&nbsp;&nbsp;↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffb081'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"trc 5:&nbsp;&nbsp;&nbsp;↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"def print_trc(trc, idx=0):\n",
" actions, states = trc\n",
" obs = (ap_at_state(*pos, in_ascii=True) for pos in states)\n",
" display(\n",
" html_print(f'trc {idx}:&nbsp;&nbsp;&nbsp;' + ''.join(''.join(x) for x in zip(actions, obs)) + '\\n')\n",
" )\n",
"\n",
"ACTIONS0 = \"→→↑↑↑↑→→→\"\n",
"STATES0 = ((4, 5), (5, 5), (5, 4), (5, 3),(5, 2), (5, 1), (6, 1), (7, 1), (8, 1))\n",
"TRC0 = (ACTIONS0, STATES0)\n",
"print_trc(TRC0, 0)\n",
"\n",
"ACTIONS1 = \"↑↑↑↑←←←←←\"\n",
"STATES1 = ((3, 4), (3, 3), (3, 2), (3, 1), (2, 1), (1, 1), (1, 1), (1, 1), (1, 1),)\n",
"TRC1 = (ACTIONS1, STATES1)\n",
"print_trc(TRC1, 1)\n",
"\n",
"ACTIONS2 = \"←→↑↑↑←↑←←\"\n",
"STATES2 = ((2, 5), (3, 5), (3, 4), (3, 3), (3, 2), (2, 2), (2, 1), (1, 1), (1, 1))\n",
"TRC2 = (ACTIONS2, STATES2)\n",
"print_trc(TRC2, 2)\n",
"\n",
"ACTIONS3 = \"↑↑→←↑↑←←←\"\n",
"STATES3 = ((3, 4), (3, 3), (4, 3), (3, 3), (3, 2), (3, 1), (2, 1), (1, 1), (1, 1))\n",
"TRC3 = (ACTIONS3, STATES3)\n",
"print_trc(TRC3, 3)\n",
"\n",
"ACTIONS4 = \"↑→↑↑↑←←←←\"\n",
"STATES4 = ((3, 4), (4, 4), (4, 3), (4, 2), (4, 1), (3, 1), (2, 1), (1, 1), (1, 1))\n",
"TRC4 = (ACTIONS4, STATES4)\n",
"print_trc(TRC4, 4)\n",
"\n",
"ACTIONS5 = \"↑→↑↑←←←↑↑\"\n",
"STATES5 = ((3, 4), (4, 4), (4, 3), (4, 2), (3, 2), (2, 2), (1, 2), (1, 1), (1, 1))\n",
"TRC5 = (ACTIONS5, STATES5)\n",
"print_trc(TRC5, 5)\n",
"\n",
"TRACES = [TRC0, TRC1, TRC2, TRC3, TRC4] # Variety of positive demos.\n",
"TRACES += [TRC5] # Unlucky, Negative Demonstration.\n",
"TRACES += 4 * [TRC4] # Additional \"Safe\" Demonstrations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Convert \"human readable\" demos to bitvectors.\n",
"\n",
"We need to put the traces in a form that the dynamcs understands (namely Tuples of `bool`s (or `0, 1`).\n",
"\n",
"As before, the positions, `x, y` are 1-hot encoded. The actions bitvectors are simply mapped to the corresponding action in the `aiger_gridworld` module."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"([{'a': (1, 0)},\n",
" {'a': (1, 0)},\n",
" {'a': (0, 1)},\n",
" {'a': (0, 1)},\n",
" {'a': (0, 1)},\n",
" {'a': (0, 1)},\n",
" {'a': (1, 0)},\n",
" {'a': (1, 0)},\n",
" {'a': (1, 0)}],\n",
" [{'x': (False, False, False, True, False, False, False, False),\n",
" 'y': (False, False, False, False, True, False, False, False)},\n",
" {'x': (False, False, False, False, True, False, False, False),\n",
" 'y': (False, False, False, False, True, False, False, False)},\n",
" {'x': (False, False, False, False, True, False, False, False),\n",
" 'y': (False, False, False, True, False, False, False, False)},\n",
" {'x': (False, False, False, False, True, False, False, False),\n",
" 'y': (False, False, True, False, False, False, False, False)},\n",
" {'x': (False, False, False, False, True, False, False, False),\n",
" 'y': (False, True, False, False, False, False, False, False)},\n",
" {'x': (False, False, False, False, True, False, False, False),\n",
" 'y': (True, False, False, False, False, False, False, False)},\n",
" {'x': (False, False, False, False, False, True, False, False),\n",
" 'y': (True, False, False, False, False, False, False, False)},\n",
" {'x': (False, False, False, False, False, False, True, False),\n",
" 'y': (True, False, False, False, False, False, False, False)},\n",
" {'x': (False, False, False, False, False, False, False, True),\n",
" 'y': (True, False, False, False, False, False, False, False)}])"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ACTION2ARROW = bidict({\n",
" GW.NORTH_C: '↑',\n",
" GW.SOUTH_C: '↓',\n",
" GW.WEST_C: '←',\n",
" GW.EAST_C: '→',\n",
"})\n",
"\n",
"def str2actions(vals):\n",
" return [ACTION2ARROW.inv[c] for c in vals]\n",
"\n",
"def encode_trace(trc):\n",
" actions, states = trc\n",
" actions = str2actions(actions)\n",
" actions = [{'a': a} for a in actions]\n",
" states = [encode_state(*s) for s in states]\n",
" return actions, states\n",
"\n",
"encode_trace(TRC0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 4. Define Specification Circuits / Concept Class\n",
"\n",
"First, we describe the properties over colors of the map. This is done in past tense temporal logic using `py-aiger-ptltl`."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"LAVA, RECHARGE, WATER, DRY = map(LTL.atom, ['red', 'yellow', 'blue', 'brown'])\n",
"\n",
"EVENTUALLY_RECHARGE = RECHARGE.once()\n",
"AVOID_LAVA = (~LAVA).historically()\n",
"\n",
"RECHARGED_AND_ONCE_WET = RECHARGE & WATER.once()\n",
"DRIED_OFF = (~WATER).since(DRY)\n",
"\n",
"DIDNT_RECHARGE_WHILE_WET = (RECHARGED_AND_ONCE_WET).implies(DRIED_OFF)\n",
"DONT_RECHARGE_WHILE_WET = DIDNT_RECHARGE_WHILE_WET.historically()\n",
"\n",
"CONST_TRUE = LTL.atom(True)\n",
"\n",
"\n",
"SPECS = [\n",
" CONST_TRUE, AVOID_LAVA, EVENTUALLY_RECHARGE, DONT_RECHARGE_WHILE_WET,\n",
" AVOID_LAVA & EVENTUALLY_RECHARGE & DONT_RECHARGE_WHILE_WET,\n",
" AVOID_LAVA & EVENTUALLY_RECHARGE,\n",
" AVOID_LAVA & DONT_RECHARGE_WHILE_WET,\n",
" EVENTUALLY_RECHARGE & DONT_RECHARGE_WHILE_WET,\n",
"]\n",
"\n",
"SPEC_NAMES = [\n",
" \"CONST_TRUE\", \"AVOID_LAVA\", \"EVENTUALLY_RECHARGE\", \"DONT_RECHARGE_WHILE_WET\",\n",
" \"AVOID_LAVA & EVENTUALLY_RECHARGE & DONT_RECHARGE_WHILE_WET\",\n",
" \"AVOID_LAVA & EVENTUALLY_RECHARGE\",\n",
" \"AVOID_LAVA & DONT_RECHARGE_WHILE_WET\",\n",
" \"EVENTUALLY_RECHARGE & DONT_RECHARGE_WHILE_WET\",\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Convert specifications to monitor circuits\n",
"\n",
"Next we preprend the sensor circuit to make these specifications over sequences of positions. The `py-aiger` ecosystem makes this fairly painless."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"def spec2monitor(spec):\n",
" monitor = spec.aig | A.sink(['red', 'yellow', 'brown', 'blue'])\n",
" monitor = BV.aig2aigbv(monitor)\n",
" return SENSOR >> monitor\n",
" \n",
"SPEC2MONITORS = { spec: spec2monitor(spec) for spec in SPECS }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. Run Maximum Casual Entropy Specification Inference\n",
"\n",
"Finally, we use our maximum causal entropy inference algorithm which computes a BDD representation of the composition of each spec and MDP circuit.\n",
"\n",
"**Note:** Several performance bugs were fixed between this version and the submitted version, resulting in improved BDD construction times.\n",
"\n",
"Package `mce-spec-inference` package available here: https://pypi.org/project/mce-spec-inference/"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"encoding traces\n",
"done encoding traces\n",
"concretizing spec\n",
"concretizing spec\n",
"concretizing spec\n",
"concretizing spec\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1\n",
"Controller Size: 0\n",
"log_prob: -436.68272375276655\n",
"build spec: 1.0s\n",
"fit: 0.0037s\n",
"surprise: 0.018s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.0323832035064697, 'fit': 0.003675222396850586, 'surprise': 0.018192052841186523}\n",
"concretizing spec\n",
"done spec\n",
"fitting policy\n",
"done spec\n",
"done spec\n",
"fitting policy\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1628\n",
"Controller Size: 328\n",
"log_prob: -416.140363638969\n",
"build spec: 1.2s\n",
"fit: 0.75s\n",
"surprise: 0.016s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.226585865020752, 'fit': 0.7500314712524414, 'surprise': 0.01633596420288086}\n",
"concretizing spec\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1797\n",
"Controller Size: 348\n",
"log_prob: -464.52300675667044\n",
"build spec: 1.6s\n",
"fit: 0.84s\n",
"surprise: 0.017s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.5723936557769775, 'fit': 0.8362810611724854, 'surprise': 0.01737356185913086}\n",
"concretizing spec\n",
"done spec\n",
"fitting policy\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 523\n",
"Controller Size: 118\n",
"log_prob: -406.3010404427357\n",
"build spec: 1.7s\n",
"fit: 0.25s\n",
"surprise: 0.018s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.6993613243103027, 'fit': 0.24697279930114746, 'surprise': 0.018005847930908203}\n",
"concretizing spec\n",
"done spec\n",
"done fitting\n",
"compute log likelihood of demos\n",
"fitting policy\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 850\n",
"Controller Size: 195\n",
"log_prob: -468.3600156772926\n",
"build spec: 1.8s\n",
"fit: 3.1s\n",
"surprise: 0.018s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.807154893875122, 'fit': 3.123621940612793, 'surprise': 0.018232107162475586}\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 577\n",
"Controller Size: 132\n",
"log_prob: -384.2062931371331\n",
"build spec: 2.3s\n",
"fit: 2.1s\n",
"surprise: 0.018s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 2.2858481407165527, 'fit': 2.1305387020111084, 'surprise': 0.017897367477416992}\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1913\n",
"Controller Size: 374\n",
"log_prob: -434.06910537400285\n",
"build spec: 2.2s\n",
"fit: 6.7s\n",
"surprise: 0.021s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 2.1837480068206787, 'fit': 6.7344560623168945, 'surprise': 0.021210193634033203}\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1842\n",
"Controller Size: 372\n",
"log_prob: -407.4691572492924\n",
"build spec: 1.8s\n",
"fit: 6.3s\n",
"surprise: 0.02s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.7971274852752686, 'fit': 6.251629590988159, 'surprise': 0.019525766372680664}\n"
]
}
],
"source": [
"from mce.infer import spec_mle\n",
"\n",
"\n",
"def evaluate_demos(trcs, dynamics=DYN2):\n",
" demos = [encode_trace(trc) for trc in trcs]\n",
" best, spec2score = spec_mle(\n",
" dynamics, demos, SPEC2MONITORS.values(), parallel=True\n",
" )\n",
" \n",
" def normalize(score):\n",
" return score - spec2score[SPEC2MONITORS[CONST_TRUE]]\n",
"\n",
" return fn.lmap(normalize, spec2score.values())\n",
" \n",
"scores1 = evaluate_demos(TRACES)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plotting Results\n",
"\n",
"**Remark**: Based on the reviews, we have slightly changed how we present our likelihood results. Namely:\n",
"\n",
"1. Because of the sheer number of possible traces, the log likelihoods are typically **very** small.\n",
"2. As a base line, we compare against the likelihood of generating the demonstrations under uniformly random actions.\n",
"3. **Note** that this is equivilent to the likelihood of the demos under the `CONST_TRUE` specification.\n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def plot_scores(scores):\n",
" sns.barplot(y=SPEC_NAMES, x =scores, orient='h')\n",
" plt.title('Relative likelihoods for each specification')\n",
" plt.xlabel('log[P(demos | spec)] - log[P(demos | TRUE)]')\n",
" plt.show()\n",
"\n",
"plot_scores(scores1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hopefully, you can see that the most likely demonstration is the intended one, namely,\n",
"\n",
"\n",
"1. Avoid the red tiles (lava).\n",
"2. Reach the yellow tiles (recharge).\n",
"3. If the agent touches a blue tile (water), then it must dry off (brown tile) before recharging.\n",
"\n",
"This is despite there being an unlabeled negative demonstration (TRC5) which fails to recharge."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Another set of demonstrations\n",
"\n",
"Now consider the situation where TRC5 is given twice in a row, where we recall that TRC5 fails to dry off before recharding."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"trc 0:&nbsp;&nbsp;&nbsp;↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>→<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#9595ff'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>←<text style='border: solid 1px;background-color:white'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>↑<text style='border: solid 1px;background-color:#ffff8c'>&nbsp;&nbsp;&nbsp;&nbsp;</text>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"print_trc(TRC5)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"encoding traces\n",
"done encoding traces\n",
"concretizing spec\n",
"concretizing spec\n",
"concretizing spec\n",
"concretizing spec\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1\n",
"Controller Size: 0\n",
"log_prob: -87.33654475055322\n",
"build spec: 1.3s\n",
"fit: 0.0033s\n",
"surprise: 0.0052s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.2919392585754395, 'fit': 0.003260374069213867, 'surprise': 0.0052225589752197266}\n",
"concretizing spec\n",
"done spec\n",
"fitting policy\n",
"done spec\n",
"fitting policy\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1628\n",
"Controller Size: 328\n",
"log_prob: -82.49648318493705\n",
"build spec: 1.5s\n",
"fit: 0.74s\n",
"surprise: 0.0048s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.4929251670837402, 'fit': 0.7406604290008545, 'surprise': 0.0047533512115478516}\n",
"concretizing spec\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 850\n",
"Controller Size: 195\n",
"log_prob: -89.77841117137949\n",
"build spec: 2.0s\n",
"fit: 0.39s\n",
"surprise: 0.0042s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.9750003814697266, 'fit': 0.3940000534057617, 'surprise': 0.0042362213134765625}\n",
"concretizing spec\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1797\n",
"Controller Size: 348\n",
"log_prob: -92.81852146235173\n",
"build spec: 1.7s\n",
"fit: 0.81s\n",
"surprise: 0.0041s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.7426140308380127, 'fit': 0.8115346431732178, 'surprise': 0.004128456115722656}\n",
"concretizing spec\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 577\n",
"Controller Size: 132\n",
"log_prob: -87.32193989773135\n",
"build spec: 2.0s\n",
"fit: 0.27s\n",
"surprise: 0.0063s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.999612808227539, 'fit': 0.2711455821990967, 'surprise': 0.006339073181152344}\n",
"done spec\n",
"fitting policy\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 523\n",
"Controller Size: 118\n",
"log_prob: -77.1876963244796\n",
"build spec: 1.7s\n",
"fit: 0.5s\n",
"surprise: 0.0067s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.6537911891937256, 'fit': 0.49933838844299316, 'surprise': 0.006686687469482422}\n",
"done spec\n",
"fitting policy\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1842\n",
"Controller Size: 372\n",
"log_prob: -85.52927920428995\n",
"build spec: 2.0s\n",
"fit: 0.83s\n",
"surprise: 0.0097s\n",
"done fitting\n",
"compute log likelihood of demos\n",
"\n",
"----------------------------\n",
"\n",
"BDD size: 1913\n",
"\n",
"----------------------------\n",
"Controller Size: 374\n",
"\n",
"{'build spec': 2.0062806606292725, 'fit': 0.8324692249298096, 'surprise': 0.00965571403503418}\n",
"log_prob: -89.02936303916954\n",
"build spec: 1.9s\n",
"fit: 1.1s\n",
"surprise: 0.0048s\n",
"\n",
"----------------------------\n",
"\n",
"{'build spec': 1.9228413105010986, 'fit': 1.133453369140625, 'surprise': 0.004793882369995117}\n"
]
}
],
"source": [
"scores2 = evaluate_demos([TRC5, TRC5])"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plot_scores(scores2)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
attrs==19.3.0
bdd2dfa==0.4.1
blessings==1.7
dd==0.5.5
funcy==1.14
lazytree==0.3.1
multiprocess==0.70.9
py-aiger==4.0.2
py-aiger-bdd==0.3.1
py-aiger-bv==2.0.0
py-aiger-cnf==3.0.0
py-aiger-coins==1.4.0
py-aiger-gridworld==0.2.0
py-aiger-ptltl==1.2.0
py-aiger-sat==1.1.0
termplotlib==0.2.4
mce-spec-inference==0.1.0