refraction-ray/collections_demo.ipynb

## collections_demo.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Demo on collections module\n",
    "**Reference:** [python doc](https://docs.python.org/3.6/library/collections.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import collections"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "heading_collapsed": true
   },
   "source": [
    "## ChainMap"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hidden": true
   },
   "source": [
    "ChainMap is a faster way to merge some dicts, compared to `dict_old.update(dict_new)`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'Bob': False, 'John': True}, {'Alice': True}]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dict1 = {\"John\": True, \"Bob\": False}\n",
    "dict2 = {\"Alice\": True}\n",
    "cm = collections.ChainMap(dict1, dict2) # this is first than merging two dicts by updates\n",
    "cm.maps"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'Bob': False, 'John': True}, {'Alice': True, 'Lucy': False}]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dict2[\"Lucy\"]=False\n",
    "cm.maps ## the edit on the input can be reflected by chainmap automatically"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'Bob': False, 'John': True, 'Mike': False}, {'Alice': True, 'Lucy': False}]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cm[\"Mike\"]=False\n",
    "cm.maps ## all the add on cm itself is reflected on the first map"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cm['Alice'] ## the search is go through the maps list, slower than one dict! "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dict2['John']=False\n",
    "cm['John'] # for the same key, the previous one will override the latter one when query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hidden": true
   },
   "outputs": [],
   "source": [
    "## a deep search version of ChainMap, the ordinary chainmap only update the first dict in the map list\n",
    "## below is a hack subclass where update can also happen in other dicts\n",
    "\n",
    "class DeepChainMap(ChainMap):\n",
    "    'Variant of ChainMap that allows direct updates to inner scopes'\n",
    "\n",
    "    def __setitem__(self, key, value):\n",
    "        for mapping in self.maps:\n",
    "            if key in mapping:\n",
    "                mapping[key] = value\n",
    "                return\n",
    "        self.maps[0][key] = value\n",
    "\n",
    "    def __delitem__(self, key):\n",
    "        for mapping in self.maps:\n",
    "            if key in mapping:\n",
    "                del mapping[key]\n",
    "                return\n",
    "        raise KeyError(key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{},\n",
       " {'Bob': False, 'John': True, 'Mike': False},\n",
       " {'Alice': True, 'John': False, 'Lucy': False}]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cm_c = cm.new_child() ## support for nested context\n",
    "cm_c.maps "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ChainMap({'John': True, 'Bob': False, 'Mike': False}, {'Alice': True, 'Lucy': False, 'John': False})"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cm_c.parents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "heading_collapsed": true
   },
   "source": [
    "## Counter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hidden": true
   },
   "source": [
    "counter is ... just a counter, similar with the implmentation via dict, actually a subclass of dict which can directly count iterable input"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Counter({1: 3, 2: 2, 3: 1})"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ct = collections.Counter()\n",
    "for w in [1,2,3,1,2,1]:\n",
    "    ct[w] += 1\n",
    "ct"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Counter({1: 3, 2: 2, 3: 1})"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## better way to count\n",
    "collections.Counter([1,2,3,1,2,1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Counter({' ': 1,\n",
       "         '!': 1,\n",
       "         'd': 1,\n",
       "         'e': 1,\n",
       "         'h': 1,\n",
       "         'l': 3,\n",
       "         'o': 2,\n",
       "         'r': 1,\n",
       "         'w': 1})"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wd=collections.Counter(\"hello world!\")\n",
    "wd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wd[\"z\"] # for missing value, return 0 instead of rasing error"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['h', 'e', 'l', 'l', 'l', 'o', 'o', ' ', 'w', 'r', 'd', '!']"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(wd.elements()) ## same things together"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('l', 3), ('o', 2)]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wd.most_common(2) ## the 2 most common char"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Counter({' ': 1,\n",
       "         '!': 1,\n",
       "         'd': 1,\n",
       "         'e': 1,\n",
       "         'h': 1,\n",
       "         'l': 13,\n",
       "         'o': 2,\n",
       "         'r': 1,\n",
       "         'w': 1})"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wd2 = {\"l\":10}\n",
    "wd.update(wd2)\n",
    "wd # the value is added instead of updated by dict.update"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Counter({'l': 10})"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wd2 = collections.Counter(wd2)\n",
    "wd2 ## counter can also directly initialized with a dict"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Counter({' ': 1,\n",
       "          '!': 1,\n",
       "          'd': 1,\n",
       "          'e': 1,\n",
       "          'h': 1,\n",
       "          'l': 23,\n",
       "          'o': 2,\n",
       "          'r': 1,\n",
       "          'w': 1}),\n",
       " Counter({' ': 1,\n",
       "          '!': 1,\n",
       "          'd': 1,\n",
       "          'e': 1,\n",
       "          'h': 1,\n",
       "          'l': 3,\n",
       "          'o': 2,\n",
       "          'r': 1,\n",
       "          'w': 1}),\n",
       " Counter({'l': 10}),\n",
       " Counter({' ': 1,\n",
       "          '!': 1,\n",
       "          'd': 1,\n",
       "          'e': 1,\n",
       "          'h': 1,\n",
       "          'l': 13,\n",
       "          'o': 2,\n",
       "          'r': 1,\n",
       "          'w': 1}))"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wd+wd2, wd-wd2, wd&wd2, wd|wd2 ## overload for +, - (keep only positive ones), min, max"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "heading_collapsed": true
   },
   "source": [
    "## deque"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hidden": true
   },
   "source": [
    "deque is a double-ended queue, it is thread safe, and O(1) for pop and append on both side of the queue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "hidden": true
   },
   "outputs": [],
   "source": [
    "dq = collections.deque([1,2,3,4],maxlen=5) # the maxlen gives the restriction on the queue, new append must \n",
    "# come with pop from the other end when maxlen is reached "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5]"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dq.append(5)\n",
    "list(dq)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dq.appendleft(0)\n",
    "list(dq)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dq.count(1) # count how many 1 in the queue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[3, 4, 6, 7, 8]"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dq.extend([6,7,8])\n",
    "list(dq)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dq.popleft()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[4, 6, 7, 8]"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(dq)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "hidden": true
   },
   "outputs": [],
   "source": [
    "dq.rotate(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[7, 8, 4, 6]"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(dq) ## rotate the linked list 2 steps to the right"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "heading_collapsed": true
   },
   "source": [
    "## defaultdict"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hidden": true
   },
   "source": [
    "of course, defaultdict is a subclass of dict. with a new input for constructor called default_factory. \n",
    "It is designed for a default value of the dict when key is missing. Namely everytime the accessed key is missing,\n",
    "the default_factory() is called."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "defaultdict(list, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]\n",
    "d = collections.defaultdict(list)\n",
    "[d[k].append(v) for k,v in s]\n",
    "d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## to understand the above code, just recall that\n",
    "list() ## the is the function called each time a missing key is accessed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'blue': [2, 4], 'red': [1], 'yellow': [1, 3]}"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## the usage above is equivalent to\n",
    "d = {}\n",
    "for k, v in s:\n",
    "    d.setdefault(k, []).append(v)\n",
    "d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0, set(), ())"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "int(),set(),tuple() ## lots of interesting default factory can be used and explored!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## namedtuple"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "assign name to tuple, which is better understood compared to access by index pos"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
    "Point = collections.namedtuple('Point','x y') # for the field name ['x', 'y'], 'x y', 'x,y' are all legal formats\n",
    "p=Point(11,22)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(11, 22)"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p.x, p.y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(11, 22)"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p[0], p[1] ## the old fashioned way in tuple also works"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "33"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p = p._replace(x=33)  ## tuple finally can be changed\n",
    "p.x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('x', 'y')"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p._fields ## view field names"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## OrderedDict"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A dict in python is implemented by hash to obtain O(1) speed with the cost that the keys cannot be sorted.\n",
    "An orderdict, well, is a dict remembering order. (order of keys that first inserted)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [],
   "source": [
    "od = collections.OrderedDict([(1,2),(3,4)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 4)"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "od[1], od[3]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 4)"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "od.popitem(last=True) # pop in LIFO fashion if last is True (by default), otherwise pop in FIFO order"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "OrderedDict([(1, 2)])"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "od"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# Demo on collections module\n",
	"Reference: [python doc](https://docs.python.org/3.6/library/collections.html)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 1,
	"metadata": {},
	"outputs": [],
	"source": [
	"import collections"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"heading_collapsed": true
	},
	"source": [
	"## ChainMap"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"hidden": true
	},
	"source": [
	"ChainMap is a faster way to merge some dicts, compared to `dict_old.update(dict_new)`"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 3,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[{'Bob': False, 'John': True}, {'Alice': True}]"
	]
	},
	"execution_count": 3,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dict1 = {\"John\": True, \"Bob\": False}\n",
	"dict2 = {\"Alice\": True}\n",
	"cm = collections.ChainMap(dict1, dict2) # this is first than merging two dicts by updates\n",
	"cm.maps"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 5,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[{'Bob': False, 'John': True}, {'Alice': True, 'Lucy': False}]"
	]
	},
	"execution_count": 5,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dict2[\"Lucy\"]=False\n",
	"cm.maps ## the edit on the input can be reflected by chainmap automatically"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 8,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[{'Bob': False, 'John': True, 'Mike': False}, {'Alice': True, 'Lucy': False}]"
	]
	},
	"execution_count": 8,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"cm[\"Mike\"]=False\n",
	"cm.maps ## all the add on cm itself is reflected on the first map"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 10,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"True"
	]
	},
	"execution_count": 10,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"cm['Alice'] ## the search is go through the maps list, slower than one dict! "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 11,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"True"
	]
	},
	"execution_count": 11,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dict2['John']=False\n",
	"cm['John'] # for the same key, the previous one will override the latter one when query"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {
	"hidden": true
	},
	"outputs": [],
	"source": [
	"## a deep search version of ChainMap, the ordinary chainmap only update the first dict in the map list\n",
	"## below is a hack subclass where update can also happen in other dicts\n",
	"\n",
	"class DeepChainMap(ChainMap):\n",
	" 'Variant of ChainMap that allows direct updates to inner scopes'\n",
	"\n",
	" def __setitem__(self, key, value):\n",
	" for mapping in self.maps:\n",
	" if key in mapping:\n",
	" mapping[key] = value\n",
	" return\n",
	" self.maps[0][key] = value\n",
	"\n",
	" def __delitem__(self, key):\n",
	" for mapping in self.maps:\n",
	" if key in mapping:\n",
	" del mapping[key]\n",
	" return\n",
	" raise KeyError(key)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 13,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[{},\n",
	" {'Bob': False, 'John': True, 'Mike': False},\n",
	" {'Alice': True, 'John': False, 'Lucy': False}]"
	]
	},
	"execution_count": 13,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"cm_c = cm.new_child() ## support for nested context\n",
	"cm_c.maps "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 14,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"ChainMap({'John': True, 'Bob': False, 'Mike': False}, {'Alice': True, 'Lucy': False, 'John': False})"
	]
	},
	"execution_count": 14,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"cm_c.parents"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"heading_collapsed": true
	},
	"source": [
	"## Counter"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"hidden": true
	},
	"source": [
	"counter is ... just a counter, similar with the implmentation via dict, actually a subclass of dict which can directly count iterable input"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 16,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"Counter({1: 3, 2: 2, 3: 1})"
	]
	},
	"execution_count": 16,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"ct = collections.Counter()\n",
	"for w in [1,2,3,1,2,1]:\n",
	" ct[w] += 1\n",
	"ct"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 17,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"Counter({1: 3, 2: 2, 3: 1})"
	]
	},
	"execution_count": 17,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"## better way to count\n",
	"collections.Counter([1,2,3,1,2,1])"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 19,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"Counter({' ': 1,\n",
	" '!': 1,\n",
	" 'd': 1,\n",
	" 'e': 1,\n",
	" 'h': 1,\n",
	" 'l': 3,\n",
	" 'o': 2,\n",
	" 'r': 1,\n",
	" 'w': 1})"
	]
	},
	"execution_count": 19,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"wd=collections.Counter(\"hello world!\")\n",
	"wd"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 20,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"0"
	]
	},
	"execution_count": 20,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"wd[\"z\"] # for missing value, return 0 instead of rasing error"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 23,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"['h', 'e', 'l', 'l', 'l', 'o', 'o', ' ', 'w', 'r', 'd', '!']"
	]
	},
	"execution_count": 23,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(wd.elements()) ## same things together"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 24,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[('l', 3), ('o', 2)]"
	]
	},
	"execution_count": 24,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"wd.most_common(2) ## the 2 most common char"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 25,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"Counter({' ': 1,\n",
	" '!': 1,\n",
	" 'd': 1,\n",
	" 'e': 1,\n",
	" 'h': 1,\n",
	" 'l': 13,\n",
	" 'o': 2,\n",
	" 'r': 1,\n",
	" 'w': 1})"
	]
	},
	"execution_count": 25,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"wd2 = {\"l\":10}\n",
	"wd.update(wd2)\n",
	"wd # the value is added instead of updated by dict.update"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 29,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"Counter({'l': 10})"
	]
	},
	"execution_count": 29,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"wd2 = collections.Counter(wd2)\n",
	"wd2 ## counter can also directly initialized with a dict"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 30,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"(Counter({' ': 1,\n",
	" '!': 1,\n",
	" 'd': 1,\n",
	" 'e': 1,\n",
	" 'h': 1,\n",
	" 'l': 23,\n",
	" 'o': 2,\n",
	" 'r': 1,\n",
	" 'w': 1}),\n",
	" Counter({' ': 1,\n",
	" '!': 1,\n",
	" 'd': 1,\n",
	" 'e': 1,\n",
	" 'h': 1,\n",
	" 'l': 3,\n",
	" 'o': 2,\n",
	" 'r': 1,\n",
	" 'w': 1}),\n",
	" Counter({'l': 10}),\n",
	" Counter({' ': 1,\n",
	" '!': 1,\n",
	" 'd': 1,\n",
	" 'e': 1,\n",
	" 'h': 1,\n",
	" 'l': 13,\n",
	" 'o': 2,\n",
	" 'r': 1,\n",
	" 'w': 1}))"
	]
	},
	"execution_count": 30,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"wd+wd2, wd-wd2, wd&wd2, wd\|wd2 ## overload for +, - (keep only positive ones), min, max"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"heading_collapsed": true
	},
	"source": [
	"## deque"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"hidden": true
	},
	"source": [
	"deque is a double-ended queue, it is thread safe, and O(1) for pop and append on both side of the queue"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 34,
	"metadata": {
	"hidden": true
	},
	"outputs": [],
	"source": [
	"dq = collections.deque([1,2,3,4],maxlen=5) # the maxlen gives the restriction on the queue, new append must \n",
	"# come with pop from the other end when maxlen is reached "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 35,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[1, 2, 3, 4, 5]"
	]
	},
	"execution_count": 35,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dq.append(5)\n",
	"list(dq)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 36,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[0, 1, 2, 3, 4]"
	]
	},
	"execution_count": 36,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dq.appendleft(0)\n",
	"list(dq)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 37,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"1"
	]
	},
	"execution_count": 37,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dq.count(1) # count how many 1 in the queue"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 38,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[3, 4, 6, 7, 8]"
	]
	},
	"execution_count": 38,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dq.extend([6,7,8])\n",
	"list(dq)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 39,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"3"
	]
	},
	"execution_count": 39,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"dq.popleft()"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 40,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[4, 6, 7, 8]"
	]
	},
	"execution_count": 40,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(dq)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 41,
	"metadata": {
	"hidden": true
	},
	"outputs": [],
	"source": [
	"dq.rotate(2)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 42,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[7, 8, 4, 6]"
	]
	},
	"execution_count": 42,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(dq) ## rotate the linked list 2 steps to the right"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"heading_collapsed": true
	},
	"source": [
	"## defaultdict"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"hidden": true
	},
	"source": [
	"of course, defaultdict is a subclass of dict. with a new input for constructor called default_factory. \n",
	"It is designed for a default value of the dict when key is missing. Namely everytime the accessed key is missing,\n",
	"the default_factory() is called."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 45,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"defaultdict(list, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})"
	]
	},
	"execution_count": 45,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]\n",
	"d = collections.defaultdict(list)\n",
	"[d[k].append(v) for k,v in s]\n",
	"d"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 46,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[]"
	]
	},
	"execution_count": 46,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"## to understand the above code, just recall that\n",
	"list() ## the is the function called each time a missing key is accessed"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 47,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"{'blue': [2, 4], 'red': [1], 'yellow': [1, 3]}"
	]
	},
	"execution_count": 47,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"## the usage above is equivalent to\n",
	"d = {}\n",
	"for k, v in s:\n",
	" d.setdefault(k, []).append(v)\n",
	"d"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 49,
	"metadata": {
	"hidden": true
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"(0, set(), ())"
	]
	},
	"execution_count": 49,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"int(),set(),tuple() ## lots of interesting default factory can be used and explored!"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## namedtuple"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"assign name to tuple, which is better understood compared to access by index pos"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 52,
	"metadata": {},
	"outputs": [],
	"source": [
	"Point = collections.namedtuple('Point','x y') # for the field name ['x', 'y'], 'x y', 'x,y' are all legal formats\n",
	"p=Point(11,22)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 53,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"(11, 22)"
	]
	},
	"execution_count": 53,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"p.x, p.y"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 54,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"(11, 22)"
	]
	},
	"execution_count": 54,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"p[0], p[1] ## the old fashioned way in tuple also works"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 57,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"33"
	]
	},
	"execution_count": 57,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"p = p._replace(x=33) ## tuple finally can be changed\n",
	"p.x"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 59,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"('x', 'y')"
	]
	},
	"execution_count": 59,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"p._fields ## view field names"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## OrderedDict"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"A dict in python is implemented by hash to obtain O(1) speed with the cost that the keys cannot be sorted.\n",
	"An orderdict, well, is a dict remembering order. (order of keys that first inserted)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 61,
	"metadata": {},
	"outputs": [],
	"source": [
	"od = collections.OrderedDict([(1,2),(3,4)])"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 64,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"(2, 4)"
	]
	},
	"execution_count": 64,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"od[1], od[3]"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 67,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"(3, 4)"
	]
	},
	"execution_count": 67,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"od.popitem(last=True) # pop in LIFO fashion if last is True (by default), otherwise pop in FIFO order"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 68,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"OrderedDict([(1, 2)])"
	]
	},
	"execution_count": 68,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"od"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": []
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "Python 3",
	"language": "python",
	"name": "python3"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython3",
	"version": "3.6.3"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 2
	}