@anandology anandology/day1.ipynb
Last active Aug 29, 2015

Advanced Python Training at LinkedIn -- Feb 26 - Mar 1, 2014
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Advanced Python Training - Day 3\n",
"LinkedIn<br/>\n",
"Feb 26 - Mar 1, 2014<br/>\n",
"<a href=\"http://anandology.com\">Anand Chitipothu</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Iterators and Generators"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for i in [1, 2, 3, 4]:\n",
"    print i"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"2\n",
"3\n",
"4\n"
]
}
],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for c in \"python\":\n",
"    print c"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"p\n",
"y\n",
"t\n",
"h\n",
"o\n",
"n\n"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for x in (1, 2, 3):\n",
"    print x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"2\n",
"3\n"
]
}
],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for k in {\"x\": 1, \"y\": 2}:\n",
"    print k"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"y\n",
"x\n"
]
}
],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file numbers.txt\n",
"one\n",
"two\n",
"three"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Writing numbers.txt\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for x in open(\"numbers.txt\"):\n",
"    print x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"one\n",
"\n",
"two\n",
"\n",
"three\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print \"a\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"a\n"
]
}
],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print \"1\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print repr(\"1\"), repr(1)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"'1' 1\n"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print str(\"1\"), str(1)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 1\n"
]
}
],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many built-in functions work with any iterable, not just lists."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"max([\"a\", \"b\", \"c\"])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 12,
"text": [
"'c'"
]
}
],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"max(open(\"numbers.txt\"))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"'two\\n'"
]
}
],
"prompt_number": 13
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"max(open(\"numbers.txt\"), key=len)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 14,
"text": [
"'three'"
]
}
],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"-\".join([\"a\", \"b\"])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 15,
"text": [
"'a-b'"
]
}
],
"prompt_number": 15
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"-\".join({\"x\": 1, \"y\": 2})"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
"'y-x'"
]
}
],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The iteration protocol"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = iter([1, 2, 3])"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 17
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
"<listiterator at 0x10279a8d0>"
]
}
],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 19,
"text": [
"1"
]
}
],
"prompt_number": 19
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 20,
"text": [
"2"
]
}
],
"prompt_number": 20
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 21,
"text": [
"3"
]
}
],
"prompt_number": 21
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "StopIteration",
"evalue": "",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-22-e05f366da090>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mStopIteration\u001b[0m: "
]
}
],
"prompt_number": 22
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"xrange(10)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 23,
"text": [
"xrange(10)"
]
}
],
"prompt_number": 23
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for i in xrange(3):\n",
"    print i"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"0\n",
"1\n",
"2\n"
]
}
],
"prompt_number": 24
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"range(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 25,
"text": [
"[0, 1, 2]"
]
}
],
"prompt_number": 25
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = range(3)\n",
"x.append(4)\n",
"print x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[0, 1, 2, 4]\n"
]
}
],
"prompt_number": 26
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x2 = xrange(3)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print x2"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"xrange(3)\n"
]
}
],
"prompt_number": 28
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How is a `for` loop implemented?\n",
"\n",
"    for a in x:\n",
"        print a\n",
"\n",
"What is the equivalent `while` loop?\n",
"\n",
"    _it = iter(x)\n",
"    while True:\n",
"        try:\n",
"            a = _it.next()\n",
"        except StopIteration:\n",
"            break\n",
"        print a\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example: An Iterable class**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class Foo:\n",
"    pass\n",
"\n",
"for a in Foo():\n",
"    print a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "TypeError",
"evalue": "iteration over non-sequence",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-30-9311efd7cd5c>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mpass\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0ma\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mFoo\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m \u001b[0;32mprint\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mTypeError\u001b[0m: iteration over non-sequence"
]
}
],
"prompt_number": 30
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class Foo:\n",
"    def __iter__(self):\n",
"        \"\"\"The presence of an __iter__ method makes\n",
"        a class iterable.\n",
"        This function should return an iterator.\n",
"        \"\"\"\n",
"        # Return an iterator over a list\n",
"        return iter([1, 2, 3])\n",
"\n",
"for a in Foo():\n",
"    print a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"2\n",
"3\n"
]
}
],
"prompt_number": 32
},
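{
"cell_type": "markdown",
"metadata": {},
"source": [
"* An iterable is an object that has an `__iter__` method, which returns an iterator.\n",
"* An iterator is an object that has a `next` method, which returns the next value or raises `StopIteration` when no values are left."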
]
},
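{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick way to see the difference is to compare a list with the iterators obtained from it. The cell below is an illustrative sketch, not part of the original session. It uses the built-in `next(it)`, which is equivalent to `it.next()` on Python 2.6 and later, so that it runs unchanged on Python 2 and 3."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"numbers = [1, 2, 3]\n",
"it = iter(numbers)\n",
"\n",
"# Calling iter() on an iterator returns the iterator itself ...\n",
"print(iter(it) is it)\n",
"# ... while calling iter() on a list returns a new iterator object.\n",
"print(iter(numbers) is numbers)\n",
"\n",
"# Two iterators over the same list keep independent positions.\n",
"it2 = iter(numbers)\n",
"print(next(it))\n",
"print(next(it))\n",
"print(next(it2))"
],
"language": "python",
"metadata": {},
"outputs": []
},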
{
"cell_type": "code",
"collapsed": false,
"input": [
"class yrange_iterator:\n",
"    def __init__(self, n):\n",
"        self.n = n\n",
"        self.i = 0\n",
"\n",
"    def __iter__(self):\n",
"        return self\n",
"\n",
"    def next(self):\n",
"        i = self.i\n",
"        if i < self.n:\n",
"            self.i += 1\n",
"            return i\n",
"        else:\n",
"            raise StopIteration()\n",
"\n",
"yit = yrange_iterator(3)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 38
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"yit.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 39,
"text": [
"0"
]
}
],
"prompt_number": 39
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"yit.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 40,
"text": [
"1"
]
}
],
"prompt_number": 40
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"yit.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 41,
"text": [
"2"
]
}
],
"prompt_number": 41
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"yit.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "StopIteration",
"evalue": "",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-42-f72863aefda0>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0myit\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m<ipython-input-38-f1e4a0643282>\u001b[0m in \u001b[0;36mnext\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 15\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mStopIteration\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 16\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0myit\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0myrange_iterator\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mStopIteration\u001b[0m: "
]
}
],
"prompt_number": 42
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"yit = yrange_iterator(3)\n",
"for i in yit:\n",
"    print i"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"0\n",
"1\n",
"2\n"
]
}
],
"prompt_number": 43
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Implement an iterator `countdown(n)` that counts from `n` down to `1`.\n",
"\n",
"    >>> for i in countdown(5):\n",
"    ...     print i\n",
"    ...\n",
"    5\n",
"    4\n",
"    3\n",
"    2\n",
"    1"
]
},
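{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible solution, modelled on `yrange_iterator` above, is sketched below. It is not part of the original session. The `__next__` alias is an addition so that the same class also works on Python 3, where the method is named `__next__`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class countdown:\n",
"    \"\"\"Iterator that counts from n down to 1 (one possible solution).\"\"\"\n",
"    def __init__(self, n):\n",
"        self.i = n\n",
"\n",
"    def __iter__(self):\n",
"        # An iterator returns itself from __iter__.\n",
"        return self\n",
"\n",
"    def next(self):\n",
"        if self.i <= 0:\n",
"            raise StopIteration()\n",
"        value = self.i\n",
"        self.i -= 1\n",
"        return value\n",
"\n",
"    # Python 3 looks for __next__ instead of next.\n",
"    __next__ = next\n",
"\n",
"for i in countdown(5):\n",
"    print(i)"
],
"language": "python",
"metadata": {},
"outputs": []
},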
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file yrange.py\n",
"\n",
"class yrange_iterator:\n",
"    def __init__(self, n):\n",
"        self.n = n\n",
"        self.i = 0\n",
"\n",
"    def __iter__(self):\n",
"        print \"yrange_iterator.__iter__\"\n",
"        return self\n",
"\n",
"    def next(self):\n",
"        print \"yrange_iterator.next\"\n",
"        i = self.i\n",
"        if i < self.n:\n",
"            self.i += 1\n",
"            return i\n",
"        else:\n",
"            raise StopIteration()\n",
"\n",
"class yrange:\n",
"    def __init__(self, n):\n",
"        self.n = n\n",
"\n",
"    def __iter__(self):\n",
"        print \"yrange.__iter__\"\n",
"        return yrange_iterator(self.n)\n",
"\n",
"y = yrange(3)\n",
"for i in y:\n",
"    print i\n",
"\n",
"for i in y:\n",
"    print i"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting yrange.py\n"
]
}
],
"prompt_number": 46
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!python yrange.py"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"yrange.__iter__\r\n",
"yrange_iterator.next\r\n",
"0\r\n",
"yrange_iterator.next\r\n",
"1\r\n",
"yrange_iterator.next\r\n",
"2\r\n",
"yrange_iterator.next\r\n",
"yrange.__iter__\r\n",
"yrange_iterator.next\r\n",
"0\r\n",
"yrange_iterator.next\r\n",
"1\r\n",
"yrange_iterator.next\r\n",
"2\r\n",
"yrange_iterator.next\r\n"
]
}
],
"prompt_number": 47
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generators"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def squares(x):\n",
"    for a in x:\n",
"        yield a*a\n",
"\n",
"for a in squares([1, 2, 3]):\n",
"    print a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"4\n",
"9\n"
]
}
],
"prompt_number": 49
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"y = squares([1, 2, 3])\n",
"print y.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n"
]
}
],
"prompt_number": 51
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print y.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"4\n"
]
}
],
"prompt_number": 52
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print y.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"9\n"
]
}
],
"prompt_number": 53
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print y.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "StopIteration",
"evalue": "",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-54-057495049b45>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mprint\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mStopIteration\u001b[0m: "
]
}
],
"prompt_number": 54
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def f():\n",
"    print \"f is called\"\n",
"    yield 1\n",
"    print \"after yield 1\"\n",
"    yield 2\n",
"    print \"after yield 2\""
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 55
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = f()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 56
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 57,
"text": [
"<generator object f at 0x103565d70>"
]
}
],
"prompt_number": 57
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"f is called\n"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 58,
"text": [
"1"
]
}
],
"prompt_number": 58
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"after yield 1\n"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 59,
"text": [
"2"
]
}
],
"prompt_number": 59
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.next()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "StopIteration",
"evalue": "",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-60-e05f366da090>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mStopIteration\u001b[0m: "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"after yield 2\n"
]
}
],
"prompt_number": 60
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Implement `countdown` as a generator."
]
},
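{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of one possible generator version follows. It is not part of the original session. The generator keeps its position in the local variable `n` between calls, so no class is needed."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def countdown(n):\n",
"    # Execution pauses at each yield and resumes here on the next call.\n",
"    while n > 0:\n",
"        yield n\n",
"        n -= 1\n",
"\n",
"print(list(countdown(5)))"
],
"language": "python",
"metadata": {},
"outputs": []
},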
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Implement a function `to_int` that takes a sequence of values, converts each one to an integer, and returns the results as an iterator.\n",
"\n",
"    >>> sum(to_int([\"1\", \"2\", \"3\", \"4\"]))\n",
"    10\n",
"    >>> list(to_int([\"1\", \"2\", \"3\", \"4\"]))\n",
"    [1, 2, 3, 4]"
]
},
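{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to write `to_int` is as a generator function, sketched below as a possible solution that is not part of the original session. Because `int` ignores surrounding whitespace, the same function also accepts lines read from a file, which end with a newline."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def to_int(values):\n",
"    for v in values:\n",
"        # int(\"5\\n\") is 5: leading and trailing whitespace is ignored.\n",
"        yield int(v)\n",
"\n",
"print(sum(to_int([\"1\", \"2\", \"3\", \"4\"])))\n",
"print(list(to_int([\"1\\n\", \"2\\n\"])))"
],
"language": "python",
"metadata": {},
"outputs": []
},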
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file numbers.txt\n",
"1\n",
"2\n",
"3\n",
"4\n",
"5"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting numbers.txt\n"
]
}
],
"prompt_number": 61
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"    >>> sum(to_int(open(\"numbers.txt\")))\n",
"    15"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `iterjoin` to join two iterators.\n",
"\n",
"    >>> it1 = iter([1, 2, 3])\n",
"    >>> it2 = iter([4, 5])\n",
"    >>> list(iterjoin(it1, it2))\n",
"    [1, 2, 3, 4, 5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `iterappend` to append a value at the end of an iterator.\n",
"\n",
"    >>> it1 = iter([1, 2, 3])\n",
"    >>> list(iterappend(it1, 4))\n",
"    [1, 2, 3, 4]"
]
},
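{
"cell_type": "markdown",
"metadata": {},
"source": [
"Both `iterjoin` and `iterappend` can be written as short generator functions. The cell below is a sketch of one possible solution and is not part of the original session. Neither function builds a list, so both work with arbitrarily long inputs."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def iterjoin(it1, it2):\n",
"    # Exhaust the first iterator, then continue with the second.\n",
"    for x in it1:\n",
"        yield x\n",
"    for x in it2:\n",
"        yield x\n",
"\n",
"def iterappend(it, value):\n",
"    for x in it:\n",
"        yield x\n",
"    # Runs only after the input iterator is exhausted.\n",
"    yield value\n",
"\n",
"print(list(iterjoin(iter([1, 2, 3]), iter([4, 5]))))\n",
"print(list(iterappend(iter([1, 2, 3]), 4)))"
],
"language": "python",
"metadata": {},
"outputs": []
},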
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `get_paragraphs` that takes an iterator over lines and returns an iterator over paragraphs.\n",
"\n",
"You can use http://anandology.com/tmp/sachin.txt as a sample input.\n",
"\n",
"    >>> len(list(get_paragraphs(open(\"sachin.txt\"))))\n",
"    13\n",
"    >>> p1 = get_paragraphs(open(\"sachin.txt\")).next()\n",
"    >>> len(p1)\n",
"    238\n",
"    >>> p1[:10]\n",
"    'Sachin Ten'\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file paras.txt\n",
"one\n",
"two\n",
"\n",
"three\n",
"four\n",
"five\n",
"\n",
"six"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Writing paras.txt\n"
]
}
],
"prompt_number": 62
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"    >>> for p in get_paragraphs(open(\"paras.txt\")):\n",
"    ...     print repr(p)\n",
"    ...\n",
"    'one\\ntwo\\n'\n",
"    'three\\nfour\\nfive\\n'\n",
"    'six'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The input to the function need not be a file. For example, the following\n",
"should also work fine.\n",
"\n",
"    def upper(lines):\n",
"        for line in lines:\n",
"            yield line.upper()\n",
"\n",
"    paras = get_paragraphs(upper(open(\"paras.txt\")))\n"
]
},
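{
"cell_type": "code",
"collapsed": false,
"input": [
"def get_paragraphs(lines):\n",
"    para = ''\n",
"    for line in lines:\n",
"        if not line.strip():\n",
"            # A blank line ends the current paragraph. Skip empty ones\n",
"            # so that consecutive blank lines do not yield ''.\n",
"            if para:\n",
"                yield para\n",
"            para = ''\n",
"        else:\n",
"            para = para + line\n",
"    # Yield the last paragraph unless the input ended with a blank line.\n",
"    if para:\n",
"        yield para\n",
"\n",
"for p in get_paragraphs(open(\"paras.txt\")):\n",
"    print repr(p)"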
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"'one\\ntwo\\n'\n",
"'three\\nfour\\nfive\\n'\n",
"'six'\n"
]
}
],
"prompt_number": 63
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generator Expressions"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[x*x for x in range(10)]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 65,
"text": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
}
],
"prompt_number": 65
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"squares = (x*x for x in range(10))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 66
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"squares"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 67,
"text": [
"<generator object <genexpr> at 0x1037021e0>"
]
}
],
"prompt_number": 67
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"sum((x*x for x in range(1000000)))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 69,
"text": [
"333332833333500000"
]
}
],
"prompt_number": 69
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"sum(x*x for x in range(1000000))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 70,
"text": [
"333332833333500000"
]
}
],
"prompt_number": 70
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def squares(numbers):\n",
"    return (n*n for n in numbers)\n",
"\n",
"print list(squares(range(10)))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n"
]
}
],
"prompt_number": 71
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Implement a function `evens` that takes a sequence of numbers as argument and returns an iterator over the even numbers from it.\n",
"\n",
"    >>> list(evens(range(10)))\n",
"    [0, 2, 4, 6, 8]"
]
},
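{
"cell_type": "markdown",
"metadata": {},
"source": [
"A generator expression with an `if` clause is enough for `evens`. The cell below is a sketch of one possible solution and is not part of the original session."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def evens(numbers):\n",
"    # The if clause filters; values are produced lazily, one at a time.\n",
"    return (n for n in numbers if n % 2 == 0)\n",
"\n",
"print(list(evens(range(10))))"
],
"language": "python",
"metadata": {},
"outputs": []
},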
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `count` to count the number of elements in an iterator/iterable.\n",
"\n",
"    >>> count([1, 2, 3])\n",
"    3\n",
"    >>> count(x*x for x in range(10))\n",
"    10"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `linecount`, that takes a filename as argument and returns the number of lines in that file."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `read_words` that takes a sequence of lines as argument and returns an iterator over words."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `wordcount`, that takes a filename as argument and returns the number of words in that file."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def count(it):\n",
"    result = 0\n",
"    for x in it:\n",
"        result += 1\n",
"    return result\n",
"\n",
"print count([1, 2, 3])\n",
"print count(x*x for x in range(100))\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"3\n",
"100\n"
]
}
],
"prompt_number": 72
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def count1(it):\n",
"    return sum(1 for x in it)\n",
"\n",
"print count1([1, 2, 3])\n",
"print count1(x*x for x in range(100))\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"3\n",
"100\n"
]
}
],
"prompt_number": 73
},
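{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `linecount`, `read_words` and `wordcount` problems can all be built on the counting idea above. The cell below is a sketch of possible solutions and is not part of the original session. It repeats a `count` helper so that it runs on its own, and the `with` blocks are an added choice to make sure files are closed."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def count(it):\n",
"    # Same idea as count1 above, repeated so this cell is self-contained.\n",
"    return sum(1 for x in it)\n",
"\n",
"def linecount(filename):\n",
"    with open(filename) as f:\n",
"        return count(f)\n",
"\n",
"def read_words(lines):\n",
"    # split() with no arguments splits on any run of whitespace.\n",
"    return (word for line in lines for word in line.split())\n",
"\n",
"def wordcount(filename):\n",
"    with open(filename) as f:\n",
"        return count(read_words(f))\n",
"\n",
"print(list(read_words([\"one two\\n\", \"three\\n\"])))\n",
"print(count(read_words([\"one two\\n\", \"three\\n\"])))"
],
"language": "python",
"metadata": {},
"outputs": []
},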
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example: Reading multiple files**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's say we want to implement a function `cat` that takes a list of filenames and prints all their contents."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def cat(filenames):\n",
"    for f in filenames:\n",
"        for line in open(f):\n",
"            print line,"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 74
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What if we want to do grep?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def grep(pattern, filenames):\n",
"    for f in filenames:\n",
"        for line in open(f):\n",
"            if pattern in line:\n",
"                print line,"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 75
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file files.py\n",
"\n",
"import gzip\n",
"\n",
"def xopen(filename):\n",
"    # Open gzip-compressed files transparently, based on the extension.\n",
"    if filename.endswith(\".gz\"):\n",
"        return gzip.open(filename)\n",
"    else:\n",
"        return open(filename)\n",
"\n",
"def read_files(filenames):\n",
"    # Yield the lines of all the files, one file after another.\n",
"    for f in filenames:\n",
"        for line in xopen(f):\n",
"            yield line\n",
"\n",
"def print_lines(lines):\n",
"    for line in lines:\n",
"        print line,\n",
"\n",
"def cat(filenames):\n",
"    lines = read_files(filenames)\n",
"    print_lines(lines)\n",
"\n",
"def grep(pattern, filenames):\n",
"    lines = read_files(filenames)\n",
"    # Test each line, not the lines iterator itself.\n",
"    print_lines(line for line in lines if pattern in line)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting files.py\n"
]
}
],
"prompt_number": 78
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `findfiles` that takes a directory as argument and returns an iterator over all files in that directory tree.\n",
"\n",
"Hint: use `os.walk`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `count_pyfiles` that takes a directory as argument and computes the number of python files in that directory tree."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem** Write a function `count_pylines` that takes a directory as argument and computes the number of lines in all python files in that directory tree."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import os\n",
"for dirname, dirs, files in os.walk(\"flask-training/clickr\"):\n",
"    print dirname, files"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"flask-training/clickr ['run.py']\n",
"flask-training/clickr/clickr ['__init__.py', '__init__.pyc', 'models.py', 'settings.py', 'webapp.py', 'webapp.pyc']\n",
"flask-training/clickr/clickr/templates ['base.html', 'index.html', 'photo.html', 'user.html']\n"
]
}
],
"prompt_number": 83
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def find_files(dir):\n",
"    for dirname, dirs, files in os.walk(dir):\n",
"        for f in files:\n",
"            yield os.path.join(dirname, f)\n",
"\n",
"def find_files_by_extn(dir, ext):\n",
"    return (f for f in find_files(dir) if f.endswith(ext))\n",
"\n",
"def count_pyfiles(dir):\n",
"    pyfiles = find_files_by_extn(dir, \".py\")\n",
"    return count(pyfiles)\n",
"\n",
"from files import read_files\n",
"\n",
"def count_pylines(dir):\n",
"    pyfiles = find_files_by_extn(dir, \".py\")\n",
"    pylines = read_files(pyfiles)\n",
"    return count(pylines)\n",
"\n",
"print count_pyfiles(\".\")\n",
"print count_pylines(\".\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"36\n",
"857\n"
]
}
],
"prompt_number": 89
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Further Reading**\n",
"\n",
"* [Generator Tricks for Systems Programmers](http://www.dabeaz.com/generators/) by [David Beazley](http://www.dabeaz.com/) is an excellent in-depth introduction to generators."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**itertools module**"
]
},
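{
"cell_type": "markdown",
"metadata": {},
"source": [
"The standard library's `itertools` module provides ready-made versions of several tools built by hand above. The cell below is a small illustrative sketch and is not part of the original session. It uses only `itertools.chain`, `itertools.count` and `itertools.islice`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import itertools\n",
"\n",
"# chain joins iterables end to end, like the iterjoin exercise.\n",
"print(list(itertools.chain([1, 2, 3], [4, 5])))\n",
"\n",
"# count yields an endless series; islice takes a finite piece of it\n",
"# without building an intermediate list.\n",
"print(list(itertools.islice(itertools.count(1), 5)))\n",
"\n",
"# islice also works on generators, which do not support slicing.\n",
"squares = (x * x for x in itertools.count())\n",
"print(list(itertools.islice(squares, 3, 6)))"
],
"language": "python",
"metadata": {},
"outputs": []
},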
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}