Paddy3118/infinity_and_beyond.ipynb

## infinity_and_beyond.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": "true"
   },
   "source": [
    "# Table of Contents\n",
    " <p><div class=\"lev1 toc-item\"><a href=\"#Infinite,-lazy,--data-structures:-examples-from-Haskell-to-Python\" data-toc-modified-id=\"Infinite,-lazy,--data-structures:-examples-from-Haskell-to-Python-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Infinite, lazy,  data structures: examples from Haskell to Python</a></div><div class=\"lev2 toc-item\"><a href=\"#A-simpler,-finite,-list\" data-toc-modified-id=\"A-simpler,-finite,-list-1.1\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>A simpler, finite, list</a></div><div class=\"lev2 toc-item\"><a href=\"#Python-infinite-lists\" data-toc-modified-id=\"Python-infinite-lists-1.2\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>Python infinite lists</a></div><div class=\"lev3 toc-item\"><a href=\"#You-can't-sum-an-infinite-datastructure\" data-toc-modified-id=\"You-can't-sum-an-infinite-datastructure-1.2.1\"><span class=\"toc-item-num\">1.2.1&nbsp;&nbsp;</span>You can't sum an infinite datastructure</a></div><div class=\"lev3 toc-item\"><a href=\"#You-can-filter-an-infinite-list-however\" data-toc-modified-id=\"You-can-filter-an-infinite-list-however-1.2.2\"><span class=\"toc-item-num\">1.2.2&nbsp;&nbsp;</span>You can filter an infinite list however</a></div><div class=\"lev1 toc-item\"><a href=\"#Generate-all-primes:-Naive\" data-toc-modified-id=\"Generate-all-primes:-Naive-2\"><span class=\"toc-item-num\">2&nbsp;&nbsp;</span>Generate all primes: Naive</a></div><div class=\"lev1 toc-item\"><a href=\"#Generate-all-primes:-Sieve\" data-toc-modified-id=\"Generate-all-primes:-Sieve-3\"><span class=\"toc-item-num\">3&nbsp;&nbsp;</span>Generate all primes: Sieve</a></div><div class=\"lev2 toc-item\"><a href=\"#Comparison\" data-toc-modified-id=\"Comparison-3.1\"><span class=\"toc-item-num\">3.1&nbsp;&nbsp;</span>Comparison</a></div><div class=\"lev1 toc-item\"><a href=\"#Generate-all-primes:-Iterative-sieve\" data-toc-modified-id=\"Generate-all-primes:-Iterative-sieve-4\"><span class=\"toc-item-num\">4&nbsp;&nbsp;</span>Generate all primes: Iterative sieve</a></div><div class=\"lev2 toc-item\"><a href=\"#Comparison\" data-toc-modified-id=\"Comparison-4.1\"><span class=\"toc-item-num\">4.1&nbsp;&nbsp;</span>Comparison</a></div><div class=\"lev1 toc-item\"><a href=\"#Rosetta-code-task-examples\" data-toc-modified-id=\"Rosetta-code-task-examples-5\"><span class=\"toc-item-num\">5&nbsp;&nbsp;</span><a href=\"http://rosettacode.org/wiki/Extensible_prime_generator\" target=\"_blank\">Rosetta code</a> task examples</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-first-twenty-primes.\" data-toc-modified-id=\"Show-the-first-twenty-primes.-5.1\"><span class=\"toc-item-num\">5.1&nbsp;&nbsp;</span>Show the first twenty primes.</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-primes-between-100-and-150\" data-toc-modified-id=\"Show-the-primes-between-100-and-150-5.2\"><span class=\"toc-item-num\">5.2&nbsp;&nbsp;</span>Show the primes between 100 and 150</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-number-of-primes-between-7,700-and-8,000.\" data-toc-modified-id=\"Show-the-number-of-primes-between-7,700-and-8,000.-5.3\"><span class=\"toc-item-num\">5.3&nbsp;&nbsp;</span>Show the number of primes between 7,700 and 8,000.</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-10,000th-prime.\" data-toc-modified-id=\"Show-the-10,000th-prime.-5.4\"><span class=\"toc-item-num\">5.4&nbsp;&nbsp;</span>Show the 10,000th prime.</a></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " # Infinite, lazy,  data structures: examples from Haskell to Python\n",
    " \n",
    " From this computerphile video [here](https://www.youtube.com/watch?v=bnRNiE_OVWA) - but with the examples converted to Python.\n",
    " \n",
    " \n",
    " ## A simpler, finite, list\n",
    " Python rangesact like its indexing in that it starts from zero and its ranges go up to, but less than the end value.\n",
    " \n",
    " The range of 20 integers from one to twenty is given by:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "range(1, 21)"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "range(1, 21)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ranges are there own data structure that can be converted to a list of actual values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(range(1, 21))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can sum the range of numbers like so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "210"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum(range(1, 21))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Taking the squares of all the values, in a way that creats a list can be done using a **list comprehension**<br>\n",
    "(The map function *does* exist in Python however)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1,\n",
       " 4,\n",
       " 9,\n",
       " 16,\n",
       " 25,\n",
       " 36,\n",
       " 49,\n",
       " 64,\n",
       " 81,\n",
       " 100,\n",
       " 121,\n",
       " 144,\n",
       " 169,\n",
       " 196,\n",
       " 225,\n",
       " 256,\n",
       " 289,\n",
       " 324,\n",
       " 361,\n",
       " 400]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[x**2 for x in range(1, 21)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One can filter to keep all the even numbers using a conditional clause in the list comprehension.<br>\n",
    "(Python does have the filterfalse function in the itertools library which is more like the Haskell example)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[x for x in range(1, 21) if x % 2 == 0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sum of the squares of all the even numbers between 1 and 20 is given by the combination"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1540"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum(x**2 for x in range(1, 21) if x % 2 == 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Python infinite lists\n",
    "Are not really lists, but iterators that can iterate over an infinite succession of values.\n",
    "\n",
    "This is where we will use members of the itertools library so lets slurp them in"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "from itertools import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To represent all the integers from 1 to infinity we can use count"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "count(1)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "count(1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Count is a data structure, not a list. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can show the first ten items from the count using islice and list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(islice(count(1), 10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### You can't sum an infinite datastructure\n",
    "Any function that needs to read to the end of an infinite data structure to produce its output will just hang as it goes through successive items without end. You therefore cannot use sum() on the infinite count.\n",
    "\n",
    "### You can filter an infinite list however\n",
    "This will produce another **infinite** iterator of filtered values. (We use islice and list to show the first few) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(islice((x for x in count(1) if x % 2 == 0), 10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "the inner list comprehension will produce successive values; the islice call returns a finite number of results to the list, so the above command terminates."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generate all primes: Naive\n",
    "A prime has factors of only one and itself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def factors(n):\n",
    "    return [x for x in range(1, n +1) if n % x == 0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 7]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "factors(7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 3, 5, 15]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "factors(15)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "def isprime(n):\n",
    "    return factors(n) == [1, n]\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "isprime(7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "isprime(17)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can create a generator of all prime numbers by filtering the count of integers greater than 0, like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<generator object <genexpr> at 0x00000298CF0F7780>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(x for x in count(1) if isprime(x))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can show the first 10 primes so generated"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(islice((x for x in count(1) if isprime(x)), 10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets get a time to generate the first 1_000 primes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 loop, best of 3: 2.6 s per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit list(islice((x for x in count(1) if isprime(x)), 1_000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generate all primes: Sieve\n",
    "Using the sieve of Eratosthenes on an infinite stream\n",
    "\n",
    "**Steps**\n",
    "\n",
    "1. Generate the count 2, 3, ...\n",
    "2. Mark the first value, p, as prime.\n",
    "3. Remove all multipliers of p from the count.\n",
    "4. Return to step 2\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "def sieve(sieved):\n",
    "    prime = next(sieved)\n",
    "    yield prime\n",
    "    yield from sieve(x for x in sieved if x % prime != 0)\n",
    "    \n",
    "all_primes = sieve(count(2))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can show the first 10 primes so generated"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(islice(sieve(count(2)), 10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets get a time to generate the first 1_000 primes:\n",
    "\n",
    "(Python does not optimise recursion so we will need to increase the recursion limit)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.setrecursionlimit(10000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10 loops, best of 3: 150 ms per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit list(islice(sieve(count(2)), 1_000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparison\n",
    "This new method is around **20 times faster** than the previous naive one, but suffers from having a low recursion limit.\n",
    "\n",
    "Better still to use iteration rather than recursion in Python\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generate all primes: Iterative sieve\n",
    "Using the sieve of Eratosthenes on an infinite stream but iteratively\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "def isieve(sieved):\n",
    "    prime = next(sieved)\n",
    "    yield prime\n",
    "    primes = [prime]\n",
    "    for x in sieved:\n",
    "        if any(x % prime == 0 for prime in primes):\n",
    "            continue\n",
    "        yield x\n",
    "        primes.append(x)\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can show the first 10 primes so generated"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(islice(isieve(count(2)), 10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets get a time to generate the first 1_000 primes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10 loops, best of 3: 67 ms per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit list(islice(isieve(count(2)), 1_000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparison\n",
    "This new iterative sieve is over **twice as fast** as the previous recursive sieve, and **40** times faster than the naive algorithm whilst avoiding any recursion limit.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# [Rosetta code](http://rosettacode.org/wiki/Extensible_prime_generator) task examples\n",
    "\n",
    "## Show the first twenty primes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(islice(isieve(count(2)), 20))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Show the primes between 100 and 150"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[101, 103, 107, 109, 113, 127, 131, 137, 139, 149]"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def leq_150(x): return x <= 150\n",
    "\n",
    "[x for x in takewhile(leq_150, isieve(count(2))) if x >= 100]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Show the number of primes between 7,700 and 8,000."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "30"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def leq_8000(x): return x <= 8000\n",
    "\n",
    "sum(1 for x in takewhile(leq_8000, isieve(count(2))) if x >= 7700)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Show the 10,000th prime."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "104729"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "next(islice(isieve(count(2)), 10_000-1, 10_000))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  },
  "nav_menu": {},
  "toc": {
   "navigate_menu": true,
   "number_sections": true,
   "sideBar": true,
   "threshold": 6,
   "toc_cell": true,
   "toc_section_display": "block",
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {
	"toc": "true"
	},
	"source": [
	"# Table of Contents\n",
	" <p><div class=\"lev1 toc-item\"><a href=\"#Infinite,-lazy,--data-structures:-examples-from-Haskell-to-Python\" data-toc-modified-id=\"Infinite,-lazy,--data-structures:-examples-from-Haskell-to-Python-1\"><span class=\"toc-item-num\">1  </span>Infinite, lazy, data structures: examples from Haskell to Python</a></div><div class=\"lev2 toc-item\"><a href=\"#A-simpler,-finite,-list\" data-toc-modified-id=\"A-simpler,-finite,-list-1.1\"><span class=\"toc-item-num\">1.1  </span>A simpler, finite, list</a></div><div class=\"lev2 toc-item\"><a href=\"#Python-infinite-lists\" data-toc-modified-id=\"Python-infinite-lists-1.2\"><span class=\"toc-item-num\">1.2  </span>Python infinite lists</a></div><div class=\"lev3 toc-item\"><a href=\"#You-can't-sum-an-infinite-datastructure\" data-toc-modified-id=\"You-can't-sum-an-infinite-datastructure-1.2.1\"><span class=\"toc-item-num\">1.2.1  </span>You can't sum an infinite datastructure</a></div><div class=\"lev3 toc-item\"><a href=\"#You-can-filter-an-infinite-list-however\" data-toc-modified-id=\"You-can-filter-an-infinite-list-however-1.2.2\"><span class=\"toc-item-num\">1.2.2  </span>You can filter an infinite list however</a></div><div class=\"lev1 toc-item\"><a href=\"#Generate-all-primes:-Naive\" data-toc-modified-id=\"Generate-all-primes:-Naive-2\"><span class=\"toc-item-num\">2  </span>Generate all primes: Naive</a></div><div class=\"lev1 toc-item\"><a href=\"#Generate-all-primes:-Sieve\" data-toc-modified-id=\"Generate-all-primes:-Sieve-3\"><span class=\"toc-item-num\">3  </span>Generate all primes: Sieve</a></div><div class=\"lev2 toc-item\"><a href=\"#Comparison\" data-toc-modified-id=\"Comparison-3.1\"><span class=\"toc-item-num\">3.1  </span>Comparison</a></div><div class=\"lev1 toc-item\"><a href=\"#Generate-all-primes:-Iterative-sieve\" data-toc-modified-id=\"Generate-all-primes:-Iterative-sieve-4\"><span class=\"toc-item-num\">4  </span>Generate all primes: Iterative sieve</a></div><div class=\"lev2 toc-item\"><a href=\"#Comparison\" data-toc-modified-id=\"Comparison-4.1\"><span class=\"toc-item-num\">4.1  </span>Comparison</a></div><div class=\"lev1 toc-item\"><a href=\"#Rosetta-code-task-examples\" data-toc-modified-id=\"Rosetta-code-task-examples-5\"><span class=\"toc-item-num\">5  </span><a href=\"http://rosettacode.org/wiki/Extensible_prime_generator\" target=\"_blank\">Rosetta code</a> task examples</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-first-twenty-primes.\" data-toc-modified-id=\"Show-the-first-twenty-primes.-5.1\"><span class=\"toc-item-num\">5.1  </span>Show the first twenty primes.</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-primes-between-100-and-150\" data-toc-modified-id=\"Show-the-primes-between-100-and-150-5.2\"><span class=\"toc-item-num\">5.2  </span>Show the primes between 100 and 150</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-number-of-primes-between-7,700-and-8,000.\" data-toc-modified-id=\"Show-the-number-of-primes-between-7,700-and-8,000.-5.3\"><span class=\"toc-item-num\">5.3  </span>Show the number of primes between 7,700 and 8,000.</a></div><div class=\"lev2 toc-item\"><a href=\"#Show-the-10,000th-prime.\" data-toc-modified-id=\"Show-the-10,000th-prime.-5.4\"><span class=\"toc-item-num\">5.4  </span>Show the 10,000th prime.</a></div>"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	" # Infinite, lazy, data structures: examples from Haskell to Python\n",
	" \n",
	" From this computerphile video [here](https://www.youtube.com/watch?v=bnRNiE_OVWA) - but with the examples converted to Python.\n",
	" \n",
	" \n",
	" ## A simpler, finite, list\n",
	" Python rangesact like its indexing in that it starts from zero and its ranges go up to, but less than the end value.\n",
	" \n",
	" The range of 20 integers from one to twenty is given by:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 1,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"range(1, 21)"
	]
	},
	"execution_count": 1,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"range(1, 21)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Ranges are there own data structure that can be converted to a list of actual values"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 2,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]"
	]
	},
	"execution_count": 2,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(range(1, 21))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can sum the range of numbers like so:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 3,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"210"
	]
	},
	"execution_count": 3,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"sum(range(1, 21))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Taking the squares of all the values, in a way that creats a list can be done using a list comprehension<br>\n",
	"(The map function does exist in Python however)."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 4,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[1,\n",
	" 4,\n",
	" 9,\n",
	" 16,\n",
	" 25,\n",
	" 36,\n",
	" 49,\n",
	" 64,\n",
	" 81,\n",
	" 100,\n",
	" 121,\n",
	" 144,\n",
	" 169,\n",
	" 196,\n",
	" 225,\n",
	" 256,\n",
	" 289,\n",
	" 324,\n",
	" 361,\n",
	" 400]"
	]
	},
	"execution_count": 4,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"[x**2 for x in range(1, 21)]"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"One can filter to keep all the even numbers using a conditional clause in the list comprehension.<br>\n",
	"(Python does have the filterfalse function in the itertools library which is more like the Haskell example)."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 5,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]"
	]
	},
	"execution_count": 5,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"[x for x in range(1, 21) if x % 2 == 0]"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"The sum of the squares of all the even numbers between 1 and 20 is given by the combination"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 6,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"1540"
	]
	},
	"execution_count": 6,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"sum(x**2 for x in range(1, 21) if x % 2 == 0)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Python infinite lists\n",
	"Are not really lists, but iterators that can iterate over an infinite succession of values.\n",
	"\n",
	"This is where we will use members of the itertools library so lets slurp them in"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 7,
	"metadata": {},
	"outputs": [],
	"source": [
	"from itertools import *"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"To represent all the integers from 1 to infinity we can use count"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 8,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"count(1)"
	]
	},
	"execution_count": 8,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"count(1)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Count is a data structure, not a list. "
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can show the first ten items from the count using islice and list"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 9,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]"
	]
	},
	"execution_count": 9,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(islice(count(1), 10))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### You can't sum an infinite datastructure\n",
	"Any function that needs to read to the end of an infinite data structure to produce its output will just hang as it goes through successive items without end. You therefore cannot use sum() on the infinite count.\n",
	"\n",
	"### You can filter an infinite list however\n",
	"This will produce another infinite iterator of filtered values. (We use islice and list to show the first few) "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 10,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]"
	]
	},
	"execution_count": 10,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(islice((x for x in count(1) if x % 2 == 0), 10))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"the inner list comprehension will produce successive values; the islice call returns a finite number of results to the list, so the above command terminates."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# Generate all primes: Naive\n",
	"A prime has factors of only one and itself."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 11,
	"metadata": {},
	"outputs": [],
	"source": [
	"def factors(n):\n",
	" return [x for x in range(1, n +1) if n % x == 0]"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 12,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[1, 7]"
	]
	},
	"execution_count": 12,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"factors(7)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 13,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[1, 3, 5, 15]"
	]
	},
	"execution_count": 13,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"factors(15)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 14,
	"metadata": {},
	"outputs": [],
	"source": [
	"def isprime(n):\n",
	" return factors(n) == [1, n]\n",
	" "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 15,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"True"
	]
	},
	"execution_count": 15,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"isprime(7)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 16,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"True"
	]
	},
	"execution_count": 16,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"isprime(17)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can create a generator of all prime numbers by filtering the count of integers greater than 0, like this:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 17,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"<generator object <genexpr> at 0x00000298CF0F7780>"
	]
	},
	"execution_count": 17,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"(x for x in count(1) if isprime(x))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can show the first 10 primes so generated"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 18,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]"
	]
	},
	"execution_count": 18,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(islice((x for x in count(1) if isprime(x)), 10))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Lets get a time to generate the first 1_000 primes:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 19,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"1 loop, best of 3: 2.6 s per loop\n"
	]
	}
	],
	"source": [
	"%timeit list(islice((x for x in count(1) if isprime(x)), 1_000))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# Generate all primes: Sieve\n",
	"Using the sieve of Eratosthenes on an infinite stream\n",
	"\n",
	"Steps\n",
	"\n",
	"1. Generate the count 2, 3, ...\n",
	"2. Mark the first value, p, as prime.\n",
	"3. Remove all multipliers of p from the count.\n",
	"4. Return to step 2\n",
	"\n"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 25,
	"metadata": {},
	"outputs": [],
	"source": [
	"def sieve(sieved):\n",
	" prime = next(sieved)\n",
	" yield prime\n",
	" yield from sieve(x for x in sieved if x % prime != 0)\n",
	" \n",
	"all_primes = sieve(count(2))\n"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can show the first 10 primes so generated"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 26,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]"
	]
	},
	"execution_count": 26,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(islice(sieve(count(2)), 10))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Lets get a time to generate the first 1_000 primes:\n",
	"\n",
	"(Python does not optimise recursion so we will need to increase the recursion limit)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 45,
	"metadata": {},
	"outputs": [],
	"source": [
	"import sys\n",
	"sys.setrecursionlimit(10000)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 46,
	"metadata": {
	"scrolled": true
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"10 loops, best of 3: 150 ms per loop\n"
	]
	}
	],
	"source": [
	"%timeit list(islice(sieve(count(2)), 1_000))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Comparison\n",
	"This new method is around 20 times faster than the previous naive one, but suffers from having a low recursion limit.\n",
	"\n",
	"Better still to use iteration rather than recursion in Python\n"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# Generate all primes: Iterative sieve\n",
	"Using the sieve of Eratosthenes on an infinite stream but iteratively\n",
	"\n"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 58,
	"metadata": {},
	"outputs": [],
	"source": [
	"def isieve(sieved):\n",
	" prime = next(sieved)\n",
	" yield prime\n",
	" primes = [prime]\n",
	" for x in sieved:\n",
	" if any(x % prime == 0 for prime in primes):\n",
	" continue\n",
	" yield x\n",
	" primes.append(x)\n",
	" "
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can show the first 10 primes so generated"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 59,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]"
	]
	},
	"execution_count": 59,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(islice(isieve(count(2)), 10))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Lets get a time to generate the first 1_000 primes:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 60,
	"metadata": {
	"scrolled": true
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"10 loops, best of 3: 67 ms per loop\n"
	]
	}
	],
	"source": [
	"%timeit list(islice(isieve(count(2)), 1_000))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Comparison\n",
	"This new iterative sieve is over twice as fast as the previous recursive sieve, and 40 times faster than the naive algorithm whilst avoiding any recursion limit.\n"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# [Rosetta code](http://rosettacode.org/wiki/Extensible_prime_generator) task examples\n",
	"\n",
	"## Show the first twenty primes."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 62,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]"
	]
	},
	"execution_count": 62,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"list(islice(isieve(count(2)), 20))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Show the primes between 100 and 150"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 63,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"[101, 103, 107, 109, 113, 127, 131, 137, 139, 149]"
	]
	},
	"execution_count": 63,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"def leq_150(x): return x <= 150\n",
	"\n",
	"[x for x in takewhile(leq_150, isieve(count(2))) if x >= 100]"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Show the number of primes between 7,700 and 8,000."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 64,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"30"
	]
	},
	"execution_count": 64,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"def leq_8000(x): return x <= 8000\n",
	"\n",
	"sum(1 for x in takewhile(leq_8000, isieve(count(2))) if x >= 7700)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Show the 10,000th prime."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 67,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"104729"
	]
	},
	"execution_count": 67,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"next(islice(isieve(count(2)), 10_000-1, 10_000))"
	]
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "Python [default]",
	"language": "python",
	"name": "python3"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython3",
	"version": "3.6.6"
	},
	"nav_menu": {},
	"toc": {
	"navigate_menu": true,
	"number_sections": true,
	"sideBar": true,
	"threshold": 6,
	"toc_cell": true,
	"toc_section_display": "block",
	"toc_window_display": false
	}
	},
	"nbformat": 4,
	"nbformat_minor": 2
	}