tanelikaivola/Map_vs_for_loop.ipynb

## Map_vs_for_loop.ipynb
{
 "metadata": {
  "name": "Map_vs_for_loop"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Map vs For Loop"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "I've been looking at a lot of mapping functions today and I thought I would see how maps fare against a simple for loop."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def convert_for1(data):\n",
      "    output = []\n",
      "    for line in data:\n",
      "        x,y = line.split(',')\n",
      "        output+=[[float(x),float(y)]]\n",
      "    return output\n",
      "\n",
      "def convert_for2(data):\n",
      "    output = []\n",
      "    for line in data:\n",
      "        x,y = line.split(',')\n",
      "        output.append([float(x),float(y)])\n",
      "    return output\n",
      "\n",
      "def convert_map(data):\n",
      "    \"\"\"This is NOT the same as convert_for*: it accepts any number of comma-separated values per data item\"\"\"\n",
      "    return [list(map(float, x.split(','))) for x in data]\n",
      "\n",
      "def convert_list_comprehension(data):\n",
      "    \"\"\"This is the same as convert_map, just faster and more readable\"\"\"\n",
      "    return [[float(y) for y in x.split(',')] for x in data]\n",
      "\n",
      "def convert_flattening_list_comprehension(data):\n",
      "    \"\"\"OK, this is not equivalent (it returns one flat list), but it's even faster (* when given enough data)\"\"\"\n",
      "    return [float(y) for x in data for y in x.split(',')]\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 1
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Well I have to admit that the code looks a *lot* cleaner using the mapping function."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "text = \"\"\" 1,2\n",
      "           2,3\n",
      "           3,4\"\"\"\n",
      "data = text.splitlines()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 2
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Checking to see that they return the same thing..."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "convert_for1(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 3,
       "text": [
        "[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]"
       ]
      }
     ],
     "prompt_number": 3
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "convert_for2(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 4,
       "text": [
        "[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]"
       ]
      }
     ],
     "prompt_number": 4
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "convert_map(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 5,
       "text": [
        "[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]"
       ]
      }
     ],
     "prompt_number": 5
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "convert_list_comprehension(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 6,
       "text": [
        "[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]"
       ]
      }
     ],
     "prompt_number": 6
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "convert_flattening_list_comprehension(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 7,
       "text": [
        "[1.0, 2.0, 2.0, 3.0, 3.0, 4.0]"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's check the speed now. Do fewer lines of code mean a faster function?"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit convert_for1(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "100000 loops, best of 3: 5.21 \u00b5s per loop\n"
       ]
      }
     ],
     "prompt_number": 8
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit convert_for2(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "100000 loops, best of 3: 5.13 \u00b5s per loop\n"
       ]
      }
     ],
     "prompt_number": 9
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit convert_map(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "100000 loops, best of 3: 6.79 \u00b5s per loop\n"
       ]
      }
     ],
     "prompt_number": 10
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit convert_list_comprehension(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "100000 loops, best of 3: 4.81 \u00b5s per loop\n"
       ]
      }
     ],
     "prompt_number": 11
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit convert_flattening_list_comprehension(data)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "100000 loops, best of 3: 3.94 \u00b5s per loop\n"
       ]
      }
     ],
     "prompt_number": 12
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "> Well, I guess that settles it for me. I find the convert_for1 loop the most readable, and now I see that it is also faster than the map version. That is the approach I'm using from now on.\n",
      "\n",
      "Can I please change your mind?\n",
      "\n",
      "> Oh, I know that using numba or cython would be faster, but this is the approach I'll use unless I am looking for real speed.\n"
     ]
    }
   ],
   "metadata": {}
  }
 ]
}
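
The three-line sample is small enough that per-call overhead likely dominates the timings, and the flattening docstring hints that the ranking depends on input size. Below is a minimal sketch of how the comparison could be rerun on a larger input with the standard `timeit` module instead of the `%timeit` magic. The name `big_data` and the 10,000-line size are assumptions made for illustration, and `convert_for` here uses a plain `append` of one `[x, y]` pair per line. No timings are quoted because they depend on the machine and Python version.

```python
import timeit


def convert_for(data):
    # Explicit loop: split each line into exactly two fields.
    output = []
    for line in data:
        x, y = line.split(',')
        output.append([float(x), float(y)])
    return output


def convert_map(data):
    # map() over the fields; accepts any number of fields per line.
    return [list(map(float, line.split(','))) for line in data]


def convert_list_comprehension(data):
    # Nested list comprehension; same result as convert_map.
    return [[float(v) for v in line.split(',')] for line in data]


# Hypothetical larger input (not from the original notebook).
big_data = ["%d,%d" % (i, i + 1) for i in range(10000)]

# Confirm the variants agree before comparing their speed.
expected = convert_for(big_data)
assert convert_map(big_data) == expected
assert convert_list_comprehension(big_data) == expected

for func in (convert_for, convert_map, convert_list_comprehension):
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(lambda: func(big_data), number=10, repeat=3))
    print("%-28s %.2f ms per call" % (func.__name__, best / 10 * 1e3))
```

Checking equality first matters here: a speed comparison only means something if the functions being compared return the same structure.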