@tanelikaivola
Forked from anonymous/Map_vs_for_loop.ipynb
Last active December 19, 2015 08:39
{
"metadata": {
"name": "Map_vs_for_loop"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Map vs For Loop"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I've been looking at a lot of mapping functions today and I thought I would see how maps fare against a simple for loop."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def convert_for1(data):\n",
"    output = []\n",
"    for line in data:\n",
"        x, y = line.split(',')\n",
"        output += [[float(x), float(y)]]\n",
"    return output\n",
"\n",
"def convert_for2(data):\n",
"    output = []\n",
"    for line in data:\n",
"        x, y = line.split(',')\n",
"        # Note: the doubled brackets append a one-element list holding the pair,\n",
"        # so each item comes out as [[x, y]] rather than [x, y].\n",
"        output.append([[float(x), float(y)]])\n",
"    return output\n",
"\n",
"def convert_map(data):\n",
"    \"\"\"This is NOT the same as convert_for*: it accepts any number of comma-separated values per data item.\"\"\"\n",
"    return [list(map(float, x.split(','))) for x in data]\n",
"\n",
"def convert_list_comprehension(data):\n",
"    \"\"\"This is the same as convert_map, just faster and more readable.\"\"\"\n",
"    return [[float(y) for y in x.split(',')] for x in data]\n",
"\n",
"def convert_flattening_list_comprehension(data):\n",
"    \"\"\"OK, this is not equivalent, but it's even faster (* when given enough data).\"\"\"\n",
"    return [float(y) for x in data for y in x.split(',')]\n"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Well, I have to admit that the code looks a *lot* cleaner using the mapping function."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"text = \"\"\" 1,2\n",
" 2,3\n",
" 3,4\"\"\"\n",
"data = text.splitlines()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Checking to see that they return the same thing..."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"convert_for1(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 3,
"text": [
"[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"convert_for2(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 4,
"text": [
"[[[1.0, 2.0]], [[2.0, 3.0]], [[3.0, 4.0]]]"
]
}
],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"convert_map(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 5,
"text": [
"[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]"
]
}
],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"convert_list_comprehension(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 6,
"text": [
"[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"convert_flattening_list_comprehension(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 7,
"text": [
"[1.0, 2.0, 2.0, 3.0, 3.0, 4.0]"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's check the speed now. Do fewer lines of code mean a faster function?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit convert_for1(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100000 loops, best of 3: 5.21 \u00b5s per loop\n"
]
}
],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit convert_for2(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100000 loops, best of 3: 5.13 \u00b5s per loop\n"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit convert_map(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100000 loops, best of 3: 6.79 \u00b5s per loop\n"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit convert_list_comprehension(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100000 loops, best of 3: 4.81 \u00b5s per loop\n"
]
}
],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit convert_flattening_list_comprehension(data)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100000 loops, best of 3: 3.94 \u00b5s per loop\n"
]
}
],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Well, I guess that settles it for me. I find the convert_for1 loop the most readable, and now I see that it is faster too. That is the approach I'm using from now on.\n",
"\n",
"Can I please change your mind?\n",
"\n",
"> Oh, I know that using Numba or Cython would be faster, but this is the approach I'll use unless I am looking for real speed.\n"
]
}
],
"metadata": {}
}
]
}