jgrotts/parallel_loops_example.ipynb

## parallel_loops_example.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recently, I was working with image data and used a for-loop to resize the images. The task is embarrassingly parallel, but I wasn't familiar with running for-loops in parallel in Python. This worksheet provides a template for parallelizing a for-loop. More information on parallel computing can be found [here](https://en.wikipedia.org/wiki/Parallel_computing).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np # linear algebra\n",
    "import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\n",
    "import matplotlib.pyplot as plt\n",
    "import cv2\n",
    "from tqdm import tqdm\n",
    "from joblib import Parallel, delayed\n",
    "from joblib import load, dump\n",
    "from time import time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data is from the Kaggle competition [Planet: Understanding the Amazon from Space](https://www.kaggle.com/c/planet-understanding-the-amazon-from-space). The images used in this workbook are from the training dataset. I'm using the [cv2](http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_gui/py_image_display/py_image_display.html) package to read and resize images. [tqdm](https://pypi.python.org/pypi/tqdm) is a neat package that displays a progress bar as your loop runs. I use the *threading* backend of joblib because the default *multiprocessing* backend would not work on my Windows machine.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████████████████████████████| 40479/40479 [00:14<00:00, 2861.35it/s]\n",
      "100%|██████████████████████████████████| 40479/40479 [00:11<00:00, 3613.34it/s]\n",
      "100%|███████████████████████████████████| 40479/40479 [00:46<00:00, 862.62it/s]\n"
     ]
    }
   ],
   "source": [
    "def image_resize(file_hold):\n",
    "    # Read one training image by name and resize it to 32x32 pixels.\n",
    "    img = cv2.imread('...\\\\train-jpg\\\\{}.jpg'.format(file_hold))\n",
    "    return cv2.resize(img, (32, 32))\n",
    "\n",
    "if __name__ == '__main__':\n",
    "\n",
    "    df_train = pd.read_csv('...\\\\train_v2.csv')\n",
    "\n",
    "    # Parallel run with 4 worker threads.\n",
    "    t0_4 = time()\n",
    "    results_4 = Parallel(n_jobs=4, backend=\"threading\")(delayed(image_resize)(f) for f in tqdm(df_train.values[:,0], miniters=1000))\n",
    "    t1_4 = time()\n",
    "\n",
    "    # Parallel run with 8 worker threads.\n",
    "    t0_8 = time()\n",
    "    results_8 = Parallel(n_jobs=8, backend=\"threading\")(delayed(image_resize)(f) for f in tqdm(df_train.values[:,0], miniters=1000))\n",
    "    t1_8 = time()\n",
    "\n",
    "    # Plain serial for-loop as the baseline.\n",
    "    t0_0 = time()\n",
    "    results_0 = []\n",
    "    for f in tqdm(df_train.values[:,0], miniters=1000):\n",
    "        results_0.append(image_resize(f))\n",
    "    t1_0 = time()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The run time for the for-loop without running in parallel was 46.963686 seconds\n",
      "The run time for the for-loop in parallel with 4 threads was 14.210813 seconds\n",
      "The run time for the for-loop in parallel with 8 threads was 11.293646 seconds\n"
     ]
    }
   ],
   "source": [
    "print(\"The run time for the for-loop without running in parallel was %f seconds\" %(t1_0 - t0_0))\n",
    "print(\"The run time for the for-loop in parallel with 4 threads was %f seconds\" %(t1_4 - t0_4))\n",
    "print(\"The run time for the for-loop in parallel with 8 threads was %f seconds\" %(t1_8 - t0_8))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Going from 4 to 8 threads does not cut the run time in half (14.2 s versus 11.3 s) because the threads incur a high overhead for data I/O. This is a well-known phenomenon: adding more workers gives diminishing returns, so more cores are not always better."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
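
As a rough check on the closing remark about diminishing returns, the timings printed by the notebook can be turned into a speedup and a per-worker efficiency. This is only a sketch: the names `speedup`, `efficiency`, `SERIAL_SECONDS`, and `THREADED_SECONDS` are illustrative and do not appear in the notebook, and the numbers are copied from its recorded output rather than re-measured.

```python
# Wall-clock times in seconds, copied from the notebook's printed output.
SERIAL_SECONDS = 46.963686
THREADED_SECONDS = {4: 14.210813, 8: 11.293646}  # n_jobs -> seconds


def speedup(serial_seconds, parallel_seconds):
    """How many times faster the parallel run finished than the serial run."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, n_workers):
    """Speedup per worker; 1.0 would mean perfect linear scaling."""
    return speedup(serial_seconds, parallel_seconds) / n_workers


for n_workers, seconds in sorted(THREADED_SECONDS.items()):
    s = speedup(SERIAL_SECONDS, seconds)
    e = efficiency(SERIAL_SECONDS, seconds, n_workers)
    print("n_jobs=%d: speedup %.2fx, efficiency %.2f" % (n_workers, s, e))
```

With these timings, 4 threads give roughly a 3.3x speedup (efficiency about 0.83), while 8 threads give roughly 4.2x (efficiency about 0.52), so doubling the thread count bought far less than a doubling in speed.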