@bilzard
Created November 18, 2022 11:15
Speed comparison of filter operations
{
"cells": [
{
"cell_type": "markdown",
"id": "56ae9290-4ece-4fe3-8d25-c57b80dbc07e",
"metadata": {},
"source": [
"# Comparing the speed of filter operations on a list and a NumPy array\n",
"\n",
"The following four ways of writing a filter were compared for speed.\n",
"\n",
"1. `result[:] = filter(<filter_func>, xs)`\n",
"2. `result[:] = (x for x in xs if <filter_cond>)`\n",
"3. `result[:] = [x for x in xs if <filter_cond>]`\n",
"4. `result = xs[<filter_cond>]`"
]
},
{
"cell_type": "code",
"execution_count": 131,
"id": "5a25ccfe-8c91-4aaf-98a2-4f23b72053b2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The lab_black extension is already loaded. To reload it, use:\n",
" %reload_ext lab_black\n"
]
}
],
"source": [
"%load_ext lab_black\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"\n",
"plt.style.use(\"ggplot\")"
]
},
{
"cell_type": "markdown",
"id": "f1b20d51-1e55-47f5-ab19-a37ee21c26c3",
"metadata": {},
"source": [
"cf. `%timeit -o`:\n",
"https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit"
]
},
{
"cell_type": "code",
"execution_count": 123,
"id": "0a3745a1-2cf3-4674-ab14-90896a015327",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4.74 ms ± 104 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"3.86 ms ± 6.22 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"3.67 ms ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"332 µs ± 183 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n"
]
}
],
"source": [
"sim_result = {}\n",
"\n",
"# filter\n",
"xs = np.random.randint(0, 256, 100_000)\n",
"result = []\n",
"sim_result[\"filter\"] = %timeit -o result[:] = filter(lambda x: x > 128, xs)\n",
"\n",
"# generator expression, not a tuple comprehension (the dictionary key below is left unchanged to match the recorded output)\n",
"xs = np.random.randint(0, 256, 100_000)\n",
"result = []\n",
"sim_result[\"tuple comprehension\"] = %timeit -o result[:] = (x for x in xs if x > 128)\n",
"\n",
"# list comprehension\n",
"xs = np.random.randint(0, 256, 100_000)\n",
"result = []\n",
"sim_result[\"list comprehension\"] = %timeit -o result[:] = [x for x in xs if x > 128]\n",
"\n",
"# filter with numpy\n",
"xs = np.random.randint(0, 256, 100_000)\n",
"sim_result[\"numpy\"] = %timeit -o result = xs[xs > 128]"
]
},
{
"cell_type": "code",
"execution_count": 153,
"id": "6c85aec6-59de-47c5-9e5f-39bbbae880f6",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>mean</th>\n",
" <th>std</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>filter</td>\n",
" <td>0.004742</td>\n",
" <td>1.040453e-04</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>tuple comprehension</td>\n",
" <td>0.003856</td>\n",
" <td>6.223687e-06</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>list comprehension</td>\n",
" <td>0.003672</td>\n",
" <td>1.044400e-06</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>numpy</td>\n",
" <td>0.000332</td>\n",
" <td>1.827475e-07</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" name mean std\n",
"0 filter 0.004742 1.040453e-04\n",
"1 tuple comprehension 0.003856 6.223687e-06\n",
"2 list comprehension 0.003672 1.044400e-06\n",
"3 numpy 0.000332 1.827475e-07"
]
},
"execution_count": 153,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = []\n",
"for sim, result in sim_result.items():\n",
" data.append(\n",
" {\n",
" \"name\": sim,\n",
" \"mean\": np.mean(np.array(result.all_runs) / result.loops),\n",
" \"std\": np.std(np.array(result.all_runs) / result.loops),\n",
" }\n",
" )\n",
"data = pd.DataFrame(data)\n",
"data"
]
},
{
"cell_type": "code",
"execution_count": 154,
"id": "06fa5cdd-23aa-47a0-ab26-a127a3f856a4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Text(0.5, 0, 'algorithm'), Text(0, 0.5, 'Execution Time (sec)')]"
]
},
"execution_count": 154,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 800x300 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"_, ax = plt.subplots(figsize=(8, 3))\n",
"ax.bar(data[\"name\"], data[\"mean\"], yerr=data[\"std\"], ecolor=\"black\", capsize=10)\n",
"ax.set(xlabel=\"algorithm\", ylabel=\"Execution Time (sec)\")"
]
},
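{
"cell_type": "markdown",
"id": "10d6e0b4-9550-408c-a021-9a3381be8be6",
"metadata": {},
"source": [
"## Results\n",
"\n",
"NumPy was by far the fastest: about 332 µs per loop versus 3.67 ms for the list comprehension, more than 10 times faster. Among the list-based variants, the list comprehension was the fastest."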
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
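
The comparison above relies on the IPython `%timeit` magic, so it only runs inside a notebook. The following is a minimal sketch of the same four variants using the standard-library `timeit` module, so it can run as a plain script. It differs from the notebook in two ways, both of which are assumptions of this sketch: the array is seeded for reproducibility, and variants 1 to 3 iterate over a real Python list (`xs_list`) rather than over the NumPy array `xs` itself, which in the notebook yields NumPy scalar objects on each iteration. The names `bench`, `variants`, and `xs_list` are hypothetical helpers introduced here, and the timings it prints will vary by machine.

```python
import timeit

import numpy as np

# Assumption: a fixed seed; the notebook uses unseeded np.random.randint.
rng = np.random.default_rng(0)
xs = rng.integers(0, 256, 100_000)  # same size and value range as the notebook
xs_list = xs.tolist()               # plain Python ints for the list-based variants


def bench(func, number=5, repeat=3):
    """Return the best per-call time in seconds for a zero-argument callable."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


# Each variant returns the filtered elements so they can be cross-checked.
variants = {
    "filter": lambda: list(filter(lambda x: x > 128, xs_list)),
    "generator expression": lambda: list(x for x in xs_list if x > 128),
    "list comprehension": lambda: [x for x in xs_list if x > 128],
    "numpy mask": lambda: xs[xs > 128],
}

# Sanity check: every variant must select exactly the same elements.
expected = [x for x in xs_list if x > 128]
for name, func in variants.items():
    assert [int(v) for v in func()] == expected, name

timings = {name: bench(func) for name, func in variants.items()}
for name, seconds in timings.items():
    print(f"{name:>22}: {seconds * 1e3:.3f} ms per call")
```

Reporting the minimum over repeats is a common convention with `timeit`, whereas `%timeit` reports mean and standard deviation, so the numbers are not directly comparable with the table in the notebook.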