@gyu-don
Created February 21, 2024 15:58
1週間で学べるJulia数値計算プログラミング (Learn Julia Numerical Computing Programming in One Week), 7.2 to 7.3
{
"cells": [
{
"cell_type": "markdown",
"id": "1eb65508-53c8-4b02-9c8e-079c517ee2d3",
"metadata": {},
"source": [
"memo: パフォーマンスを気にするなら、こちらも見た方がいい(内容としてはかぶる部分が多い) https://docs.julialang.org/en/v1/manual/performance-tips/"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e810d334-9dec-468e-a91f-7295cb95513f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"test1 (generic function with 1 method)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function test1()\n",
" s = 0.0\n",
" for i in 1:1000\n",
" s += cos(i)\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3b4ec3c0-a721-4958-82d4-ecdcbe7a4e98",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.5379859612848843"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test1()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "0b507a2d-594e-4158-b84f-b4799e902039",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.000014 seconds\n"
]
},
{
"data": {
"text/plain": [
"0.5379859612848843"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@time test1()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0731ff00-8931-4ce5-86c3-709cd34b192d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.000016 seconds\n"
]
},
{
"data": {
"text/plain": [
"0.5379859612848843"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@time test1()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6db4d9c2-37fa-44fa-9099-d5b5ec1aeb3e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.000017 seconds\n"
]
},
{
"data": {
"text/plain": [
"0.5379859612848843"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@time test1()"
]
},
{
"cell_type": "markdown",
"id": "0bfda5cd-3b2a-4f39-bcd0-e065022ebae4",
"metadata": {},
"source": [
"# BenchmarkTools\n",
"https://juliaci.github.io/BenchmarkTools.jl/dev/manual/"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "33d2de8f-6214-477e-9eca-c0edab94604a",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m registry at `~/.julia/registries/General.toml`\n",
"\u001b[32m\u001b[1m Resolving\u001b[22m\u001b[39m package versions...\n",
"\u001b[32m\u001b[1m Installed\u001b[22m\u001b[39m BenchmarkTools ─ v1.4.0\n",
"\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m `~/.julia/environments/v1.9/Project.toml`\n",
" \u001b[90m[6e4b80f9] \u001b[39m\u001b[92m+ BenchmarkTools v1.4.0\u001b[39m\n",
"\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m `~/.julia/environments/v1.9/Manifest.toml`\n",
" \u001b[90m[6e4b80f9] \u001b[39m\u001b[92m+ BenchmarkTools v1.4.0\u001b[39m\n",
" \u001b[90m[9abbd945] \u001b[39m\u001b[92m+ Profile\u001b[39m\n",
"\u001b[32m\u001b[1mPrecompiling\u001b[22m\u001b[39m project...\n",
"\u001b[32m ✓ \u001b[39mBenchmarkTools\n",
" 1 dependency successfully precompiled in 2 seconds. 332 already precompiled.\n"
]
}
],
"source": [
"using Pkg\n",
"Pkg.add(\"BenchmarkTools\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f861cf3f-6b17-47e1-82fb-e3e21bb420c1",
"metadata": {},
"outputs": [],
"source": [
"using BenchmarkTools"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "88fcb4f6-7b78-4511-ab7f-2a90a05de934",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 10.259 μs (0 allocations: 0 bytes)\n"
]
},
{
"data": {
"text/plain": [
"0.5379859612848843"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@btime test1()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "407f583e-bc59-4d8f-96c8-a00ac6b0ddc5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 10.299 μs (0 allocations: 0 bytes)\n"
]
},
{
"data": {
"text/plain": [
"0.5379859612848843"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@btime test1()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "72a3c1e6-f03a-425b-a9a2-ee350b6ffbf5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m10.209 μs\u001b[22m\u001b[39m … \u001b[35m 28.804 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m10.490 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m10.559 μs\u001b[22m\u001b[39m ± \u001b[32m522.463 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▃\u001b[39m \u001b[39m \u001b[39m▄\u001b[39m▆\u001b[39m▆\u001b[34m█\u001b[39m\u001b[39m▃\u001b[32m▅\u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\n",
" \u001b[39m▃\u001b[39m▃\u001b[39m█\u001b[39m█\u001b[39m▄\u001b[39m▅\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▃\u001b[39m▅\u001b[39m \u001b[39m█\n",
" 10.2 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 12 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark test1()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "21c1b003-6457-4fb8-bece-a1b2e84874fe",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"test2 (generic function with 1 method)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"global t = Float64[1, 2, 3]\n",
"function test2()\n",
" s = 0.0\n",
" for i in 1:1000\n",
" s += cos(i)\n",
" t .+= s.*sin.(i)\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "c84165e4-2fc2-4f92-a838-57c7027f5ec6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m404.353 μs\u001b[22m\u001b[39m … \u001b[35m 1.760 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 73.95%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m426.975 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m433.875 μs\u001b[22m\u001b[39m ± \u001b[32m60.472 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.59% ± 3.31%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m▆\u001b[34m▅\u001b[39m\u001b[39m▁\u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▄\u001b[39m▆\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m▆\u001b[32m▆\u001b[39m\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m \u001b[39m▃\n",
" 404 μs\u001b[90m Histogram: frequency by time\u001b[39m 497 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m93.75 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m3000\u001b[39m."
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark test2()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f2b5d354-e967-4782-a345-fe332041c104",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"test2! (generic function with 1 method)"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function test2!(t)\n",
" s = 0.0\n",
" for i in 1:1000\n",
" s += cos(i)\n",
" t .+= s.*sin.(i)\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "66079c74-795f-4a69-a4db-72bd24067c7a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m28.214 μs\u001b[22m\u001b[39m … \u001b[35m85.952 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m29.275 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m30.176 μs\u001b[22m\u001b[39m ± \u001b[32m 3.283 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m▁\u001b[39m▂\u001b[39m▅\u001b[34m█\u001b[39m\u001b[39m▃\u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m▇\u001b[32m▇\u001b[39m\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▆\u001b[39m▄\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▃\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m \u001b[39m█\n",
" 28.2 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 46.6 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"t2 = Float64[1, 2, 3]\n",
"@benchmark test2!(t2)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "bab4ec45-4318-44e6-8893-ae013218db6a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m27.822 μs\u001b[22m\u001b[39m … \u001b[35m62.427 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m28.745 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m29.478 μs\u001b[22m\u001b[39m ± \u001b[32m 2.777 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m▁\u001b[39m \u001b[39m▅\u001b[34m█\u001b[39m\u001b[39m▃\u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m▅\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m▆\u001b[32m▅\u001b[39m\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m█\u001b[39m▇\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▃\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▃\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m \u001b[39m█\n",
" 27.8 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 43.4 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"global t3 = Float64[1, 2, 3]\n",
"@benchmark test2!(t3)"
]
},
{
"cell_type": "markdown",
"id": "3696ecfe-7e7c-4348-bcb2-9bbb45ee9a4f",
"metadata": {},
"source": [
"Performance Tipsを見る感じは、グローバル変数自体よりも、型のついていないグローバル変数が問題らしい。型をつけてみる。"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "d389b564-ed94-4f9a-88b3-da3306ec8f6c",
"metadata": {},
"outputs": [
{
"ename": "LoadError",
"evalue": "cannot set type for global t. It already has a value or is already set to a different type.",
"output_type": "error",
"traceback": [
"cannot set type for global t. It already has a value or is already set to a different type.",
"",
"Stacktrace:",
" [1] top-level scope",
" @ In[46]:1"
]
}
],
"source": [
"global t :: Vector{Float64} = Float64[1, 2, 3]"
]
},
{
"cell_type": "markdown",
"id": "1e70e351-f9ff-4ef4-a270-e4380ddfd5e4",
"metadata": {},
"source": [
"再定義は出来ないらしいので、新たに作り直す。 `t` の代わりに `t4` を使った `test2_2()` も作る。"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "423f0117-f87f-443d-83b4-d294b3d2c218",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"test2_2 (generic function with 1 method)"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"global t4 :: Vector{Float64} = Float64[1, 2, 3]\n",
"function test2_2()\n",
" s = 0.0\n",
" for i in 1:1000\n",
" s += cos(i)\n",
" t4 .+= s.*sin.(i)\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "eaa99df1-57fa-4041-b3e3-4cfa5ccddf91",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m28.314 μs\u001b[22m\u001b[39m … \u001b[35m51.598 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m29.415 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m30.175 μs\u001b[22m\u001b[39m ± \u001b[32m 2.661 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m▁\u001b[39m \u001b[39m▃\u001b[39m▅\u001b[34m█\u001b[39m\u001b[39m▄\u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[32m▇\u001b[39m\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▇\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▃\u001b[39m▅\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m \u001b[39m█\n",
" 28.3 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 43.2 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark test2_2()"
]
},
{
"cell_type": "markdown",
"id": "07c84b37-6344-459c-b709-cbbf42f24f12",
"metadata": {},
"source": [
"結果として、ローカル変数を使ったものとほぼ変わらなくなった"
]
},
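{
"cell_type": "markdown",
"id": "added-const-global-note",
"metadata": {},
"source": [
"A minimal sketch of one more variant, assuming the same loop as `test2_2()`: Performance Tips also suggests declaring a global as `const`, which fixes the binding so the compiler can rely on its type. The names `t5` and `test2_const` are hypothetical and not part of the original notes; no timings are quoted because they were not measured here."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-const-global-sketch",
"metadata": {},
"outputs": [],
"source": [
"using BenchmarkTools\n",
"\n",
"# Hypothetical names: t5 and test2_const are introduced only for this sketch.\n",
"# const fixes the binding; the array contents can still be mutated in place.\n",
"const t5 = Float64[1, 2, 3]\n",
"\n",
"function test2_const()\n",
"    s = 0.0\n",
"    for i in 1:1000\n",
"        s += cos(i)\n",
"        t5 .+= s .* sin(i)  # in-place broadcast update of the const global\n",
"    end\n",
"    s\n",
"end\n",
"\n",
"@benchmark test2_const()"
]
},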
{
"cell_type": "markdown",
"id": "64657311-67bf-43ab-b632-6c5c63670303",
"metadata": {},
"source": [
"注意: `global t = Float64[1, 2, 3]` としていたが、これは `t` の値の型であり、変数自体の型とは別(変数の型はどうやって見るんだ?)。ローカル変数でも同様だが、関数をコンパイルするときに型が固定できるチャンスがある。"
]
},
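{
"cell_type": "markdown",
"id": "added-binding-type-note",
"metadata": {},
"source": [
"One possible answer to the question in the note above, as a sketch: `Core.get_binding_type(module, name)` returns the declared type of a global binding rather than the type of its current value. As far as I know it requires Julia 1.9 or later and lives in `Core`, so treat its availability as an assumption rather than a stable API. `g_untyped` and `g_typed` are hypothetical names used only here."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-binding-type-sketch",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical globals, used only for this sketch.\n",
"g_untyped = Float64[1, 2, 3]                 # no declared type on the binding\n",
"g_typed::Vector{Float64} = Float64[1, 2, 3]  # declared type on the binding\n",
"\n",
"# Declared type of each binding (not typeof the current value).\n",
"# Assumption: Core.get_binding_type is available (Julia 1.9 or later).\n",
"declared_untyped = Core.get_binding_type(@__MODULE__, :g_untyped)\n",
"declared_typed = Core.get_binding_type(@__MODULE__, :g_typed)\n",
"\n",
"(typeof(g_untyped), declared_untyped, declared_typed)"
]
},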
{
"cell_type": "code",
"execution_count": 58,
"id": "38d650f6-8be3-4b39-bcd2-bffc676d5189",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"a\""
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"t = \"a\" # これは可能"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "15c5f16e-e4af-454a-9233-b2760aed6713",
"metadata": {},
"outputs": [
{
"ename": "LoadError",
"evalue": "MethodError: \u001b[0mCannot `convert` an object of type \u001b[92mString\u001b[39m\u001b[0m to an object of type \u001b[91mVector{Float64}\u001b[39m\n\n\u001b[0mClosest candidates are:\n\u001b[0m convert(::Type{T}, \u001b[91m::LinearAlgebra.Factorization\u001b[39m) where T<:AbstractArray\n\u001b[0m\u001b[90m @\u001b[39m \u001b[32mLinearAlgebra\u001b[39m \u001b[90m~/.julia/juliaup/julia-1.9.4+0.x64.linux.gnu/share/julia/stdlib/v1.9/LinearAlgebra/src/\u001b[39m\u001b[90m\u001b[4mfactorization.jl:59\u001b[24m\u001b[39m\n\u001b[0m convert(::Type{T}, \u001b[91m::AbstractArray\u001b[39m) where T<:Array\n\u001b[0m\u001b[90m @\u001b[39m \u001b[90mBase\u001b[39m \u001b[90m\u001b[4marray.jl:613\u001b[24m\u001b[39m\n\u001b[0m convert(::Type{T}, \u001b[91m::T\u001b[39m) where T<:AbstractArray\n\u001b[0m\u001b[90m @\u001b[39m \u001b[90mBase\u001b[39m \u001b[90m\u001b[4mabstractarray.jl:16\u001b[24m\u001b[39m\n\u001b[0m ...\n",
"output_type": "error",
"traceback": [
"MethodError: \u001b[0mCannot `convert` an object of type \u001b[92mString\u001b[39m\u001b[0m to an object of type \u001b[91mVector{Float64}\u001b[39m\n\n\u001b[0mClosest candidates are:\n\u001b[0m convert(::Type{T}, \u001b[91m::LinearAlgebra.Factorization\u001b[39m) where T<:AbstractArray\n\u001b[0m\u001b[90m @\u001b[39m \u001b[32mLinearAlgebra\u001b[39m \u001b[90m~/.julia/juliaup/julia-1.9.4+0.x64.linux.gnu/share/julia/stdlib/v1.9/LinearAlgebra/src/\u001b[39m\u001b[90m\u001b[4mfactorization.jl:59\u001b[24m\u001b[39m\n\u001b[0m convert(::Type{T}, \u001b[91m::AbstractArray\u001b[39m) where T<:Array\n\u001b[0m\u001b[90m @\u001b[39m \u001b[90mBase\u001b[39m \u001b[90m\u001b[4marray.jl:613\u001b[24m\u001b[39m\n\u001b[0m convert(::Type{T}, \u001b[91m::T\u001b[39m) where T<:AbstractArray\n\u001b[0m\u001b[90m @\u001b[39m \u001b[90mBase\u001b[39m \u001b[90m\u001b[4mabstractarray.jl:16\u001b[24m\u001b[39m\n\u001b[0m ...\n",
"",
"Stacktrace:",
" [1] top-level scope",
" @ In[59]:1"
]
}
],
"source": [
"t4 = \"a\" # これは型が合わない"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "e86e7c0d-d59c-4b2c-afea-a349e0627973",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"test2_3 (generic function with 1 method)"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Functorを作らなくても、これでも出来るはず\n",
"function test2_3(t)\n",
" function test2_inner()\n",
" s = 0.0\n",
" for i in 1:1000\n",
" s += cos(i)\n",
" t .+= s.*sin.(i)\n",
" end\n",
" s\n",
" end\n",
" test2_inner\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "0fdf1145-01d3-4d68-816c-1ef869b35321",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m27.802 μs\u001b[22m\u001b[39m … \u001b[35m48.261 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m28.644 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m28.802 μs\u001b[22m\u001b[39m ± \u001b[32m 1.014 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▆\u001b[34m█\u001b[39m\u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▇\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m▇\u001b[32m▃\u001b[39m\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m \u001b[39m▂\n",
" 27.8 μs\u001b[90m Histogram: frequency by time\u001b[39m 33.9 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = test2_3(Float64[1, 2, 3])\n",
"@benchmark data()"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "7b0f3cb9-5c6c-4eee-871a-93185abcc1af",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"test2_4 (generic function with 1 method)"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# さらに略記\n",
"test2_4(t) = () -> begin\n",
" s = 0.0\n",
" for i in 1:1000\n",
" s += cos(i)\n",
" t .+= s.*sin.(i)\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "58f204c0-a431-4974-ba12-ab71803ebcc4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m27.983 μs\u001b[22m\u001b[39m … \u001b[35m76.975 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m28.724 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m29.100 μs\u001b[22m\u001b[39m ± \u001b[32m 1.639 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m▁\u001b[39m \u001b[39m▅\u001b[39m█\u001b[34m▆\u001b[39m\u001b[39m▃\u001b[32m▁\u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m▇\u001b[39m█\u001b[39m▆\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▃\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m \u001b[39m█\n",
" 28 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 37 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = test2_4(Float64[1, 2, 3])\n",
"@benchmark data()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "f012bc76-45e6-4ecf-bc0c-2ecd086c23cb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MyTypeParametric{Float64}([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0 … 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"struct MyTypeAny\n",
" a\n",
"end\n",
"\n",
"struct MyTypeAbstract\n",
" a::Array{Real, 1}\n",
"end\n",
"\n",
"struct MyTypeConcrete\n",
" a::Array{Float64, 1}\n",
"end\n",
"\n",
"struct MyTypeParametric{T}\n",
" a::Array{T, 1}\n",
"end\n",
"\n",
"function test3(mytype)\n",
" n = 10000\n",
" for i in 1:n\n",
" mytype.a[i] = cos(i * mytype.a[i])\n",
" end\n",
" sum(mytype.a)\n",
"end\n",
"\n",
"n = 10000\n",
"a = ones(Float64, n)\n",
"\n",
"myany = MyTypeAny(a)\n",
"myabs = MyTypeAbstract(a)\n",
"mycon = MyTypeConcrete(a)\n",
"mypar = MyTypeParametric(a)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "d48474b9-02db-412a-9e48-3d290001b1ab",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 5616 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m796.081 μs\u001b[22m\u001b[39m … \u001b[35m 2.956 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 67.05%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m855.368 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m887.481 μs\u001b[22m\u001b[39m ± \u001b[32m173.916 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m2.65% ± 8.05%\n",
"\n",
" \u001b[39m▁\u001b[39m▆\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m▆\u001b[32m▃\u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▄\u001b[39m▅\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▄\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▄\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m \u001b[39m█\n",
" 796 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 1.98 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m913.56 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m58468\u001b[39m."
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark test3(myany)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "9b0bab23-d10b-421a-a22f-f8ac69c5a40a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 5526 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m798.596 μs\u001b[22m\u001b[39m … \u001b[35m 3.664 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 73.56%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m862.376 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m902.086 μs\u001b[22m\u001b[39m ± \u001b[32m214.735 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m3.20% ± 8.60%\n",
"\n",
" \u001b[39m▁\u001b[39m▅\u001b[39m█\u001b[34m▆\u001b[39m\u001b[32m▃\u001b[39m\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m▇\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▃\u001b[39m▄\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▄\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m \u001b[39m█\n",
" 799 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 2.25 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m921.52 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m58977\u001b[39m."
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark test3(myabs)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "edf23bad-a82d-4a4f-8faa-920e230dd967",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m131.638 μs\u001b[22m\u001b[39m … \u001b[35m200.088 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m137.549 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m138.168 μs\u001b[22m\u001b[39m ± \u001b[32m 2.705 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▃\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m▇\u001b[32m▇\u001b[39m\u001b[39m▅\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▃\n",
" \u001b[39m▃\u001b[39m▁\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m \u001b[39m█\n",
" 132 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 150 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark test3(mycon)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "d9acd871-e3d3-427f-8b99-5485e3f9a701",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m133.782 μs\u001b[22m\u001b[39m … \u001b[35m212.591 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m137.459 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m138.294 μs\u001b[22m\u001b[39m ± \u001b[32m 3.345 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▃\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m▇\u001b[39m\u001b[39m▇\u001b[32m▅\u001b[39m\u001b[39m▄\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\n",
" \u001b[39m▅\u001b[39m▄\u001b[39m▇\u001b[39m▅\u001b[39m▆\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▃\u001b[39m▂\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m \u001b[39m█\n",
" 134 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 153 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark test3(mypar)"
]
},
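{
"cell_type": "markdown",
"id": "sketch-field-types",
"metadata": {},
"source": [
"memo: a minimal sketch (my own addition, not from the book) of one way to look at why `myany` and `myabs` measured around 860 μs while `mycon` and `mypar` measured around 137 μs above. It only inspects the declared field types; the small 4-element array is an assumption made for illustration.\n",
"\n",
"```julia\n",
"struct MyTypeAny\n",
"    a\n",
"end\n",
"\n",
"struct MyTypeAbstract\n",
"    a::Array{Real, 1}\n",
"end\n",
"\n",
"struct MyTypeConcrete\n",
"    a::Array{Float64, 1}\n",
"end\n",
"\n",
"struct MyTypeParametric{T}\n",
"    a::Array{T, 1}\n",
"end\n",
"\n",
"a = ones(Float64, 4)  # small illustrative array (assumption)\n",
"\n",
"# Declared type of the field `a` in each struct.\n",
"println(fieldtype(MyTypeAny, :a) == Any)                                   # true\n",
"println(fieldtype(MyTypeParametric{Float64}, :a) == Array{Float64, 1})    # true\n",
"\n",
"# Array{Real, 1} is itself a concrete type, but its element type is abstract.\n",
"println(isconcretetype(Array{Real, 1}))          # true\n",
"println(isconcretetype(eltype(Array{Real, 1})))  # false\n",
"\n",
"# The default constructor converts its argument to the field type, so\n",
"# MyTypeAbstract holds a converted copy while MyTypeConcrete shares `a`.\n",
"println(MyTypeAbstract(a).a === a)  # false\n",
"println(MyTypeConcrete(a).a === a)  # true\n",
"```\n",
"\n",
"One consequence worth keeping in mind: in the benchmark above, `myany`, `mycon`, and `mypar` all mutate the same array `a`, whereas `myabs` works on its own converted copy."
]
},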
{
"cell_type": "markdown",
"id": "ffa0fe56-4200-4f83-a349-45d1fdd72eb3",
"metadata": {},
"source": [
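"### Inline expansion\n",
"I suspected the compiler already does this automatically to some extent, so I tried four variants:\n",
"- plain addition\n",
"- addition through a function\n",
"- addition through an `@inline` function\n",
"- addition through a `@noinline` function\n",
"\n",
"Only the `@noinline` version was slower (median about 1.6 ms, versus about 0.69 to 0.70 ms for the other three at n = 1000000)."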
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "da921a78-16f5-4f26-bbde-1c18b1f8f525",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"f4 (generic function with 1 method)"
]
},
"execution_count": 85,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"add(x, y) = x + y\n",
"\n",
"function f1(n)\n",
"    t = 0.0\n",
"    for i in 1:n\n",
"        t = t + i\n",
"    end\n",
"    t\n",
"end\n",
"\n",
"function f2(n)\n",
"    t = 0.0\n",
"    for i in 1:n\n",
"        t = add(t, i)\n",
"    end\n",
"    t\n",
"end\n",
"\n",
"@inline inadd(x, y) = x + y\n",
"\n",
"function f3(n)\n",
"    t = 0.0\n",
"    for i in 1:n\n",
"        t = inadd(t, i)\n",
"    end\n",
"    t\n",
"end\n",
"\n",
"@noinline noinadd(x, y) = x + y\n",
"\n",
"function f4(n)\n",
"    t = 0.0\n",
"    for i in 1:n\n",
"        t = noinadd(t, i)\n",
"    end\n",
"    t\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 78,
"id": "da9afe70-c338-4e58-a969-b8be0733be0e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 7110 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m690.975 μs\u001b[22m\u001b[39m … \u001b[35m788.631 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m699.783 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m701.904 μs\u001b[22m\u001b[39m ± \u001b[32m 5.241 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m▂\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▅\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m█\u001b[39m▅\u001b[34m▃\u001b[39m\u001b[39m▃\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[32m▁\u001b[39m\u001b[39m▃\u001b[39m▅\u001b[39m▄\u001b[39m▃\u001b[39m▅\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m▃\u001b[39m▄\u001b[39m▂\u001b[39m▂\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▅\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▇\u001b[39m▆\u001b[39m \u001b[39m█\n",
" 691 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 717 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark f1(1000000)"
]
},
{
"cell_type": "code",
"execution_count": 79,
"id": "c3700ec1-3e53-460d-b5e3-f30e92ec762c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 7204 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m679.274 μs\u001b[22m\u001b[39m … \u001b[35m744.557 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m690.987 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m692.899 μs\u001b[22m\u001b[39m ± \u001b[32m 5.477 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▇\u001b[39m▄\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\u001b[39m▃\u001b[34m█\u001b[39m\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[32m▂\u001b[39m\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▅\u001b[39m▃\u001b[39m▅\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▂\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\n",
" \u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▅\u001b[39m▃\u001b[39m▅\u001b[39m▆\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m \u001b[39m█\n",
" 679 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 710 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 79,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark f2(1000000)"
]
},
{
"cell_type": "code",
"execution_count": 80,
"id": "2bc065ff-8676-4125-985f-38a7bcdf94e3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 7204 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m683.121 μs\u001b[22m\u001b[39m … \u001b[35m747.903 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m691.231 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m692.852 μs\u001b[22m\u001b[39m ± \u001b[32m 5.007 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▆\u001b[39m▃\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▂\u001b[39m█\u001b[34m▅\u001b[39m\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[32m▁\u001b[39m\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▅\u001b[39m▃\u001b[39m▃\u001b[39m▄\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m \u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m▅\u001b[39m▅\u001b[39m▂\u001b[39m▄\u001b[39m▂\u001b[39m▅\u001b[39m▆\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▆\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▅\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m \u001b[39m█\n",
" 683 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 708 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 80,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark f3(1000000)"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "39f3452c-6669-4c55-9641-43dad8d8f210",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 3290 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m1.139 ms\u001b[22m\u001b[39m … \u001b[35m 2.345 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m1.635 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m1.518 ms\u001b[22m\u001b[39m ± \u001b[32m215.692 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m▇\u001b[39m▅\u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▃\u001b[39m▇\u001b[34m█\u001b[39m\u001b[39m▇\u001b[39m▄\u001b[39m▃\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▁\u001b[39m▄\u001b[39m▄\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[32m▁\u001b[39m\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m \u001b[39m█\n",
" 1.14 ms\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 1.73 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 86,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark f4(1000000)"
]
},
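{
"cell_type": "markdown",
"id": "sketch-inline-llvm",
"metadata": {},
"source": [
"memo: a rough sketch (my own addition) of one way to check whether a call was inlined, instead of inferring it from timings alone. It looks at the optimized LLVM IR via `code_llvm`. The helper name `llvm_ir` is hypothetical, and searching the IR for the callee's name is only a heuristic, not a guaranteed test.\n",
"\n",
"```julia\n",
"using InteractiveUtils  # provides code_llvm; interactive sessions load it automatically\n",
"\n",
"add(x, y) = x + y\n",
"@noinline noinadd(x, y) = x + y\n",
"\n",
"function f2(n)\n",
"    t = 0.0\n",
"    for i in 1:n\n",
"        t = add(t, i)\n",
"    end\n",
"    t\n",
"end\n",
"\n",
"function f4(n)\n",
"    t = 0.0\n",
"    for i in 1:n\n",
"        t = noinadd(t, i)\n",
"    end\n",
"    t\n",
"end\n",
"\n",
"# Hypothetical helper: optimized LLVM IR of f(::Int) as a String, without debug comments.\n",
"llvm_ir(f) = sprint(io -> code_llvm(io, f, (Int,); debuginfo=:none))\n",
"\n",
"# Heuristic: does the IR of f4 still mention the callee by name?\n",
"println(occursin(\"noinadd\", llvm_ir(f4)))\n",
"\n",
"# Print the IR of f2 to inspect by eye whether a call to `add` remains.\n",
"print(llvm_ir(f2))\n",
"```\n",
"\n",
"If the call in `f4` really is kept as a call, that would be consistent with `f4` being the only slow variant in the benchmarks above."
]
},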
{
"cell_type": "markdown",
"id": "596b20ae-b56c-43c7-9fba-44af05c7311a",
"metadata": {},
"source": [
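"### `@code_warntype`\n",
"In my environment this was available without installing any package (it is provided by the `InteractiveUtils` standard library, which interactive sessions load by default)."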
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "8e2dc87b-6b8a-4f97-8d09-f51da2e06179",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"MethodInstance for test2()\n",
" from test2()\u001b[90m @\u001b[39m \u001b[90mMain\u001b[39m \u001b[90m\u001b[4mIn[27]:2\u001b[24m\u001b[39m\n",
"Arguments\n",
" #self#\u001b[36m::Core.Const(test2)\u001b[39m\n",
"Locals\n",
" @_2\u001b[33m\u001b[1m::Union{Nothing, Tuple{Int64, Int64}}\u001b[22m\u001b[39m\n",
" s\u001b[36m::Float64\u001b[39m\n",
" i\u001b[36m::Int64\u001b[39m\n",
"Body\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m1 ─\u001b[39m (s = 0.0)\n",
"\u001b[90m│ \u001b[39m %2 = (1:1000)\u001b[36m::Core.Const(1:1000)\u001b[39m\n",
"\u001b[90m│ \u001b[39m (@_2 = Base.iterate(%2))\n",
"\u001b[90m│ \u001b[39m %4 = (@_2::Core.Const((1, 1)) === nothing)\u001b[36m::Core.Const(false)\u001b[39m\n",
"\u001b[90m│ \u001b[39m %5 = Base.not_int(%4)\u001b[36m::Core.Const(true)\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %5\n",
"\u001b[90m2 ┄\u001b[39m %7 = @_2\u001b[36m::Tuple{Int64, Int64}\u001b[39m\n",
"\u001b[90m│ \u001b[39m (i = Core.getfield(%7, 1))\n",
"\u001b[90m│ \u001b[39m %9 = Core.getfield(%7, 2)\u001b[36m::Int64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %10 = s\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %11 = Main.cos(i)\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m (s = %10 + %11)\n",
"\u001b[90m│ \u001b[39m %13 = Main.t\u001b[91m\u001b[1m::Any\u001b[22m\u001b[39m\n",
"\u001b[90m│ \u001b[39m %14 = Main.:+\u001b[36m::Core.Const(+)\u001b[39m\n",
"\u001b[90m│ \u001b[39m %15 = Main.t\u001b[91m\u001b[1m::Any\u001b[22m\u001b[39m\n",
"\u001b[90m│ \u001b[39m %16 = Main.:*\u001b[36m::Core.Const(*)\u001b[39m\n",
"\u001b[90m│ \u001b[39m %17 = s\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %18 = Base.broadcasted(Main.sin, i)\u001b[36m::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(sin), Tuple{Int64}}\u001b[39m\n",
"\u001b[90m│ \u001b[39m %19 = Base.broadcasted(%16, %17, %18)\u001b[36m::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(*), Tuple{Float64, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(sin), Tuple{Int64}}}}\u001b[39m\n",
"\u001b[90m│ \u001b[39m %20 = Base.broadcasted(%14, %15, %19)\u001b[91m\u001b[1m::Any\u001b[22m\u001b[39m\n",
"\u001b[90m│ \u001b[39m Base.materialize!(%13, %20)\n",
"\u001b[90m│ \u001b[39m (@_2 = Base.iterate(%2, %9))\n",
"\u001b[90m│ \u001b[39m %23 = (@_2 === nothing)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m│ \u001b[39m %24 = Base.not_int(%23)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %24\n",
"\u001b[90m3 ─\u001b[39m goto #2\n",
"\u001b[90m4 ┄\u001b[39m return s\n",
"\n"
]
}
],
"source": [
"@code_warntype test2()"
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "a96929c5-3149-4380-80c5-52fdfd5be0e8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"MethodInstance for test2!(::Vector{Float64})\n",
" from test2!(\u001b[90mt\u001b[39m)\u001b[90m @\u001b[39m \u001b[90mMain\u001b[39m \u001b[90m\u001b[4mIn[29]:1\u001b[24m\u001b[39m\n",
"Arguments\n",
" #self#\u001b[36m::Core.Const(test2!)\u001b[39m\n",
" t\u001b[36m::Vector{Float64}\u001b[39m\n",
"Locals\n",
" @_3\u001b[33m\u001b[1m::Union{Nothing, Tuple{Int64, Int64}}\u001b[22m\u001b[39m\n",
" s\u001b[36m::Float64\u001b[39m\n",
" i\u001b[36m::Int64\u001b[39m\n",
"Body\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m1 ─\u001b[39m (s = 0.0)\n",
"\u001b[90m│ \u001b[39m %2 = (1:1000)\u001b[36m::Core.Const(1:1000)\u001b[39m\n",
"\u001b[90m│ \u001b[39m (@_3 = Base.iterate(%2))\n",
"\u001b[90m│ \u001b[39m %4 = (@_3::Core.Const((1, 1)) === nothing)\u001b[36m::Core.Const(false)\u001b[39m\n",
"\u001b[90m│ \u001b[39m %5 = Base.not_int(%4)\u001b[36m::Core.Const(true)\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %5\n",
"\u001b[90m2 ┄\u001b[39m %7 = @_3\u001b[36m::Tuple{Int64, Int64}\u001b[39m\n",
"\u001b[90m│ \u001b[39m (i = Core.getfield(%7, 1))\n",
"\u001b[90m│ \u001b[39m %9 = Core.getfield(%7, 2)\u001b[36m::Int64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %10 = s\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %11 = Main.cos(i)\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m (s = %10 + %11)\n",
"\u001b[90m│ \u001b[39m %13 = Main.:+\u001b[36m::Core.Const(+)\u001b[39m\n",
"\u001b[90m│ \u001b[39m %14 = Main.:*\u001b[36m::Core.Const(*)\u001b[39m\n",
"\u001b[90m│ \u001b[39m %15 = s\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %16 = Base.broadcasted(Main.sin, i)\u001b[36m::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(sin), Tuple{Int64}}\u001b[39m\n",
"\u001b[90m│ \u001b[39m %17 = Base.broadcasted(%14, %15, %16)\u001b[36m::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(*), Tuple{Float64, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(sin), Tuple{Int64}}}}\u001b[39m\n",
"\u001b[90m│ \u001b[39m %18 = Base.broadcasted(%13, t, %17)\u001b[36m::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Vector{Float64}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(*), Tuple{Float64, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(sin), Tuple{Int64}}}}}}\u001b[39m\n",
"\u001b[90m│ \u001b[39m Base.materialize!(t, %18)\n",
"\u001b[90m│ \u001b[39m (@_3 = Base.iterate(%2, %9))\n",
"\u001b[90m│ \u001b[39m %21 = (@_3 === nothing)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m│ \u001b[39m %22 = Base.not_int(%21)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %22\n",
"\u001b[90m3 ─\u001b[39m goto #2\n",
"\u001b[90m4 ┄\u001b[39m return s\n",
"\n"
]
}
],
"source": [
"@code_warntype test2!(t2)"
]
},
{
"cell_type": "markdown",
"id": "5d9bdf4c-4984-4cf1-a79d-8e791049424c",
"metadata": {},
"source": [
"### 7.2.7.1\n",
"Couldn't the compiler take care of this on its own?"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "efae001a-93dd-42c4-9840-b106b03ff0fc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"testfloat (generic function with 1 method)"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function testint(n)\n",
"    x = 1\n",
"    for i in 1:n\n",
"        x += 1 / i\n",
"    end\n",
"    x\n",
"end\n",
"\n",
"function testfloat(n)\n",
"    x = 1.0\n",
"    for i in 1:n\n",
"        x += 1 / i\n",
"    end\n",
"    x\n",
"end"
]
},
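{
"cell_type": "markdown",
"id": "sketch-return-types",
"metadata": {},
"source": [
"memo: a small sketch (my own addition) of what the compiler infers for the two functions above. `x` in `testint` starts as an `Int` and becomes a `Float64` after the first iteration, so its type is not stable. `Base.return_types` is an unexported reflection helper, used here only for inspection.\n",
"\n",
"```julia\n",
"function testint(n)\n",
"    x = 1            # starts as an Int\n",
"    for i in 1:n\n",
"        x += 1 / i   # 1 / i is a Float64, so x changes type here\n",
"    end\n",
"    x\n",
"end\n",
"\n",
"function testfloat(n)\n",
"    x = 1.0          # Float64 from the start\n",
"    for i in 1:n\n",
"        x += 1 / i\n",
"    end\n",
"    x\n",
"end\n",
"\n",
"# Inferred return type for an Int argument.\n",
"println(only(Base.return_types(testint, (Int,))) == Union{Int, Float64})  # true\n",
"println(only(Base.return_types(testfloat, (Int,))) == Float64)            # true\n",
"\n",
"# The runtime type depends on whether the loop body ran at all.\n",
"println(testint(0) isa Int)      # true\n",
"println(testint(1) isa Float64)  # true\n",
"```\n",
"\n",
"The union here has only two members, and Julia's compiler can often handle such small unions reasonably well, so how much this costs in practice is a question for the benchmarks rather than something this sketch settles."
]
},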
{
"cell_type": "code",
"execution_count": 28,
"id": "2b6a2a49-5ead-43ac-99d1-5183df6ea414",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 4269 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m1.152 ms\u001b[22m\u001b[39m … \u001b[35m1.275 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m1.167 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m1.170 ms\u001b[22m\u001b[39m ± \u001b[32m9.985 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▆\u001b[39m▁\u001b[39m \u001b[39m▃\u001b[39m▄\u001b[39m▄\u001b[39m█\u001b[39m▅\u001b[34m▄\u001b[39m\u001b[39m▄\u001b[39m▅\u001b[32m▆\u001b[39m\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\n",
" \u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▇\u001b[39m▃\u001b[39m▄\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m \u001b[39m█\n",
" 1.15 ms\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 1.21 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark testint(1000000)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "2e09a66d-d998-488d-9b4e-fa67fd2515be",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 4263 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m1.152 ms\u001b[22m\u001b[39m … \u001b[35m 1.496 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m1.170 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m1.171 ms\u001b[22m\u001b[39m ± \u001b[32m11.684 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m█\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[34m \u001b[39m\u001b[32m▂\u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m█\u001b[39m▄\u001b[39m▂\u001b[39m▃\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m█\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[34m▇\u001b[39m\u001b[32m█\u001b[39m\u001b[39m▆\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m \u001b[39m▃\n",
" 1.15 ms\u001b[90m Histogram: frequency by time\u001b[39m 1.21 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark testfloat(1000000)"
]
},
{
"cell_type": "code",
"execution_count": 124,
"id": "653e145f-ed26-41a1-980c-ec04042f3eb1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"MethodInstance for testint(::Int64)\n",
" from testint(\u001b[90mn\u001b[39m)\u001b[90m @\u001b[39m \u001b[90mMain\u001b[39m \u001b[90m\u001b[4mIn[89]:1\u001b[24m\u001b[39m\n",
"Arguments\n",
" #self#\u001b[36m::Core.Const(testint)\u001b[39m\n",
" n\u001b[36m::Int64\u001b[39m\n",
"Locals\n",
" @_3\u001b[33m\u001b[1m::Union{Nothing, Tuple{Int64, Int64}}\u001b[22m\u001b[39m\n",
" x\u001b[33m\u001b[1m::Union{Float64, Int64}\u001b[22m\u001b[39m\n",
" i\u001b[36m::Int64\u001b[39m\n",
"Body\u001b[33m\u001b[1m::Union{Float64, Int64}\u001b[22m\u001b[39m\n",
"\u001b[90m1 ─\u001b[39m (x = 1)\n",
"\u001b[90m│ \u001b[39m %2 = (1:n)\u001b[36m::Core.PartialStruct(UnitRange{Int64}, Any[Core.Const(1), Int64])\u001b[39m\n",
"\u001b[90m│ \u001b[39m (@_3 = Base.iterate(%2))\n",
"\u001b[90m│ \u001b[39m %4 = (@_3 === nothing)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m│ \u001b[39m %5 = Base.not_int(%4)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %5\n",
"\u001b[90m2 ┄\u001b[39m %7 = @_3\u001b[36m::Tuple{Int64, Int64}\u001b[39m\n",
"\u001b[90m│ \u001b[39m (i = Core.getfield(%7, 1))\n",
"\u001b[90m│ \u001b[39m %9 = Core.getfield(%7, 2)\u001b[36m::Int64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %10 = x\u001b[33m\u001b[1m::Union{Float64, Int64}\u001b[22m\u001b[39m\n",
"\u001b[90m│ \u001b[39m %11 = (1 / i)\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m (x = %10 + %11)\n",
"\u001b[90m│ \u001b[39m (@_3 = Base.iterate(%2, %9))\n",
"\u001b[90m│ \u001b[39m %14 = (@_3 === nothing)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m│ \u001b[39m %15 = Base.not_int(%14)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %15\n",
"\u001b[90m3 ─\u001b[39m goto #2\n",
"\u001b[90m4 ┄\u001b[39m return x\n",
"\n"
]
}
],
"source": [
"@code_warntype testint(1000000)"
]
},
{
"cell_type": "code",
"execution_count": 125,
"id": "612ac54c-e555-414e-bc29-935d4a2f04da",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"MethodInstance for testfloat(::Int64)\n",
" from testfloat(\u001b[90mn\u001b[39m)\u001b[90m @\u001b[39m \u001b[90mMain\u001b[39m \u001b[90m\u001b[4mIn[89]:9\u001b[24m\u001b[39m\n",
"Arguments\n",
" #self#\u001b[36m::Core.Const(testfloat)\u001b[39m\n",
" n\u001b[36m::Int64\u001b[39m\n",
"Locals\n",
" @_3\u001b[33m\u001b[1m::Union{Nothing, Tuple{Int64, Int64}}\u001b[22m\u001b[39m\n",
" x\u001b[36m::Float64\u001b[39m\n",
" i\u001b[36m::Int64\u001b[39m\n",
"Body\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m1 ─\u001b[39m (x = 1.0)\n",
"\u001b[90m│ \u001b[39m %2 = (1:n)\u001b[36m::Core.PartialStruct(UnitRange{Int64}, Any[Core.Const(1), Int64])\u001b[39m\n",
"\u001b[90m│ \u001b[39m (@_3 = Base.iterate(%2))\n",
"\u001b[90m│ \u001b[39m %4 = (@_3 === nothing)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m│ \u001b[39m %5 = Base.not_int(%4)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %5\n",
"\u001b[90m2 ┄\u001b[39m %7 = @_3\u001b[36m::Tuple{Int64, Int64}\u001b[39m\n",
"\u001b[90m│ \u001b[39m (i = Core.getfield(%7, 1))\n",
"\u001b[90m│ \u001b[39m %9 = Core.getfield(%7, 2)\u001b[36m::Int64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %10 = x\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m %11 = (1 / i)\u001b[36m::Float64\u001b[39m\n",
"\u001b[90m│ \u001b[39m (x = %10 + %11)\n",
"\u001b[90m│ \u001b[39m (@_3 = Base.iterate(%2, %9))\n",
"\u001b[90m│ \u001b[39m %14 = (@_3 === nothing)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m│ \u001b[39m %15 = Base.not_int(%14)\u001b[36m::Bool\u001b[39m\n",
"\u001b[90m└──\u001b[39m goto #4 if not %15\n",
"\u001b[90m3 ─\u001b[39m goto #2\n",
"\u001b[90m4 ┄\u001b[39m return x\n",
"\n"
]
}
],
"source": [
"@code_warntype testfloat(1000000)"
]
},
{
"cell_type": "code",
"execution_count": 171,
"id": "5dc230da-2c02-47a8-b3ff-8e157486164a",
"metadata": {},
"outputs": [],
"source": [
"struct MyAddable2\n",
" val\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 175,
"id": "9eb08b0c-3719-496e-9283-d60a65e59fb3",
"metadata": {},
"outputs": [],
"source": [
"Base.:+(x::MyAddable2, y) = MyAddable2(x.val + y + 1)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "4ba382a6-df1a-4d3c-8190-6fde362bbff9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"testslow (generic function with 2 methods)"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function testslow(n, t=true)\n",
" if t\n",
" x = 1\n",
" else\n",
" x = MyAddable2(1)\n",
" end\n",
" for i in 1:n\n",
" x += 1 / i\n",
" end\n",
" x\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "70699d2c-9e9f-45ae-9d74-074e633561e2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 4247 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m1.159 ms\u001b[22m\u001b[39m … \u001b[35m1.270 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m1.174 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m1.176 ms\u001b[22m\u001b[39m ± \u001b[32m7.737 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m█\u001b[39m \u001b[39m \u001b[34m \u001b[39m\u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▇\u001b[39m▅\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▄\u001b[39m▃\u001b[39m█\u001b[39m▅\u001b[39m▄\u001b[34m▃\u001b[39m\u001b[39m▃\u001b[39m▄\u001b[32m▄\u001b[39m\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m \u001b[39m▃\n",
" 1.16 ms\u001b[90m Histogram: frequency by time\u001b[39m 1.2 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark testslow(1000000)"
]
},
{
"cell_type": "markdown",
"id": "36daef9a-1fa0-4404-afef-95941b7e05aa",
"metadata": {},
"source": [
"一般論として、変数の型が固定されていないことで計算が遅くなりうることには同意するが、今回、遅くすることができなかった。\n",
"\n",
"このあたりも参照。環境などによっても変わりそう。 \n",
"https://twitter.com/bicycle1885/status/1759585879839171011 \n",
"https://twitter.com/antimon2/status/1759783344387043610"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "6b20eedf-ff89-4d5e-8a75-d1adddfb8cf5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"julia version 1.9.4\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n"
]
},
{
"data": {
"text/plain": [
"Process(`\u001b[4mjulia\u001b[24m \u001b[4m--version\u001b[24m`, ProcessExited(0))"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"run(`julia --version`)"
]
},
{
"cell_type": "markdown",
"id": "ed4c3168-02fb-4ff9-8a9b-47c896f53d81",
"metadata": {},
"source": [
"#### 7.2.8 メモリのアクセス順序\n",
"これは実はかなり変わる"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "0dc05494-d3d8-4aff-b12a-00e1b60474fa",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"sum_ji (generic function with 1 method)"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function sum_ij(a, n)\n",
" s = 0\n",
" for i in 1:n\n",
" for j in 1:n\n",
" s += a[i, j]\n",
" end\n",
" end\n",
" s\n",
"end\n",
"\n",
"function sum_ji(a, n)\n",
" s = 0\n",
" for j in 1:n\n",
" for i in 1:n\n",
" s += a[i, j]\n",
" end\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "07f0a641-052f-4c6f-af4c-e8ade10fd449",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5000×5000 Matrix{Float64}:\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" ⋮ ⋮ ⋱ ⋮ \n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
" 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n = 5000\n",
"a = ones(n, n)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "0a967639-974a-4296-ba35-1a9cbe38384e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 133 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m35.505 ms\u001b[22m\u001b[39m … \u001b[35m47.957 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m36.706 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m37.676 ms\u001b[22m\u001b[39m ± \u001b[32m 2.283 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m▂\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m▃\u001b[39m \u001b[39m▅\u001b[34m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▅\u001b[39m█\u001b[34m▅\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m▄\u001b[39m▆\u001b[39m▅\u001b[32m▄\u001b[39m\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▇\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▇\u001b[39m▁\u001b[39m▃\u001b[39m▄\u001b[39m▃\u001b[39m▄\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▅\u001b[39m▃\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m \u001b[39m▃\n",
" 35.5 ms\u001b[90m Histogram: frequency by time\u001b[39m 45 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark sum_ij(a, n)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "b491b5d2-7f9e-4ce2-81c2-0ad5f48b1e37",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 255 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m18.924 ms\u001b[22m\u001b[39m … \u001b[35m 23.142 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m19.203 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m19.604 ms\u001b[22m\u001b[39m ± \u001b[32m886.761 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m▂\u001b[39m▃\u001b[39m█\u001b[39m▇\u001b[39m▅\u001b[34m▅\u001b[39m\u001b[39m▂\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[32m▆\u001b[39m\u001b[39m█\u001b[39m▆\u001b[39m▆\u001b[39m▁\u001b[39m▄\u001b[39m▄\u001b[39m▁\u001b[39m▆\u001b[39m▄\u001b[39m▇\u001b[39m▆\u001b[39m▄\u001b[39m▁\u001b[39m▆\u001b[39m▄\u001b[39m▄\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▁\u001b[39m▁\u001b[39m▄\u001b[39m▁\u001b[39m▄\u001b[39m▁\u001b[39m▁\u001b[39m▆\u001b[39m▁\u001b[39m█\u001b[39m▄\u001b[39m▁\u001b[39m▁\u001b[39m▆\u001b[39m▁\u001b[39m▇\u001b[39m▇\u001b[39m▁\u001b[39m█\u001b[39m▁\u001b[39m▁\u001b[39m▄\u001b[39m▁\u001b[39m▁\u001b[39m▄\u001b[39m▁\u001b[39m▄\u001b[39m▄\u001b[39m \u001b[39m▆\n",
" 18.9 ms\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 22.4 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m16 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark sum_ji(a, n)"
]
},
{
"cell_type": "markdown",
"id": "20bdcb0c-9f49-438c-b380-256ab903f06c",
"metadata": {},
"source": [
"スライスの話→それはそうなので省略"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7dece5d0-7316-4fca-9a5d-5f9df447fec9",
"metadata": {},
"source": [
"# 並列計算\n",
"- https://www.r-ccs.riken.jp/outreach/schools/20230413-1/\n",
"## 数値計算における並列計算\n",
"[フリンの分類](https://ja.wikipedia.org/wiki/%E3%83%95%E3%83%AA%E3%83%B3%E3%81%AE%E5%88%86%E9%A1%9E)\n",
"- SISD: Single Instruction, Single Data\n",
" - 並列ではないもの\n",
"- SIMD: Single Instruction, Multiple Data\n",
" - 異なるデータそれぞれに対し、同じ命令を実行する\n",
" - CPU命令の\"SIMD命令\"を指す意味でも使われるが、ここでは、もう少し広く、異なるデータに対して同じ関数や同じプログラムを動かすものもSIMDと呼ぶことにする\n",
"- (MISD: Multiple Instruction, Single Data)\n",
" - あまり使われない\n",
"- MIMD: Multiple Instruction, Multiple Data\n",
" - 異なるデータそれぞれに対し、異なる命令を実行する\n",
" - 汎用CPUでは、MIMDを行うことができる\n",
" \n",
"CPUは本来はMIMDだが、数値計算での並列計算では**SIMDを意識する**(複数の異なるプログラムを高速に動かすことは常人には難しい)\n",
"\n",
"## アムダールの法則\n",
"- https://docs.oracle.com/cd/E19205-01/820-1209/bjaem/index.html\n",
"- http://www.damp.tottori-u.ac.jp/~hoshi///info/slide_hoshi_20150422_suzukake_02_HPC_lt.pdf\n",
"\n",
"タスクをP個のプロセッサで解いたとき、かかる時間が1/Pになるのが理想だが、それは現実的に無理。\n",
"\n",
"単純化して、タスクを並列計算が不可能な箇所、可能な箇所に分ける。\n",
"\n",
"- 並列計算が不可能な部分→計算時間が変わらない(1倍)\n",
"- 並列計算が可能な部分→計算時間が(1/P)倍になる\n",
"\n",
"このとき、1プロセッサでの全計算時間が1のタスクがどれくらいで終わるかを考える。Fを並列計算が不可能な割合とおくと、並列実行時の計算時間は\n",
"\n",
"1/S := F + (1 - F)/P\n",
"\n",
"となる。(並列度Sを、並列実行時の計算時間の逆数とおいた)\n",
"\n",
"並列計算が不可能な部分が1/nあると、プロセッサが無限にあっても、並列度はnにしかならない(n倍速にしかならない)ことが分かる。\n",
"\n",
"### strong scaling vs weak scaling\n",
"\n",
"- strong scaling: 問題のサイズを固定して、並列度を上げる\n",
"- weak scaling: 問題のサイズが可変(プロセッサ数に比例)で、並列度を上げる\n",
"\n",
"strong scalingの方が難しい(並列度が上がりにくい)。\n",
"\n",
"https://twitter.com/Hishinuma_t/status/1643757143588470784\n",
"> 経験上、人月の話も並列計算の話もタスク分割と並列化の話なので100マス計算を例に出すと説明しやすくて、 \n",
"「100マス計算の紙があって、1マス1秒、計100秒で計算できます。では100人で取り組めば1秒で計算できそうですか?紙は1枚だけです」 \n",
"と言うと多くの人にわかってもらえる\n",
"\n",
" \n",
"## まずはバカパラできないか考える\n",
"https://kaityo256.github.io/sevendayshpc/day3/index.html\n",
"> 例えば、100個の画像データがあるが、それらを全部リサイズしたい、といったタスクを考える。 それぞれのタスクには依存関係が全くないので、全部同時に実行してもなんの問題もない。 したがって、100並列で実行すれば100倍早くなる。 このように、並列タスク間で依存関係や情報のやりとりが発生しない並列化のことを自明並列と呼ぶ。 英語では、Trivial Parallelization(自明並列)とか、Embarrassingly parallel(馬鹿パラ)などと表現される。 「馬鹿パラ」とは「馬鹿でもできる並列化」の略で(諸説あり)、その名の通り簡単に並列化できるため、 文字通り馬鹿にされることも多いのだが、並列化効率が100%であり、最も効率的に計算資源を利用していることになるため、 その意義は大きい。 なにはなくとも、まず馬鹿パラができないことには非自明並列もできないわけだし、馬鹿パラができるだけでも、できない人に比べて 圧倒的な攻撃力を持つことになる。\n",
"\n",
"バカパラはプログラムが簡単で効率もいい。並列計算をするなら、まずは、バカパラが出来ないかを考えると良い。 \n",
"1週間Julia本でも、バカパラしか出てきていないように見える。\n",
"\n",
"## multi-threading vs multi-processing\n",
"プロセスとスレッドは1:Nの関係\n",
"\n",
"\n",
"- プロセス間ではメモリ空間が分かれる\n",
" - データの共有はプロセス間通信(数値計算ではMPI通信、一般用途ではgRPCやらRESTやらも)で行う\n",
" - オーバーヘッドが大きい\n",
"- スレッド間ではメモリ空間を共有する\n",
" - データの共有が行いやすいが、データ競合には注意が必要\n",
" - メモリは同じ空間を共有しているが、複数のスレッドが同じ箇所を参照することで不整合が生じる\n",
" - ロックや同期、アトミック演算を行うことでデータ競合が防げる(オーバーヘッドがある)\n",
"\n",
"\n",
"https://hpc-tutorials.llnl.gov/posix/what_is_a_thread/\n",
"#### プロセス\n",
"![](https://hpc-tutorials.llnl.gov/posix/images/process.gif)\n",
"\n",
"#### プロセスとスレッド\n",
"![](https://hpc-tutorials.llnl.gov/posix/images/thread.gif)\n",
"\n",
"### 使い分け\n",
"- プロセスよりもスレッドの方がオーバーヘッドが少ないため、通常はマルチスレッドの方が高速\n",
"- HPCは多数のノードから構成されており、それぞれのノードが個別のメモリを持っている。そのため、HPCで複数のノードに跨る計算を行うためにはマルチプロセスが必須\n",
" - マルチプロセスとマルチスレッドの両方を行うハイブリッド形式も用いられる\n",
"- 余談: Pythonの場合、データ競合防止のため、Pythonインタプリタは1プロセスあたり1スレッドしか同時に動かせない(GIL: Global Interpreter Lock)\n",
" - なのでPythonのマルチスレッドは、IO待ちに別の処理をするなどには有効であるが、重い計算を並列に動かすことには利用できない\n",
"\n",
"シングルノードのコンピュータでJuliaで動かす分には**マルチスレッドだけでいい**が、HPCなども念頭に置くならマルチプロセスも考える。"
]
},
{
"cell_type": "markdown",
"id": "9d321399-9380-41a5-8977-b1a9231ed133",
"metadata": {},
"source": [
"## Juliaでのマルチスレッド\n",
"公式ドキュメントも参照のこと \n",
"https://docs.julialang.org/en/v1/manual/multi-threading/#man-multithreading\n",
"\n",
"### Julia起動時の注意\n",
"マルチスレッド計算を行うには、環境変数 `JULIA_NUM_THREADS` を予めセットするか、 `julia -t 8` のようにオプションをつけてから、Juliaを起動する必要がある(起動後は変更できない) \n",
"スレッド数を数字の代わりに`auto`とすると、自動でスレッド数がセットされる。\n",
"\n",
"Jupyter Notebookを使う場合、[環境変数をセットしたカーネルを作る方法が紹介されている](https://github.com/JuliaLang/IJulia.jl/issues/882#issuecomment-579520246) が、私の環境では `JULIA_NUM_THREADS=16 jupyter-notebook` のようにJupyter Notebookを起動してもマルチスレッドになった(Jupyter Notebook v7.0.3, IJulia v1.24.2)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "14cb498d-6732-407a-b36e-c19855fba550",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"nthreads() = 16\n"
]
},
{
"data": {
"text/plain": [
"16"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using Base.Threads\n",
"\n",
"# 1になっていないことを確認\n",
"@show nthreads()"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "7a64535f-8031-4661-b864-2531503bfd1a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"test5 (generic function with 1 method)"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function heavycalc(i)\n",
" c = 0.0\n",
" for j in 1:1000\n",
" c += cos(i*j)\n",
" end\n",
" c\n",
"end\n",
"\n",
"function test5()\n",
" n = 100\n",
" a = ones(Float64, n)\n",
" @threads for i in 1:n\n",
" a[i] = heavycalc(i)\n",
" end\n",
" sum(a)\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "8af870a6-ae1d-42c8-9acb-23c029f5f00a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-107.35290535906694"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test5()"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "9174af2b-f514-4fa9-954b-40ec940ab6b0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m102.473 μs\u001b[22m\u001b[39m … \u001b[35m 3.266 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 91.26%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m119.401 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m147.877 μs\u001b[22m\u001b[39m ± \u001b[32m66.914 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.40% ± 1.29%\n",
"\n",
" \u001b[39m▄\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[34m▇\u001b[39m\u001b[39m▅\u001b[39m▃\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[32m▁\u001b[39m\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m▄\u001b[39m▆\u001b[39m▇\u001b[39m▆\u001b[39m▄\u001b[39m▃\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▅\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▇\u001b[39m▅\u001b[39m▅\u001b[39m▇\u001b[39m \u001b[39m█\n",
" 102 μs\u001b[90m \u001b[39m\u001b[90mHistogram: \u001b[39m\u001b[90m\u001b[1mlog(\u001b[22m\u001b[39m\u001b[90mfrequency\u001b[39m\u001b[90m\u001b[1m)\u001b[22m\u001b[39m\u001b[90m by time\u001b[39m 331 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m9.81 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m94\u001b[39m."
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using BenchmarkTools\n",
"@benchmark test5()"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "22a1d27b-3e01-45df-92c3-da7034f37331",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 3678 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m1.327 ms\u001b[22m\u001b[39m … \u001b[35m 1.474 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m1.355 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m1.358 ms\u001b[22m\u001b[39m ± \u001b[32m11.704 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m█\u001b[39m█\u001b[39m▃\u001b[39m▁\u001b[39m▆\u001b[39m▇\u001b[34m▄\u001b[39m\u001b[39m▂\u001b[32m▂\u001b[39m\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m \u001b[39m▄\n",
" 1.33 ms\u001b[90m Histogram: frequency by time\u001b[39m 1.4 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m896 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function test5single()\n",
" n = 100\n",
" a = ones(Float64, n)\n",
" for i in 1:n\n",
" a[i] = heavycalc(i)\n",
" end\n",
" sum(a)\n",
"end\n",
"\n",
"@benchmark test5single()"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "a0cc8d41-8d0a-44e9-b0f2-2a2f5ea1d44e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2, 2, 2, 2, 2, 2, 2, 11, 11, 11, 11, 11, 11, 11, 9, 9, 9, 9, 9, 9, 9, 6, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 10, 10, 10, 10, 10, 10, 7, 7, 7, 7, 7, 7, 12, 12, 12, 12, 12, 12, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 4, 4, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 3, 3, 3, 3, 3, 3, 15, 15, 15, 15, 15, 15, 1, 1, 1, 1, 1, 1, 16, 16, 16, 16, 16, 16]\n",
"[4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 8, 8, 8, 8, 8, 8, 8, 13, 13, 13, 13, 13, 13, 13, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 14, 14, 14, 14, 14, 14, 7, 7, 7, 7, 7, 7, 15, 15, 15, 15, 15, 15, 12, 12, 12, 12, 12, 12, 11, 11, 11, 11, 11, 11, 9, 9, 9, 9, 9, 9, 7, 7, 7, 7, 7, 7, 10, 10, 10, 10, 10, 10, 5, 5, 5, 5, 5, 5]\n"
]
}
],
"source": [
"# Visualize how threads share the work: the for loop is split into contiguous chunks that are assigned to threads\n",
"function test5id!(a::Vector{Int64}, n)\n",
" @threads for i in 1:n\n",
" a[i] = threadid()\n",
" end\n",
"end\n",
"\n",
"n = 100\n",
"a = zeros(Int64, n)\n",
"test5id!(a, n)\n",
"println(a)\n",
"test5id!(a, n)\n",
"println(a)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "3cf23322-8569-46cb-b17e-d1e4917ec021",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"thread id = 5\ti = 1\n",
"thread id = 14\ti = 71\n",
"thread id = 4\ti = 15\n",
"thread id = 8\ti = 65\n",
"thread id = 7\ti = 41\n",
"thread id = 2\ti = 47\n",
"thread id = 15\ti = 29\n",
"thread id = 7\ti = 42\n",
"thread id = 16\ti = 77\n",
"thread id = 8\ti = 66\n",
"thread id = 15\ti = 78\n",
"thread id = 5\ti = 67\n",
"thread id = 9\ti = 35\n",
"thread id = 13\ti = 53\n",
"thread id = 10\ti = 48\n",
"thread id = 12\ti = 59\n",
"thread id = 2\ti = 54\n",
"thread id = 7\ti = 43\n",
"thread id = 9\ti = 36\n",
"thread id = 11\ti = 44\n",
"thread id = 5\ti = 2\n",
"thread id = 11\ti = 45\n",
"thread id = 3\ti = 83\n",
"thread id = 14\ti = 72\n",
"thread id = 1\ti = 89\n",
"thread id = 4\ti = 16\n",
"thread id = 9\ti = 3\n",
"thread id = 5\ti = 17\n",
"thread id = 11\ti = 22\n",
"thread id = 5\ti = 18\n",
"thread id = 11\ti = 73\n",
"thread id = 15\ti = 90\n",
"thread id = 6\ti = 95\n",
"thread id = 11\ti = 91\n",
"thread id = 9\ti = 96\n",
"thread id = 16\ti = 19\n",
"thread id = 10\ti = 8\n",
"thread id = 9\ti = 97\n",
"thread id = 5\ti = 74\n",
"thread id = 11\ti = 92\n",
"thread id = 9\ti = 98\n",
"thread id = 11\ti = 93\n",
"thread id = 7\ti = 60\n",
"thread id = 16\ti = 55\n",
"thread id = 9\ti = 37\n",
"thread id = 4\ti = 56\n",
"thread id = 2\ti = 61\n",
"thread id = 15\ti = 79\n",
"thread id = 9\ti = 4\n",
"thread id = 4\ti = 57\n",
"thread id = 13\ti = 30\n",
"thread id = 7\ti = 80\n",
"thread id = 5\ti = 68\n",
"thread id = 4\ti = 58\n",
"thread id = 8\ti = 5\n",
"thread id = 7\ti = 81\n",
"thread id = 8\ti = 6\n",
"thread id = 6\ti = 38\n",
"thread id = 12\ti = 69\n",
"thread id = 10\ti = 31\n",
"thread id = 6\ti = 39\n",
"thread id = 8\ti = 7\n",
"thread id = 14\ti = 23\n",
"thread id = 7\ti = 82\n",
"thread id = 6\ti = 40\n",
"thread id = 16\ti = 20\n",
"thread id = 10\ti = 32\n",
"thread id = 11\ti = 46\n",
"thread id = 10\ti = 33\n",
"thread id = 4\ti = 21\n",
"thread id = 13\ti = 24\n",
"thread id = 12\ti = 70\n",
"thread id = 9\ti = 99\n",
"thread id = 2\ti = 62\n",
"thread id = 15\ti = 9\n",
"thread id = 11\ti = 94\n",
"thread id = 16\ti = 63\n",
"thread id = 11\ti = 10\n",
"thread id = 15\ti = 100\n",
"thread id = 3\ti = 75\n",
"thread id = 10\ti = 49\n",
"thread id = 9\ti = 76\n",
"thread id = 11\ti = 11\n",
"thread id = 14\ti = 50\n",
"thread id = 16\ti = 64\n",
"thread id = 14\ti = 51\n",
"thread id = 10\ti = 34\n",
"thread id = 13\ti = 25\n",
"thread id = 13\ti = 26\n",
"thread id = 14\ti = 52\n",
"thread id = 13\ti = 27\n",
"thread id = 3\ti = 84\n",
"thread id = 13\ti = 28\n",
"thread id = 11\ti = 12\n",
"thread id = 11\ti = 13\n",
"thread id = 11\ti = 14\n",
"thread id = 5\ti = 85\n",
"thread id = 5\ti = 86\n",
"thread id = 5\ti = 87\n",
"thread id = 5\ti = 88\n"
]
}
],
"source": [
"function test5idprint()\n",
" n = 100\n",
" @threads for i in 1:n\n",
" println(\"thread id = $(threadid())\\ti = $(i)\")\n",
" end\n",
"end\n",
"\n",
"test5idprint()"
]
},
{
"cell_type": "markdown",
"id": "cf928c5f-e336-47ee-9a2b-0417e6cdcf20",
"metadata": {},
"source": [
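"`@spawn` is closer to MIMD-style parallelism: each spawned task can run different code. It also looks useful in application code, for example to hand an I/O wait off to another thread.  \n",
"`fetch` waits for a task to finish and returns its result.  \n",
"\n",
"The examples in the official documentation are better, so we look at those."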
]
},
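{
"cell_type": "markdown",
"id": "sketch-spawn-fetch",
"metadata": {},
"source": [
"A minimal sketch of the `@spawn` / `fetch` pattern, assuming Julia was started with more than one thread. `slow_square` is a hypothetical helper that stands in for slow work such as an I/O wait; it is not from the book.\n",
"\n",
"```julia\n",
"using Base.Threads: @spawn\n",
"\n",
"# Hypothetical stand-in for a slow operation (e.g. waiting on I/O).\n",
"function slow_square(x)\n",
"    sleep(0.1)\n",
"    return x^2\n",
"end\n",
"\n",
"t1 = @spawn slow_square(3)   # returns a Task immediately\n",
"t2 = @spawn slow_square(4)   # may run concurrently with t1\n",
"\n",
"# fetch blocks until the task has finished, then returns its value\n",
"total = fetch(t1) + fetch(t2)\n",
"```\n",
"\n",
"Because both tasks are spawned before either `fetch`, the two waits can overlap instead of running back to back."
]
},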
{
"cell_type": "code",
"execution_count": 20,
"id": "da7c3e09-5500-46be-9e5d-34a3af76b510",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"500000500000"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Plain single-threaded sum\n",
"function sum_single(a)\n",
" s = 0\n",
" for i in a\n",
" s += i\n",
" end\n",
" s\n",
"end\n",
"\n",
"sum_single(1:1_000_000)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "8aed861e-11c2-4031-ac99-af67d5aa0104",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1000 evaluations.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m2.083 ns\u001b[22m\u001b[39m … \u001b[35m9.518 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m2.114 ns \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m2.271 ns\u001b[22m\u001b[39m ± \u001b[32m0.354 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[34m█\u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▅\u001b[34m█\u001b[39m\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[32m▂\u001b[39m\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▄\u001b[39m▄\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m \u001b[39m▂\n",
" 2.08 ns\u001b[90m Histogram: frequency by time\u001b[39m 3.51 ns \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark sum_single(1:1_000_000)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "b0ba1348-5d95-47d7-b39d-d19cff7f390c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1000 evaluations.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m1.172 ns\u001b[22m\u001b[39m … \u001b[35m9.418 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m1.182 ns \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m1.188 ns\u001b[22m\u001b[39m ± \u001b[32m0.131 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[34m█\u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▄\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[34m█\u001b[39m\u001b[39m▁\u001b[39m▆\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[32m▁\u001b[39m\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m█\u001b[39m▁\u001b[39m▄\u001b[39m \u001b[39m▂\n",
" 1.17 ns\u001b[90m Histogram: frequency by time\u001b[39m 1.19 ns \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Built-in sum (for an integer range, Base uses a closed-form formula rather than a loop)\n",
"@benchmark sum(1:1_000_000)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "a2c3142f-01f9-41fc-8715-526444caabc3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"sum_multi_bad (generic function with 1 method)"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function sum_multi_bad(a)\n",
" s = 0\n",
" @threads for i in a\n",
" s += i\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "8f9f7460-df3d-48fe-9b82-1e5d8de0b872",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"sum_single\n",
"500000500000\n",
"\n",
"sum_multi_bad\n",
"68954367928\n",
"60145954713\n",
"67193912329\n",
"62212777306\n",
"65752202885\n",
"37115233950\n",
"13671920446\n",
"33218677902\n",
"9774802196\n",
"61677166069\n"
]
}
],
"source": [
"# Because of the data race the answer is wrong, and it changes on every run.\n",
"println(\"sum_single\")\n",
"println(sum_single(1:1_000_000))\n",
"println()\n",
"println(\"sum_multi_bad\")\n",
"for _ in 1:10\n",
" println(sum_multi_bad(1:1_000_000))\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "1d48f63c-c1ac-4a23-945e-7b823e1f1ec2",
"metadata": {},
"source": [
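"Each thread reads and writes `s` freely, so a thread can miss an update made by another thread and write back a result computed from a stale value. That is the cause.  \n",
"Making the whole sequence \"read from memory, compute, write to memory\" a single indivisible (atomic) operation avoids the problem."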
]
},
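{
"cell_type": "markdown",
"id": "sketch-atomic-counter",
"metadata": {},
"source": [
"As a small illustration of an atomic read-modify-write, here is a sketch of a shared counter. The function name `count_atomic` is made up for this example; `Atomic`, `atomic_add!` and `@threads` come from `Base.Threads`.\n",
"\n",
"```julia\n",
"using Base.Threads: Atomic, atomic_add!, @threads\n",
"\n",
"function count_atomic(n)\n",
"    counter = Atomic{Int}(0)\n",
"    @threads for i in 1:n\n",
"        # read, add and write happen as one indivisible step;\n",
"        # atomic_add! returns the old value, which is ignored here\n",
"        atomic_add!(counter, 1)\n",
"    end\n",
"    return counter[]   # `[]` unwraps the Atomic into a plain Int\n",
"end\n",
"\n",
"count_atomic(10_000)\n",
"```\n",
"\n",
"Returning `counter[]` instead of `counter` hands the caller an ordinary `Int` rather than an `Atomic{Int}` wrapper."
]
},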
{
"cell_type": "code",
"execution_count": 32,
"id": "6ebc9759-76e5-4e98-a464-bf39a7f52b54",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"sum_multi_atomic (generic function with 1 method)"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Making the variable atomic gives the right answer, but it is slow\n",
"function sum_multi_atomic(a)\n",
" s = Atomic{Int}(0)\n",
" @threads for i in a\n",
" atomic_add!(s, i)\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "e0701143-f5ab-4a77-8370-c376e65c29da",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Atomic{Int64}(500000500000)"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum_multi_atomic(1:1_000_000)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "2c6f045c-32d0-4ce6-a35b-2f634802b4b1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 684 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m6.707 ms\u001b[22m\u001b[39m … \u001b[35m 9.568 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m7.284 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m7.307 ms\u001b[22m\u001b[39m ± \u001b[32m238.418 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m█\u001b[34m▇\u001b[39m\u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▃\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m█\u001b[34m█\u001b[39m\u001b[32m▇\u001b[39m\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m \u001b[39m▂\n",
" 6.71 ms\u001b[90m Histogram: frequency by time\u001b[39m 8.84 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m8.89 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m92\u001b[39m."
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark sum_multi_atomic(1:1_000_000)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "17f5b912-c098-42ad-8804-8a056e1b0e8a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"sum_multi_good (generic function with 1 method)"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function sum_multi_good(a)\n",
" chunks = Iterators.partition(a, length(a) ÷ Threads.nthreads())\n",
" tasks = map(chunks) do chunk\n",
" Threads.@spawn sum_single(chunk)\n",
" end\n",
" chunk_sums = fetch.(tasks)\n",
" return sum_single(chunk_sums)\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "db8e6836-bb61-4065-9c96-61dcd538a092",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"500000500000"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum_multi_good(1:1_000_000)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "619ee1d6-702f-4849-95ec-d41be294d61a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 10000 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m 8.767 μs\u001b[22m\u001b[39m … \u001b[35m 3.547 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 95.41%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m13.315 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m16.207 μs\u001b[22m\u001b[39m ± \u001b[32m49.852 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m4.14% ± 1.35%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▄\u001b[39m▆\u001b[39m█\u001b[39m▇\u001b[39m▅\u001b[39m▂\u001b[39m \u001b[39m \u001b[39m \u001b[34m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▁\u001b[39m▂\u001b[39m▄\u001b[39m▆\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▆\u001b[39m▄\u001b[34m▃\u001b[39m\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▄\u001b[39m▄\u001b[32m▅\u001b[39m\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▃\n",
" 8.77 μs\u001b[90m Histogram: frequency by time\u001b[39m 28 μs \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m9.08 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m112\u001b[39m."
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark sum_multi_good(1:1_000_000)"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "404013dd-e024-4460-9fd0-204ec68a5186",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"heavy_single (generic function with 1 method)"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The computation above is too simple to benefit from threads, so make each iteration heavier\n",
"function heavy_single(a)\n",
" s = 0.0\n",
" for i in a\n",
" s += heavycalc(i)\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "0e207add-1095-4941-a279-7b9309880e28",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-4614.057356857054"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"heavy_single(1:100_000)"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "ee943552-ee23-4c3e-98f0-b82c488758c8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 3 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m2.096 s\u001b[22m\u001b[39m … \u001b[35m 2.107 s\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m2.096 s \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m2.100 s\u001b[22m\u001b[39m ± \u001b[32m6.336 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[34m█\u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m \u001b[39m \n",
" \u001b[34m█\u001b[39m\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[32m▁\u001b[39m\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m█\u001b[39m \u001b[39m▁\n",
" 2.1 s\u001b[90m Histogram: frequency by time\u001b[39m 2.11 s \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark heavy_single(1:100_000)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "9604b9b4-e6a1-4195-a186-d693b186d00c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5689.3445499193\n",
"-2262.538667311357\n",
"-6315.361970593135\n"
]
}
],
"source": [
"function heavy_multi_bad(a)\n",
" s = 0.0\n",
" @threads for i in a\n",
" s += heavycalc(i)\n",
" end\n",
" s\n",
"end\n",
"\n",
"println(heavy_multi_bad(1:100_000))\n",
"println(heavy_multi_bad(1:100_000))\n",
"println(heavy_multi_bad(1:100_000))"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "468073ac-a7cf-4ec7-b815-2b5afbdb1582",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"heavy_multi_atomic (generic function with 1 method)"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function heavy_multi_atomic(a)\n",
" s = Threads.Atomic{Float64}(0.0)\n",
" @threads for i in a\n",
" Threads.atomic_add!(s, heavycalc(i))\n",
" end\n",
" s\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "f85c0fa9-b4ec-446c-8a28-e824a122cdf2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Atomic{Float64}(-4614.057356857219)"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"heavy_multi_atomic(1:100_000)"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "ba1519d2-6616-47e4-9a18-be7c08f48b35",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 23 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m215.449 ms\u001b[22m\u001b[39m … \u001b[35m258.338 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m221.609 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m223.370 ms\u001b[22m\u001b[39m ± \u001b[32m 9.150 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m█\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m▁\u001b[39m \u001b[39m▁\u001b[34m▁\u001b[39m\u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m█\u001b[39m█\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m█\u001b[39m▁\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m▆\u001b[39m▆\u001b[32m▆\u001b[39m\u001b[39m█\u001b[39m▆\u001b[39m▁\u001b[39m█\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▆\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▆\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▆\u001b[39m \u001b[39m▁\n",
" 215 ms\u001b[90m Histogram: frequency by time\u001b[39m 258 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m9.11 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m100\u001b[39m."
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# On recent CPUs atomics are surprisingly fast; when they are used only sparingly, as here, this is good enough\n",
"@benchmark heavy_multi_atomic(1:100_000)"
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "a2fb8426-9d86-4b49-b034-68564cde0fe1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"heavy_multi_good (generic function with 1 method)"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function heavy_multi_good(a)\n",
" chunks = Iterators.partition(a, length(a) ÷ Threads.nthreads())\n",
" tasks = map(chunks) do chunk\n",
" Threads.@spawn heavy_single(chunk)\n",
" end\n",
" chunk_sums = fetch.(tasks)\n",
" return sum_single(chunk_sums) # an ordinary serial sum is fine here\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "5f828d49-d132-42bc-9423-7f3e06e8c628",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-4614.05735685724"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"heavy_multi_good(1:100_000)"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "fd928671-f338-4c6a-9f14-eaaf00c0ed2d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 23 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m213.069 ms\u001b[22m\u001b[39m … \u001b[35m250.759 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m217.223 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m221.545 ms\u001b[22m\u001b[39m ± \u001b[32m 10.476 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n",
"\n",
" \u001b[39m█\u001b[39m▃\u001b[39m█\u001b[39m \u001b[39m▃\u001b[34m \u001b[39m\u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▁\u001b[39m█\u001b[34m▇\u001b[39m\u001b[39m█\u001b[39m▁\u001b[39m▁\u001b[39m▇\u001b[39m▁\u001b[39m▇\u001b[39m▇\u001b[39m▇\u001b[32m▁\u001b[39m\u001b[39m▁\u001b[39m▇\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▇\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▇\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▇\u001b[39m▁\u001b[39m▁\u001b[39m▇\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▇\u001b[39m \u001b[39m▁\n",
" 213 ms\u001b[90m Histogram: frequency by time\u001b[39m 251 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m9.39 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m123\u001b[39m."
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark heavy_multi_good(1:100_000)"
]
},
{
"cell_type": "markdown",
"id": "4ec38836-c023-4f6e-ae01-b4b87e6e57bc",
"metadata": {},
"source": [
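"## Multiprocessing (Distributed.jl)\n",
"\n",
"Worker processes are required: start `julia` with the `-p` option (or add workers with `addprocs`). Compilation is slow, but once compiled the benchmark is reasonably fast."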
]
},
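{
"cell_type": "markdown",
"id": "sketch-distributed-pmap",
"metadata": {},
"source": [
"A minimal sketch of the Distributed.jl workflow, assuming workers are added from inside the session with `addprocs` instead of the `-p` flag. The worker count of 2 and the function `square` are arbitrary choices for illustration.\n",
"\n",
"```julia\n",
"using Distributed\n",
"\n",
"addprocs(2)   # start 2 local worker processes (assumed count)\n",
"\n",
"# A function must be defined on every process before workers can call it\n",
"@everywhere square(x) = x^2\n",
"\n",
"# pmap hands each element to a free worker and collects the results in order\n",
"results = pmap(square, 1:8)\n",
"total = sum(results)\n",
"\n",
"rmprocs(workers())   # shut the workers down again\n",
"```\n",
"\n",
"Each `pmap` call involves inter-process communication, so it pays off only when the work per element is much heavier than `square`."
]
},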
{
"cell_type": "code",
"execution_count": 58,
"id": "a6a5b42c-c8ba-4431-b64f-2979ee03f883",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"-4614.057356857093\n",
" 598.968 ms (612 allocations: 26.66 KiB)\n",
"-4614.057356857093\n"
]
},
{
"data": {
"text/plain": [
"Process(`\u001b[4mjulia\u001b[24m \u001b[4m-p\u001b[24m \u001b[4m8\u001b[24m \u001b[4mchap7_distributed.jl\u001b[24m`, ProcessExited(0))"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filename = \"chap7_distributed.jl\"\n",
"open(filename, \"w\") do fp\n",
" print(fp, \"\"\"\n",
"using BenchmarkTools\n",
"using Distributed\n",
"\n",
"@everywhere function heavycalc(i)\n",
" c = 0.0\n",
" for j in 1:1000\n",
" c += cos(i*j)\n",
" end\n",
" c\n",
"end\n",
"\n",
"@everywhere function heavy_everywhere(iproc, nprocs, n)\n",
" step = n // (nprocs - 1)\n",
" a = 1 + (iproc - 1) * step:min(iproc * step, n)\n",
" s = 0.0\n",
" for i in a\n",
" s += heavycalc(i)\n",
" end\n",
" s\n",
"end\n",
"\n",
"function heavy_distributed(n)\n",
" tasks = pmap(1:nprocs()) do i\n",
" heavy_everywhere(i, nprocs(), n)\n",
" end\n",
" return sum(tasks)\n",
"end\n",
"println(heavy_distributed(100_000))\n",
"println(@btime heavy_distributed(100_000))\"\"\")\n",
"end\n",
"run(`julia -p 8 $(filename)`) # このような記法でコマンドの実行ができるらしい"
]
},
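{
"cell_type": "markdown",
"id": "sketch-chunk-partition",
"metadata": {},
"source": [
"The scripts in this notebook split `1:n` into one contiguous block per process. A minimal sketch of that block decomposition follows, assuming the chunk size is rounded up with `cld` so that no trailing indices are dropped when `n` is not divisible by the number of chunks. The helper name `chunk_range` and the small values of `n` and `nchunks` are hypothetical and chosen only for illustration.\n",
"\n",
"```julia\n",
"# Hypothetical helper: the k-th of nchunks contiguous blocks of 1:n (k is 1-based).\n",
"function chunk_range(k, nchunks, n)\n",
"    step = cld(n, nchunks)  # ceiling division, so nchunks * step >= n\n",
"    return (1 + (k - 1) * step):min(k * step, n)\n",
"end\n",
"\n",
"n, nchunks = 10, 4\n",
"chunks = [chunk_range(k, nchunks, n) for k in 1:nchunks]\n",
"println(chunks)\n",
"\n",
"# Every index of 1:n appears exactly once across the chunks.\n",
"covered = reduce(vcat, collect.(chunks))\n",
"println(covered == collect(1:n))\n",
"```\n",
"\n",
"When there are more chunks than the rounded-up size needs, the later ranges are simply empty and contribute nothing to the sum."
]
},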
{
"cell_type": "markdown",
"id": "39b70244-c8f9-46fd-af2c-da6a918741b6",
"metadata": {},
"source": [
"## マルチプロセス (MPI)\n",
"https://www.r-ccs.riken.jp/outreach/schools/20230413-1/ にMPIの講義がある。 \n",
"ざっくりいうと、全プロセスが同じコードを動かしている。commはプロセッサたちをグループ分けするための機能だが、`COMM_WORLD`はプロセッサ全てが割り当たっている。 \n",
"`Comm_size` はプロセッサの数、`Comm_rank` は自身のプロセスが何番目のプロセッサに割り当たっているか。 \n",
"各プロセスが個別に計算を行って、 `Reduce` や `Allreduce` でプロセス間通信を行う。 `Reduce` は1プロセスに結果を集約する関数で、 rank `0` に割り当たっているプロセスに集約される。\n",
"\n",
"`MPI` パッケージのほか、MPIライブラリのパスを検出するために、 `MPIPreferences` パッケージもインストールが必要であった。\n",
"\n",
"コンパイル時間も込みで1.5sほどで行けた(Distributed.jlはコンパイルが遅かった)"
]
},
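{
"cell_type": "markdown",
"id": "sketch-reduce-allreduce",
"metadata": {},
"source": [
"The difference between `Reduce` and `Allreduce` can be seen in a small sketch. It assumes that MPI.jl is installed and that the file is launched under an MPI launcher such as `mpirun -np 4 julia script.jl` (the file name and process count are placeholders). Each process contributing its rank plus one is an arbitrary choice made for illustration.\n",
"\n",
"```julia\n",
"using MPI\n",
"\n",
"MPI.Init()\n",
"comm = MPI.COMM_WORLD\n",
"myrank = MPI.Comm_rank(comm)\n",
"nprocs = MPI.Comm_size(comm)\n",
"\n",
"# Each process contributes one value; here simply its rank plus one.\n",
"contribution = myrank + 1\n",
"\n",
"# Reduce: only the root (rank 0 by default) receives the combined value.\n",
"on_root = MPI.Reduce(contribution, MPI.SUM, comm)\n",
"\n",
"# Allreduce: every process receives the combined value.\n",
"on_all = MPI.Allreduce(contribution, MPI.SUM, comm)\n",
"\n",
"expected = nprocs * (nprocs + 1) ÷ 2  # 1 + 2 + ... + nprocs\n",
"if myrank == 0\n",
"    println(\"Reduce on root: \", on_root, \", expected: \", expected)\n",
"end\n",
"println(\"rank \", myrank, \" Allreduce: \", on_all)\n",
"\n",
"MPI.Finalize()\n",
"```\n",
"\n",
"`Reduce` is enough when only one process prints or stores the result, as in the script in the next cell; `Allreduce` is the usual choice when every process needs the combined value to continue its own computation."
]
},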
{
"cell_type": "code",
"execution_count": 1,
"id": "ff3b6da0-0c8c-4841-b429-0c55add175bf",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[36m\u001b[1m┌ \u001b[22m\u001b[39m\u001b[36m\u001b[1mInfo: \u001b[22m\u001b[39mMPI implementation identified\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m libmpi = \"libmpi\"\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m version_string = \"Open MPI v4.1.5, package: Open MPI builduser@chris-1 Distribution, ident: 4.1.5, repo rev: v4.1.5, Feb 23, 2023\\0\"\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m impl = \"OpenMPI\"\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m version = v\"4.1.5\"\n",
"\u001b[36m\u001b[1m└ \u001b[22m\u001b[39m abi = \"OpenMPI\"\n",
"\u001b[36m\u001b[1m┌ \u001b[22m\u001b[39m\u001b[36m\u001b[1mInfo: \u001b[22m\u001b[39mMPIPreferences unchanged\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m binary = \"system\"\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m libmpi = \"libmpi\"\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m abi = \"OpenMPI\"\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m mpiexec = \"mpiexec\"\n",
"\u001b[36m\u001b[1m│ \u001b[22m\u001b[39m preloads = Any[]\n",
"\u001b[36m\u001b[1m└ \u001b[22m\u001b[39m preloads_env_switch = nothing\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n",
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n",
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n",
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n",
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n",
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n",
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n",
"The latest version of Julia in the `release` channel is 1.10.1+0.x64.linux.gnu. You currently have `1.9.4+0.x64.linux.gnu` installed. Run:\n",
"\n",
" juliaup update\n",
"\n",
"to install Julia 1.10.1+0.x64.linux.gnu and update the `release` channel to that version.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"-4614.057356857094\n",
" 1.578316 seconds (8.71 k allocations: 684.477 KiB, 0.65% compilation time)\n"
]
},
{
"data": {
"text/plain": [
"Process(`\u001b[4mmpirun\u001b[24m \u001b[4m-np\u001b[24m \u001b[4m8\u001b[24m \u001b[4mjulia\u001b[24m \u001b[4mchap7_mpi.jl\u001b[24m`, ProcessExited(0))"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filename = \"chap7_mpi.jl\"\n",
"open(filename, \"w\") do fp\n",
" print(fp, \"\"\"\n",
"using BenchmarkTools\n",
"using MPI\n",
"\n",
"function heavycalc(i)\n",
" c = 0.0\n",
" for j in 1:1000\n",
" c += cos(i*j)\n",
" end\n",
" c\n",
"end\n",
"\n",
"function heavy_func(a)\n",
" s = 0.0\n",
" for i in a\n",
" s += heavycalc(i)\n",
" end\n",
" s\n",
"end\n",
"\n",
"function heavy_mpi(n)\n",
" comm = MPI.COMM_WORLD\n",
" nprocs = MPI.Comm_size(comm)\n",
" myrank = MPI.Comm_rank(comm)\n",
" step = n // nprocs\n",
" a = 1 + myrank * step:min((myrank + 1) * step, n)\n",
" s = 0.0\n",
" for i in a\n",
" s += heavycalc(i)\n",
" end\n",
" MPI.Reduce(s, MPI.SUM, comm)\n",
"end\n",
"\n",
"MPI.Init()\n",
"comm = MPI.COMM_WORLD\n",
"myrank = MPI.Comm_rank(comm)\n",
"result = heavy_mpi(100_000)\n",
"if myrank == 0\n",
" println(result)\n",
"end\n",
"MPI.Finalize()\"\"\")\n",
"end\n",
"\n",
"using MPIPreferences\n",
"MPIPreferences.use_system_binary()\n",
"@time run(`mpirun -np 8 julia $(filename)`)"
]
},
{
"cell_type": "markdown",
"id": "d018c2b6-3f7c-46f9-a027-051cad7ff84d",
"metadata": {},
"source": [
"# まとめ\n",
"- [Performance Tips](https://docs.julialang.org/en/v1/manual/performance-tips/) を読もう\n",
"- シングルノードでの並列計算はスレッドの利用がおすすめ"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df90e11c-6536-490a-9dc1-408c724b8c3a",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.9.4",
"language": "julia",
"name": "julia-1.9"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.9.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}