Created
April 15, 2020 16:17
-
-
Save hamelin/6dd43f14e3e5e34cd53853b134e95b27 to your computer and use it in GitHub Desktop.
Distributing Python computations with Dask
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Distributing Python computations with Dask Distributed\n", | |
"\n", | |
"First step is to install Dask and its Distributed companion package. Installing Bokeh along will provide the awesome visual cluster monitoring tool." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Requirement already satisfied: dask in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (2.14.0)\n", | |
"Requirement already satisfied: distributed in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (2.14.0)\n", | |
"Requirement already satisfied: bokeh in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (2.0.1)\n", | |
"Requirement already satisfied: cloudpickle>=0.2.2 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (1.3.0)\n", | |
"Requirement already satisfied: tornado>=6.0.3; python_version >= \"3.8\" in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (6.0.4)\n", | |
"Requirement already satisfied: psutil>=5.0 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (5.7.0)\n", | |
"Requirement already satisfied: tblib>=1.6.0 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (1.6.0)\n", | |
"Requirement already satisfied: msgpack>=0.6.0 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (1.0.0)\n", | |
"Requirement already satisfied: toolz>=0.8.2 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (0.10.0)\n", | |
"Requirement already satisfied: click>=6.6 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (7.1.1)\n", | |
"Requirement already satisfied: zict>=0.1.3 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (2.0.0)\n", | |
"Requirement already satisfied: pyyaml in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (5.3.1)\n", | |
"Requirement already satisfied: setuptools in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (46.1.1.post20200323)\n", | |
"Requirement already satisfied: sortedcontainers!=2.0.0,!=2.0.1 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from distributed) (2.1.0)\n", | |
"Requirement already satisfied: numpy>=1.11.3 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from bokeh) (1.18.2)\n", | |
"Requirement already satisfied: Jinja2>=2.7 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from bokeh) (2.11.1)\n", | |
"Requirement already satisfied: packaging>=16.8 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from bokeh) (20.3)\n", | |
"Requirement already satisfied: typing-extensions>=3.7.4 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from bokeh) (3.7.4.1)\n", | |
"Requirement already satisfied: pillow>=4.0 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from bokeh) (7.0.0)\n", | |
"Requirement already satisfied: python-dateutil>=2.1 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from bokeh) (2.8.1)\n", | |
"Requirement already satisfied: heapdict in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from zict>=0.1.3->distributed) (1.0.1)\n", | |
"Requirement already satisfied: MarkupSafe>=0.23 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from Jinja2>=2.7->bokeh) (1.1.1)\n", | |
"Requirement already satisfied: pyparsing>=2.0.2 in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from packaging>=16.8->bokeh) (2.4.6)\n", | |
"Requirement already satisfied: six in /home/ben/miniconda3/envs/crab/lib/python3.8/site-packages (from packaging>=16.8->bokeh) (1.14.0)\n" | |
] | |
} | |
], | |
"source": [ | |
"!pip install dask distributed bokeh" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import dask\n", | |
"import dask.distributed as dist\n", | |
"import multiprocessing as mp" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a Dask distributed computing cluster. Let's have one process for each CPU." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"20" | |
] | |
}, | |
"execution_count": 3, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"num_cpus = mp.cpu_count()\n", | |
"num_cpus" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"application/vnd.jupyter.widget-view+json": { | |
"model_id": "3d7c9f6314674865af78421996e27d3c", | |
"version_major": 2, | |
"version_minor": 0 | |
}, | |
"text/plain": [ | |
"VBox(children=(HTML(value='<h2>LocalCluster</h2>'), HBox(children=(HTML(value='\\n<div>\\n <style scoped>\\n …" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"cluster = dist.LocalCluster(n_workers=num_cpus, threads_per_worker=1)\n", | |
"cluster" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"See that link above? Open it in another window to track the status of your cluster, and the jobs you run on it.\n", | |
"\n", | |
"Now, let's figure out a lengthy computation you must run. This computation should have as entry point a *function*. Given the scope of this intro to Dask, let's assume it will write its results to disk, and that whatever it returns will of small size (like a path to the file storing the result, for instance).\n", | |
"\n", | |
"For the sake of example, I will run a computation that simply sleeps. Stupid but demonstrates the parallelization abilities of Dask." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import time\n", | |
"\n", | |
"\n", | |
"def work(num_seconds: float) -> None:\n", | |
" time.sleep(num_seconds)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Let's have a list of each task to make -- each corresponds to a number of seconds to sleep." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[1,\n", | |
" 2,\n", | |
" 3,\n", | |
" 4,\n", | |
" 5,\n", | |
" 6,\n", | |
" 7,\n", | |
" 8,\n", | |
" 9,\n", | |
" 10,\n", | |
" 11,\n", | |
" 12,\n", | |
" 13,\n", | |
" 14,\n", | |
" 15,\n", | |
" 16,\n", | |
" 17,\n", | |
" 18,\n", | |
" 19,\n", | |
" 20,\n", | |
" 21,\n", | |
" 22,\n", | |
" 23,\n", | |
" 24,\n", | |
" 25,\n", | |
" 26,\n", | |
" 27,\n", | |
" 28,\n", | |
" 29,\n", | |
" 30,\n", | |
" 31,\n", | |
" 32,\n", | |
" 33,\n", | |
" 34,\n", | |
" 35,\n", | |
" 36,\n", | |
" 37,\n", | |
" 38,\n", | |
" 39,\n", | |
" 40,\n", | |
" 41,\n", | |
" 42,\n", | |
" 43,\n", | |
" 44,\n", | |
" 45,\n", | |
" 46,\n", | |
" 47,\n", | |
" 48,\n", | |
" 49,\n", | |
" 50,\n", | |
" 51,\n", | |
" 52,\n", | |
" 53,\n", | |
" 54,\n", | |
" 55,\n", | |
" 56,\n", | |
" 57,\n", | |
" 58,\n", | |
" 59,\n", | |
" 60]" | |
] | |
}, | |
"execution_count": 6, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"tasks = [n + 1 for n in range(60)]\n", | |
"tasks" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"For each task, you want a *Dask-delayed* function to handle it. When you call a Dask-delayed function, you get a *plan* object to compute the result." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[Delayed('work-a800804d-b23a-4a35-986f-61dcf013cc48'),\n", | |
" Delayed('work-e9f16c9f-4b61-4d58-b93e-275de967e83c'),\n", | |
" Delayed('work-0a3c5326-9622-4a47-b5bc-258dbf50b0e0'),\n", | |
" Delayed('work-bbc580e7-c484-40e0-91c2-65723cf52cda'),\n", | |
" Delayed('work-08a1de31-4e97-4725-9e6b-8f22c70bbada'),\n", | |
" Delayed('work-d9c72999-0720-4dc8-a5a6-fc864ffd76ab'),\n", | |
" Delayed('work-4448211f-0dc3-4b6a-b5df-4da57ebb2ebb'),\n", | |
" Delayed('work-a7f70243-4f0f-4e08-9a61-c395275f18fe'),\n", | |
" Delayed('work-145e3c15-f90c-4ce7-b478-918d8c34e322'),\n", | |
" Delayed('work-ba5405e2-06f6-42ee-8674-24d327996c97'),\n", | |
" Delayed('work-d29b9dc3-de93-4ef8-9f60-479dbbe26fef'),\n", | |
" Delayed('work-e5c9849d-4853-4d97-a519-d6cba12380a2'),\n", | |
" Delayed('work-a2373324-eeaf-4107-846f-b1e19878ca42'),\n", | |
" Delayed('work-1744d1fd-026b-44fa-8262-504f4db466a5'),\n", | |
" Delayed('work-10503822-7324-4931-a373-cfb0821505af'),\n", | |
" Delayed('work-4d0c3a43-2641-4795-bfdc-8c5309d68e11'),\n", | |
" Delayed('work-93baffe1-5ee2-4ebd-9757-9229a7d9d5fd'),\n", | |
" Delayed('work-4c6ea42f-a6f1-4cd1-958f-129b40b16985'),\n", | |
" Delayed('work-3c99b8cf-c788-463a-ad8b-e50c998cb95b'),\n", | |
" Delayed('work-1b4f7403-9f03-4476-b882-48c48bd18130'),\n", | |
" Delayed('work-96484ee4-1fe6-4d47-8e4c-6248ba875d46'),\n", | |
" Delayed('work-84300e43-8e56-43d3-80bf-356af1e868dc'),\n", | |
" Delayed('work-aabb7e60-d9fc-4b45-9640-56bfa4b2dcc6'),\n", | |
" Delayed('work-705c536b-6f51-4b77-8915-6437ab506cff'),\n", | |
" Delayed('work-0374bd15-5387-4a38-b78e-b79fa1094e0b'),\n", | |
" Delayed('work-7bb59541-657d-406c-9ae2-37eeb79dd296'),\n", | |
" Delayed('work-b8ac649a-0a4b-4524-982b-616449e13c17'),\n", | |
" Delayed('work-fa4f8bff-9d82-46b3-8ead-08d1e1f8eebc'),\n", | |
" Delayed('work-9881940f-9015-434c-afb4-62c2875b7593'),\n", | |
" Delayed('work-92f18db9-9430-41e5-9d72-282ec6a61238'),\n", | |
" Delayed('work-25f45462-1655-4492-bfb1-17266204fc76'),\n", | |
" Delayed('work-34d8dca7-1103-4f2d-a9af-b069b9e1c65c'),\n", | |
" Delayed('work-80697218-ad57-474c-8a07-dfcc1fbf9445'),\n", | |
" Delayed('work-bfd5ff5a-10fb-45f5-b6fc-4272898963ba'),\n", | |
" Delayed('work-37834b37-2023-4883-a50c-7e14e3206e1b'),\n", | |
" Delayed('work-aeb862bf-f426-48d1-8524-93e16b7e22ff'),\n", | |
" Delayed('work-b9965052-3d43-4e44-a10a-00b188d74af4'),\n", | |
" Delayed('work-c5c7b4b9-71d4-483e-b190-f0536ce5d71f'),\n", | |
" Delayed('work-4435e172-3939-4bd9-86fb-d4d6bd8fe8a0'),\n", | |
" Delayed('work-39b3e219-e2de-481a-b994-4dd9375eb8d9'),\n", | |
" Delayed('work-0b0699bc-8bd9-4d11-945b-d2e8e46ae575'),\n", | |
" Delayed('work-b4c88650-7f71-4be7-a966-9087122012ca'),\n", | |
" Delayed('work-13de8836-91f7-4ce0-83f3-01133d4ff71b'),\n", | |
" Delayed('work-b2525563-8686-41cf-90f7-96d28e86bcb1'),\n", | |
" Delayed('work-c47e99e5-5e78-45de-b18a-899c73aa1913'),\n", | |
" Delayed('work-eee18fc2-1d5e-4ed2-b48b-20b2f6b6c382'),\n", | |
" Delayed('work-314950a7-1cee-4575-8779-9dab1c3e4eb1'),\n", | |
" Delayed('work-d88685ce-0e6c-42f1-a23a-d7ca6fd78654'),\n", | |
" Delayed('work-68702029-95b1-48f7-9e2d-d17a4fa51b44'),\n", | |
" Delayed('work-9fc72157-cc92-45bc-8a90-dc3125866e03'),\n", | |
" Delayed('work-a7558bf5-45f4-4ade-a750-0e255017d0d9'),\n", | |
" Delayed('work-7c189878-cd8d-45b8-a389-5e930a8897f4'),\n", | |
" Delayed('work-e7f1a54d-e709-49bd-9e13-b08328b1ea10'),\n", | |
" Delayed('work-82b55431-e255-4b64-a956-de7ef31c834e'),\n", | |
" Delayed('work-f89e9e4e-1f75-49e0-9ee0-22f72cf1792d'),\n", | |
" Delayed('work-c71ddbf5-2cc5-449e-b0eb-0efd6887c5e6'),\n", | |
" Delayed('work-f0d55eb3-0053-4460-a0ad-feacbe097316'),\n", | |
" Delayed('work-85f9b87e-3350-46f6-bd24-642d5aa123e3'),\n", | |
" Delayed('work-38aa1c15-a3b7-4651-b252-219b31875607'),\n", | |
" Delayed('work-07d55db6-f773-4ed0-aec5-81402d0ea9f0')]" | |
] | |
}, | |
"execution_count": 7, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"plans = [dask.delayed(work)(n) for n in tasks]\n", | |
"plans" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To execute all the plans, we need to connect to the cluster (with a `Client` instance) and `compute` them. You get a *future* object for each plan you thus compute." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[<Future: pending, key: work-a800804d-b23a-4a35-986f-61dcf013cc48>,\n", | |
" <Future: pending, key: work-e9f16c9f-4b61-4d58-b93e-275de967e83c>,\n", | |
" <Future: pending, key: work-0a3c5326-9622-4a47-b5bc-258dbf50b0e0>,\n", | |
" <Future: pending, key: work-bbc580e7-c484-40e0-91c2-65723cf52cda>,\n", | |
" <Future: pending, key: work-08a1de31-4e97-4725-9e6b-8f22c70bbada>,\n", | |
" <Future: pending, key: work-d9c72999-0720-4dc8-a5a6-fc864ffd76ab>,\n", | |
" <Future: pending, key: work-4448211f-0dc3-4b6a-b5df-4da57ebb2ebb>,\n", | |
" <Future: pending, key: work-a7f70243-4f0f-4e08-9a61-c395275f18fe>,\n", | |
" <Future: pending, key: work-145e3c15-f90c-4ce7-b478-918d8c34e322>,\n", | |
" <Future: pending, key: work-ba5405e2-06f6-42ee-8674-24d327996c97>,\n", | |
" <Future: pending, key: work-d29b9dc3-de93-4ef8-9f60-479dbbe26fef>,\n", | |
" <Future: pending, key: work-e5c9849d-4853-4d97-a519-d6cba12380a2>,\n", | |
" <Future: pending, key: work-a2373324-eeaf-4107-846f-b1e19878ca42>,\n", | |
" <Future: pending, key: work-1744d1fd-026b-44fa-8262-504f4db466a5>,\n", | |
" <Future: pending, key: work-10503822-7324-4931-a373-cfb0821505af>,\n", | |
" <Future: pending, key: work-4d0c3a43-2641-4795-bfdc-8c5309d68e11>,\n", | |
" <Future: pending, key: work-93baffe1-5ee2-4ebd-9757-9229a7d9d5fd>,\n", | |
" <Future: pending, key: work-4c6ea42f-a6f1-4cd1-958f-129b40b16985>,\n", | |
" <Future: pending, key: work-3c99b8cf-c788-463a-ad8b-e50c998cb95b>,\n", | |
" <Future: pending, key: work-1b4f7403-9f03-4476-b882-48c48bd18130>,\n", | |
" <Future: pending, key: work-96484ee4-1fe6-4d47-8e4c-6248ba875d46>,\n", | |
" <Future: pending, key: work-84300e43-8e56-43d3-80bf-356af1e868dc>,\n", | |
" <Future: pending, key: work-aabb7e60-d9fc-4b45-9640-56bfa4b2dcc6>,\n", | |
" <Future: pending, key: work-705c536b-6f51-4b77-8915-6437ab506cff>,\n", | |
" <Future: pending, key: work-0374bd15-5387-4a38-b78e-b79fa1094e0b>,\n", | |
" <Future: pending, key: work-7bb59541-657d-406c-9ae2-37eeb79dd296>,\n", | |
" <Future: pending, key: work-b8ac649a-0a4b-4524-982b-616449e13c17>,\n", | |
" <Future: pending, key: work-fa4f8bff-9d82-46b3-8ead-08d1e1f8eebc>,\n", | |
" <Future: pending, key: work-9881940f-9015-434c-afb4-62c2875b7593>,\n", | |
" <Future: pending, key: work-92f18db9-9430-41e5-9d72-282ec6a61238>,\n", | |
" <Future: pending, key: work-25f45462-1655-4492-bfb1-17266204fc76>,\n", | |
" <Future: pending, key: work-34d8dca7-1103-4f2d-a9af-b069b9e1c65c>,\n", | |
" <Future: pending, key: work-80697218-ad57-474c-8a07-dfcc1fbf9445>,\n", | |
" <Future: pending, key: work-bfd5ff5a-10fb-45f5-b6fc-4272898963ba>,\n", | |
" <Future: pending, key: work-37834b37-2023-4883-a50c-7e14e3206e1b>,\n", | |
" <Future: pending, key: work-aeb862bf-f426-48d1-8524-93e16b7e22ff>,\n", | |
" <Future: pending, key: work-b9965052-3d43-4e44-a10a-00b188d74af4>,\n", | |
" <Future: pending, key: work-c5c7b4b9-71d4-483e-b190-f0536ce5d71f>,\n", | |
" <Future: pending, key: work-4435e172-3939-4bd9-86fb-d4d6bd8fe8a0>,\n", | |
" <Future: pending, key: work-39b3e219-e2de-481a-b994-4dd9375eb8d9>,\n", | |
" <Future: pending, key: work-0b0699bc-8bd9-4d11-945b-d2e8e46ae575>,\n", | |
" <Future: pending, key: work-b4c88650-7f71-4be7-a966-9087122012ca>,\n", | |
" <Future: pending, key: work-13de8836-91f7-4ce0-83f3-01133d4ff71b>,\n", | |
" <Future: pending, key: work-b2525563-8686-41cf-90f7-96d28e86bcb1>,\n", | |
" <Future: pending, key: work-c47e99e5-5e78-45de-b18a-899c73aa1913>,\n", | |
" <Future: pending, key: work-eee18fc2-1d5e-4ed2-b48b-20b2f6b6c382>,\n", | |
" <Future: pending, key: work-314950a7-1cee-4575-8779-9dab1c3e4eb1>,\n", | |
" <Future: pending, key: work-d88685ce-0e6c-42f1-a23a-d7ca6fd78654>,\n", | |
" <Future: pending, key: work-68702029-95b1-48f7-9e2d-d17a4fa51b44>,\n", | |
" <Future: pending, key: work-9fc72157-cc92-45bc-8a90-dc3125866e03>,\n", | |
" <Future: pending, key: work-a7558bf5-45f4-4ade-a750-0e255017d0d9>,\n", | |
" <Future: pending, key: work-7c189878-cd8d-45b8-a389-5e930a8897f4>,\n", | |
" <Future: pending, key: work-e7f1a54d-e709-49bd-9e13-b08328b1ea10>,\n", | |
" <Future: pending, key: work-82b55431-e255-4b64-a956-de7ef31c834e>,\n", | |
" <Future: pending, key: work-f89e9e4e-1f75-49e0-9ee0-22f72cf1792d>,\n", | |
" <Future: pending, key: work-c71ddbf5-2cc5-449e-b0eb-0efd6887c5e6>,\n", | |
" <Future: pending, key: work-f0d55eb3-0053-4460-a0ad-feacbe097316>,\n", | |
" <Future: pending, key: work-85f9b87e-3350-46f6-bd24-642d5aa123e3>,\n", | |
" <Future: pending, key: work-38aa1c15-a3b7-4651-b252-219b31875607>,\n", | |
" <Future: pending, key: work-07d55db6-f773-4ed0-aec5-81402d0ea9f0>]" | |
] | |
}, | |
"execution_count": 8, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"client = dist.Client(cluster)\n", | |
"futures = client.compute(plans)\n", | |
"futures" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Each of the futures can be itself `compute`d in order to wait for the corresponding computation to complete. The result of computing a future is the return value of the application of the Dask-delayed function. If you want to wait for a all your futures, you can do so more easily by `gather`ing them. This call only returns once all the futures are done with." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None,\n", | |
" None]" | |
] | |
}, | |
"execution_count": 9, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"client.gather(futures)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"You can then run other computations against your cluster. You can also have Dask-delayed functions take up plans as parameters to structure dependency graphs... But I'll leave you to read the [official documentation](https://dask.org/) to understand that. To flush away your jobs once you're done:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"client.close()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Should the cluster get in a weird state, you can also restart all its worker processes:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n", | |
"distributed.nanny - WARNING - Restarting worker\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"<table style=\"border: 2px solid white;\">\n", | |
"<tr>\n", | |
"<td style=\"vertical-align: top; border: 0px solid white\">\n", | |
"<h3 style=\"text-align: left;\">Client</h3>\n", | |
"<ul style=\"text-align: left; list-style: none; margin: 0; padding: 0;\">\n", | |
" <li><b>Scheduler: </b>tcp://127.0.0.1:45683</li>\n", | |
" <li><b>Dashboard: </b><a href='http://127.0.0.1:8787/status' target='_blank'>http://127.0.0.1:8787/status</a>\n", | |
"</ul>\n", | |
"</td>\n", | |
"<td style=\"vertical-align: top; border: 0px solid white\">\n", | |
"<h3 style=\"text-align: left;\">Cluster</h3>\n", | |
"<ul style=\"text-align: left; list-style:none; margin: 0; padding: 0;\">\n", | |
" <li><b>Workers: </b>20</li>\n", | |
" <li><b>Cores: </b>20</li>\n", | |
" <li><b>Memory: </b>84.44 GB</li>\n", | |
"</ul>\n", | |
"</td>\n", | |
"</tr>\n", | |
"</table>" | |
], | |
"text/plain": [ | |
"<Client: 'tcp://127.0.0.1:45683' processes=20 threads=20, memory=84.44 GB>" | |
] | |
}, | |
"execution_count": 12, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"client = dist.Client(cluster)\n", | |
"client.restart()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"When all done, just close both the client and the cluster. Note that if you just kill your Python kernel, you also free these resources." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"client.close()\n", | |
"cluster.close()" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.8.2" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment