@zonca
Last active April 28, 2022 18:41
Test Dask Gateway
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Upgrade packages\n",
"\n",
"Dask Gateway 0.9.0 needed an older version of `dask`;\n",
"with a newer version of the Dask Gateway Helm chart this is no longer necessary, so we install the latest releases:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade dask distributed dask-gateway"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'2022.4.0'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import dask_gateway, dask, distributed\n",
"dask_gateway.__version__"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'2022.4.1'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"distributed.__version__"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'2022.04.1'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dask.__version__"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from dask_gateway import Gateway"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We access Dask Gateway through the JupyterHub service, authenticating with our JupyterHub credentials:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gateway = Gateway(\n",
"    address=\"http://traefik-dask-gateway/services/dask-gateway/\",\n",
"    public_address=\"https://sg.zonca.dev/services/dask-gateway/\",\n",
"    auth=\"jupyterhub\")\n",
"gateway.list_clusters()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"options = gateway.cluster_options()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "04dce4262ee04e8cb0f1ff30b12f8ba2",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='<h2>Cluster Options</h2>'), GridBox(children=(HTML(value=\"<p style='font-weight: bo…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"options"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"cluster = gateway.new_cluster(options)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"cluster.scale(2)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[ClusterReport<name=jhub.83e3ee6ec1204fde9a77e4425026dfd7, status=RUNNING>]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gateway.list_clusters()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/client.py:1289: VersionMismatchWarning: Mismatched versions found\n",
"\n",
"+-------------+--------+-----------+---------+\n",
"| Package | client | scheduler | workers |\n",
"+-------------+--------+-----------+---------+\n",
"| cloudpickle | 1.6.0 | 2.0.0 | None |\n",
"| msgpack | 1.0.2 | 1.0.3 | None |\n",
"| toolz | 0.11.1 | 0.11.2 | None |\n",
"+-------------+--------+-----------+---------+\n",
"Notes: \n",
"- msgpack: Variation is ok, as long as everything is above 0.6\n",
" warnings.warn(version_module.VersionMismatchWarning(msg[0][\"warning\"]))\n"
]
}
],
"source": [
"client = cluster.get_client()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Access the Dask dashboard"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "810b2bc8145849609827ae4d1ff57e80",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='<h2>GatewayCluster</h2>'), HBox(children=(HTML(value='\\n<div>\\n<style scoped>\\n …"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cluster"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run an example distributed computation"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"import dask.bag as db"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"data = list(range(1,9))\n",
"b = db.from_sequence(data)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"dask.bag<slow_half, npartitions=8>"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from time import sleep\n",
"\n",
"def slow_half(x):\n",
"    sleep(1)\n",
"    return x // 2\n",
"\n",
"res = b.map(slow_half)\n",
"res"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 53.9 ms, sys: 50.9 ms, total: 105 ms\n",
"Wall time: 4.23 s\n"
]
},
{
"data": {
"text/plain": [
"[0, 1, 1, 2, 2, 3, 3, 4]"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"res.compute()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### With Dask array\n",
"\n",
"Unfortunately this does not work on this deployment because the default workers do not have `numpy` installed, so the cell below is left unexecuted:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import dask.array as da\n",
"a = da.random.normal(size=(10000, 10000), chunks=(500, 500))\n",
"a.mean().compute()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Uncomment to shut down the cluster and release its workers when done\n",
"# cluster.shutdown()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
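
The bag example in the notebook maps `slow_half` over 8 items, each sleeping for 1 second, and reports a wall time of about 4 s after `cluster.scale(2)`. That is consistent with the 8 tasks being split across 2 workers. The following is a minimal local sketch of the same arithmetic that does not need a Dask Gateway deployment: it swaps the Dask cluster for a standard-library thread pool, so it only illustrates the parallel map, not how Dask schedules tasks. The `run_local` helper and the `delay` parameter are assumptions added for illustration and are not part of the notebook.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def slow_half(x, delay=1.0):
    """Same computation as the notebook's slow_half, with a configurable sleep."""
    sleep(delay)
    return x // 2


def run_local(data, n_workers=2, delay=1.0):
    """Map slow_half over data using n_workers threads.

    Hypothetical helper: a thread pool stands in for the 2 Dask workers.
    Returns the list of results (in input order) and the elapsed seconds.
    """
    start = perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda x: slow_half(x, delay), data))
    return results, perf_counter() - start
```

Calling `run_local(list(range(1, 9)))` returns the same list as the notebook, `[0, 1, 1, 2, 2, 3, 3, 4]`, and the elapsed time should be close to 4 s with 2 workers; passing a smaller `delay` makes the check faster.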
@zonca
Author

zonca commented Aug 11, 2020

Screenshots (images not preserved): the Widgets, the Dashboard, and the Workers tab.

@zonca
Author

zonca commented Jan 28, 2022

I have decommissioned the cluster, so the details are no longer relevant.
