@brews
Last active October 31, 2023 16:22
Argo Workflow demo to launch a Dask distributed job on Kubernetes
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dask-test-
spec:
  entrypoint: dask
  activeDeadlineSeconds: 1800  # Safety first, kids!
  templates:
    - name: dask
      script:
        image: daskdev/dask:2.30.0
        env:
          - name: EXTRA_PIP_PACKAGES
            value: "dask-kubernetes dask[distributed]"
        command: ["python"]
        source: |
          # Hack to get the dask container entrypoint to fire off; otherwise
          # the container skips the EXTRA_PIP_PACKAGES install.
          import subprocess

          subprocess.run(
              ["bash", "/usr/bin/prepare.sh"], stdout=subprocess.PIPE, universal_newlines=True
          )

          # ...And now for something completely different...
          from dask.distributed import Client
          from dask_kubernetes import KubeCluster, make_pod_spec
          import dask.array as da

          # Pod template for each Dask worker.
          pod_spec = make_pod_spec(
              image="daskdev/dask:latest",
              memory_limit="500Mi",
              memory_request="500Mi",
              cpu_limit=1,
              cpu_request=1,
              env={"EXTRA_PIP_PACKAGES": "dask[distributed]"},
          )

          cluster = KubeCluster(pod_spec)
          cluster.scale(10)  # specify number of workers explicitly

          client = Client(cluster)

          array = da.ones((1000, 1000, 1000))
          print(f"OUR ANSWER IS: {array.mean().compute()} (...it should be 1.0 ...)")
        resources:
          limits:
            memory: 500Mi
            cpu: 500m
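
The workflow uses two Kubernetes quantity suffixes that are easy to mix up: `m` (milli, as in `cpu: 500m`) and `Mi` (mebibytes, as in `memory: 500Mi`). The following is a minimal sketch of how those strings map to base units. The helper `parse_quantity` is hypothetical, not part of any Kubernetes or Dask library, and it only handles a subset of suffixes.

```python
from fractions import Fraction

# Subset of Kubernetes quantity suffixes; enough for the values in this workflow.
_SUFFIXES = {
    "m": Fraction(1, 1000),  # milli: 500m of CPU is half a core
    "k": Fraction(1000),
    "M": Fraction(1000**2),
    "G": Fraction(1000**3),
    "Ki": Fraction(1024),
    "Mi": Fraction(1024**2),  # mebi: 500Mi of memory is 500 * 2**20 bytes
    "Gi": Fraction(1024**3),
}


def parse_quantity(text: str) -> Fraction:
    """Convert a quantity string such as '500Mi' or '500m' to base units.

    Hypothetical helper for illustration only.
    """
    # Try two-letter suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * _SUFFIXES[suffix]
    # No suffix: the value is already in base units (cores or bytes).
    return Fraction(text)


if __name__ == "__main__":
    print(parse_quantity("500Mi"))  # 524288000 (bytes)
    print(parse_quantity("500m"))   # 1/2 (of a core, or of a byte if used for memory)
```

Read this way, `500m` is a sensible CPU value but a nonsensical memory value, so it is worth double-checking the suffix on every memory field.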

delgadom commented Aug 5, 2020

this is sweet

brews (Author) commented Mar 2, 2022

⚠️ Keep in mind that the ServiceAccount running the above workflow needs permission to create pods in the namespace for this to work. Unfortunately, that is enough to escalate the workload's permissions in the namespace. Beware the security implications.


emileten commented Mar 8, 2022

@brews so, did you have to manually give this ServiceAccount the right permissions? As explained here in the RBAC section?

brews (Author) commented Mar 8, 2022

@emileten Yeah, and if you're working on a fresh, unconfigured cluster you also need to make sure the ServiceAccount has the permissions described at https://argoproj.github.io/argo-workflows/workflow-rbac/, in addition to those you pointed to in your link.
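
For readers setting this up, here is a rough sketch of the kind of Role and RoleBinding involved, built as plain Python dictionaries and printed as JSON (which `kubectl apply -f -` accepts). The resource names, the namespace, and the exact verb list are assumptions for illustration; check the dask-kubernetes RBAC section and the Argo page linked above for what your versions actually require.

```python
import json


def dask_role_manifest(name: str, namespace: str) -> dict:
    """Build a namespaced Role that lets its holder manage worker pods.

    The verb lists are an assumption, not an authoritative minimum.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group
                "resources": ["pods"],
                "verbs": ["get", "list", "watch", "create", "delete"],
            },
            {
                "apiGroups": [""],
                "resources": ["pods/log"],
                "verbs": ["get", "list"],
            },
        ],
    }


def role_binding_manifest(role: str, service_account: str, namespace: str) -> dict:
    """Bind the Role to the ServiceAccount that runs the workflow."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role}-binding", "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
    }


if __name__ == "__main__":
    # Hypothetical names; substitute your own namespace and ServiceAccount.
    manifests = [
        dask_role_manifest("dask-workers", "argo"),
        role_binding_manifest("dask-workers", "workflow-runner", "argo"),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

Keeping this as a namespaced Role rather than a ClusterRole limits the grant to one namespace, though as noted earlier in the thread, pod creation rights are still powerful within it.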


RamonGal commented Aug 4, 2023

thank you for this, this helped me on so many levels
