# Deploy a Single-Machine Multi-GPU Cluster

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from dask_cuda.utils import get_device_total_memory

protocol = "tcp"            # "tcp" or "ucx"
visible_devices = "0,1,2,3" # Select devices to place workers
device_spill_frac = 0.9     # Spill GPU-worker memory to host at this limit.
                            # Reduce if spilling fails to prevent
                            # device memory errors.
dask_workdir = "/tmp/dask-workspace"  # Assumed scratch directory for worker spill files
capacity = get_device_total_memory()  # Total memory of device 0, in bytes

cluster = None              # (Optional) Specify existing scheduler port
if cluster is None:
    cluster = LocalCUDACluster(
        protocol=protocol,
        CUDA_VISIBLE_DEVICES=visible_devices,
        local_directory=dask_workdir,
        device_memory_limit=capacity * device_spill_frac,
    )

# Create the distributed client
client = Client(cluster)
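
# Optional sanity check (an addition, not part of the original gist):
# confirm the cluster is reachable and that one worker started per
# visible device. Both calls are standard dask.distributed Client APIs.
print(client.dashboard_link)  # URL of the diagnostic dashboard
n_workers = len(client.scheduler_info()["workers"])
assert n_workers == len(visible_devices.split(","))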