Instructions for enabling NVIDIA GPU time-slicing in Kubernetes
  1. Define the time-slicing configuration.

    a. Create a file called time-slicing-config-all.yaml with the following contents:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: time-slicing-config-all
    data:
      any: |-
        version: v1
        flags:
          migStrategy: none
        sharing:
          timeSlicing:
            resources:
            - name: nvidia.com/gpu
              replicas: 4  # each physical GPU is advertised as 4 shareable replicas

    b. Create the namespace for the operator and add the ConfigMap:

    kubectl create ns gpu-operator
    kubectl create -n gpu-operator -f time-slicing-config-all.yaml
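
    c. (Optional) Confirm the ConfigMap is in place before installing the operator; this is a plain kubectl check using the names defined above:

    kubectl get configmap time-slicing-config-all -n gpu-operator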
  2. Add NVIDIA's Helm repository and install the GPU Operator

See: docs.nvidia

# Alternative repo for the standalone k8s-device-plugin (not needed when using the GPU Operator):
# helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
    -n gpu-operator \
    --set devicePlugin.config.name=time-slicing-config-all
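
Optionally, before moving on, confirm the operator workloads came up. This is a generic kubectl check, not something the gist prescribes; the exact pod names depend on the operator version:

kubectl get pods -n gpu-operator
# The operator, device-plugin and validator pods should reach Running/Completed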
  3. Apply the time-slicing configuration

See docs.nvidia

# Set the default time-slicing configuration on the operator's ClusterPolicy
kubectl patch clusterpolicy/cluster-policy \
    -n gpu-operator --type merge \
    -p '{"spec": {"devicePlugin": {"config": {"name": "time-slicing-config-all", "default": "any"}}}}'

# Label the node(s) that should use the "any" time-slicing configuration
kubectl label node <node-name> nvidia.com/device-plugin.config=any
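
As an optional check that the label and ClusterPolicy change took effect (plain kubectl, with <node-name> as above):

# Nodes carrying the time-slicing label
kubectl get nodes -l nvidia.com/device-plugin.config=any

# The device-plugin config now referenced by the ClusterPolicy
kubectl get clusterpolicy cluster-policy -o jsonpath='{.spec.devicePlugin.config}'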
  4. Verify time-slicing

See docs.nvidia

# Capacity should now report nvidia.com/gpu as replicas x physical GPUs
# (e.g. 4 on a single-GPU node) instead of the physical count
kubectl describe node <node-name>
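
As a functional test (not part of the steps above; the Deployment name and image tag are illustrative), a workload that requests more GPU replicas than the node has physical GPUs should still schedule fully once time-slicing is active, e.g. 4 pods on a single-GPU node:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: time-slicing-test        # illustrative name
spec:
  replicas: 4                    # more pods than physical GPUs
  selector:
    matchLabels:
      app: time-slicing-test
  template:
    metadata:
      labels:
        app: time-slicing-test
    spec:
      containers:
      - name: cuda
        image: nvidia/cuda:12.2.0-base-ubuntu22.04   # any CUDA-capable image works
        command: ["sleep", "infinity"]
        resources:
          limits:
            nvidia.com/gpu: 1    # each pod requests one (time-sliced) GPU replica

Apply the manifest with kubectl apply -f and check kubectl get pods -o wide: all four pods should end up Running, sharing the same physical GPU.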