@Halliax
Halliax / gpu-spot-group.yaml
Last active January 10, 2020 21:15
A section of a CloudFormation template for a GPU mixed-instance node group
GPUSpotNodeGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    AutoScalingGroupName: !Sub "${ClusterName}-${NodeGroupName}"
    DesiredCapacity: !Ref NodeAutoScalingGroupDesiredSize # 0
    MinSize: !Ref NodeAutoScalingGroupMinSize # 0
    MaxSize: !Ref NodeAutoScalingGroupMaxSize # 10, arbitrarily
    MixedInstancesPolicy:
      InstancesDistribution:
        OnDemandBaseCapacity: !Ref OnDemandBaseCapacity # 0
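
The preview above is cut off after `OnDemandBaseCapacity`. For orientation only, a `MixedInstancesPolicy` in CloudFormation generally has the shape sketched below. The property names are standard CloudFormation; every value, the instance types, and the `GPULaunchTemplate` resource name are illustrative assumptions, not the gist's actual contents.

```yaml
# Fragment of an AWS::AutoScaling::AutoScalingGroup "Properties" block.
# All values below are assumptions chosen for illustration.
MixedInstancesPolicy:
  InstancesDistribution:
    OnDemandBaseCapacity: 0                  # no guaranteed On-Demand instances
    OnDemandPercentageAboveBaseCapacity: 0   # everything above the base is Spot
    SpotAllocationStrategy: capacity-optimized
  LaunchTemplate:
    LaunchTemplateSpecification:
      # GPULaunchTemplate is a hypothetical AWS::EC2::LaunchTemplate resource
      LaunchTemplateId: !Ref GPULaunchTemplate
      Version: !GetAtt GPULaunchTemplate.LatestVersionNumber
    Overrides:
      # Example GPU instance types; list whichever types the workload can use
      - InstanceType: p2.xlarge
      - InstanceType: p3.2xlarge
```

Listing several instance types under `Overrides` gives the group more Spot capacity pools to draw from, which is the usual reason for a mixed-instance group.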
Halliax / launch-template-user-data.yaml
Last active January 10, 2020 21:49
A section of an EC2 launch template demonstrating usage of the AMI bootstrap.sh script
Parameters:
  ...
  BootstrapArgumentsForSpotFleet:
    Description: Arguments to pass to the bootstrap script. See files/bootstrap.sh in https://github.com/awslabs/amazon-eks-ami
    Type: String
    Default: "--kubelet-extra-args '--node-labels=lifecycle=Ec2Spot,nvidia.com/gpu=true,k8s.amazonaws.com/accelerator=nvidia-tesla
      --register-with-taints=spotInstance=true:PreferNoSchedule,nvidia.com/gpu=true:NoSchedule'"
  ...
...
Resources:
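
The `Resources` section is elided in this preview. As a rough sketch of how such a parameter usually reaches the node, the launch template's user data can call the AMI's `/etc/eks/bootstrap.sh` with the cluster name followed by the extra arguments. The resource name `GPULaunchTemplate` is hypothetical, and `ClusterName` is assumed to be a template parameter, as the node group snippet suggests; this is not the gist's own resource definition.

```yaml
Resources:
  GPULaunchTemplate:                 # hypothetical resource name
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        UserData:
          # !Sub substitutes the template parameters before the script is encoded
          Fn::Base64: !Sub |
            #!/bin/bash
            set -o xtrace
            /etc/eks/bootstrap.sh ${ClusterName} ${BootstrapArgumentsForSpotFleet}
```

With the default shown above, the kubelet would register each node with the `lifecycle`, `nvidia.com/gpu`, and `k8s.amazonaws.com/accelerator` labels and with the two taints.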
Halliax / example-gpu-deployment.yaml
Last active February 24, 2020 21:59
Simple working example of a k8s Deployment configured to run on the GPU nodes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cuda-vector-add
  labels:
    app: cuda-vector-add
spec:
  replicas: 3
  selector:
    matchLabels:
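
The preview stops before the pod template. To land on nodes registered with the labels and taints from the bootstrap arguments, a pod needs a matching node selector and tolerations. The standalone Pod below is a minimal sketch of that scheduling configuration, not the gist's Deployment: the pod name and image are placeholders, and the `nvidia.com/gpu` resource limit assumes the NVIDIA device plugin is running on the nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-scheduling-sketch        # hypothetical name
spec:
  restartPolicy: OnFailure
  # Select nodes labeled nvidia.com/gpu=true by the bootstrap arguments
  nodeSelector:
    nvidia.com/gpu: "true"
  tolerations:
  # Required: the NoSchedule taint otherwise keeps the pod off GPU nodes
  - key: nvidia.com/gpu
    operator: Equal
    value: "true"
    effect: NoSchedule
  # Optional: PreferNoSchedule is a soft taint, tolerated here for clarity
  - key: spotInstance
    operator: Equal
    value: "true"
    effect: PreferNoSchedule
  containers:
  - name: main
    image: registry.example.com/cuda-vector-add:latest   # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1            # assumes the NVIDIA device plugin is installed
```

In a Deployment the same `nodeSelector`, `tolerations`, and `containers` fields would sit under `spec.template.spec`.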
Halliax / example-gpu-argo-workflow.yaml
Created February 24, 2020 21:59
Simple working example of an Argo Workflow configured to run on the GPU nodes
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: cuda-vector-add-
spec:
  entrypoint: main
  templates:
  - name: main
    # requires this pod to be run on an nvidia.com/gpu labeled node
    nodeSelector: