@jonatw
Last active June 18, 2018 14:31
#!/bin/bash
# please install aws-cli first
# easy_install pip
# pip install awscli
# project parameters
PROJECT_NAME='SomeCoolProject'
SAMPLE_COUNT=20;
INTERVAL=0.5;
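# sample GPU utilization (%) and free memory (MiB)
# note: nvidia-smi prints one line per GPU, so multi-GPU hosts produce more than SAMPLE_COUNT rows
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%S.000Z)  # -u because the trailing Z denotes UTC
TMP_FILE_NAME=/tmp/GPU_$TIMESTAMP
for (( c=1; c<=$SAMPLE_COUNT; c++ )); do nvidia-smi --query-gpu=utilization.gpu,memory.free --format=csv,noheader,nounits >> "$TMP_FILE_NAME"; sleep $INTERVAL; done;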
# aggregate the samples: column 1 is utilization, column 2 is free memory
# sort -n compares numerically; a plain sort would rank 100 below 9
GPU_USG_SUM=$(sed 's/, / /g' "$TMP_FILE_NAME" | awk '{sum+=$1}END{print sum}')
GPU_USG_MIN=$(sed 's/, / /g' "$TMP_FILE_NAME" | cut -d ' ' -f 1 | sort -n | head -n 1)
GPU_USG_MAX=$(sed 's/, / /g' "$TMP_FILE_NAME" | cut -d ' ' -f 1 | sort -n | tail -n 1)
GPU_MEM_SUM=$(sed 's/, / /g' "$TMP_FILE_NAME" | awk '{sum+=$2}END{print sum}')
GPU_MEM_MIN=$(sed 's/, / /g' "$TMP_FILE_NAME" | cut -d ' ' -f 2 | sort -n | head -n 1)
GPU_MEM_MAX=$(sed 's/, / /g' "$TMP_FILE_NAME" | cut -d ' ' -f 2 | sort -n | tail -n 1)
# gather AWS credentials from the instance metadata service and export them for aws-cli
BASE_URL='http://169.254.169.254/latest/meta-data'
# strip the trailing zone letter from the availability zone (us-east-1a -> us-east-1)
export AWS_DEFAULT_REGION=$(curl -s $BASE_URL/placement/availability-zone | sed -e 's/\([0-9][0-9]*\)[a-z]*$/\1/')
export AWS_ROLE_NAME=$(curl -s $BASE_URL/iam/security-credentials/)
export AWS_ACCESS_KEY_ID=$(curl -s $BASE_URL/iam/security-credentials/$AWS_ROLE_NAME | grep AccessKeyId | awk '{print $3}' | cut -d '"' -f 2)
export AWS_SECRET_ACCESS_KEY=$(curl -s $BASE_URL/iam/security-credentials/$AWS_ROLE_NAME | grep SecretAccessKey | awk '{print $3}' | cut -d '"' -f 2)
# AWS_SESSION_TOKEN is the name aws-cli reads; AWS_SECURITY_TOKEN is the older name, kept for compatibility
export AWS_SESSION_TOKEN=$(curl -s $BASE_URL/iam/security-credentials/$AWS_ROLE_NAME | grep Token | awk '{print $3}' | cut -d '"' -f 2)
export AWS_SECURITY_TOKEN=$AWS_SESSION_TOKEN
INSTANCE_ID=`curl -s $BASE_URL/instance-id`
IMAGE_ID=`curl -s $BASE_URL/ami-id`
INSTANCE_TYPE=`curl -s $BASE_URL/instance-type`
# publish one statistic set per metric (GpuMemory reports free memory, from memory.free)
aws cloudwatch put-metric-data --metric-name GpuUsage --namespace "$PROJECT_NAME" --statistic-values Sum=$GPU_USG_SUM,Minimum=$GPU_USG_MIN,Maximum=$GPU_USG_MAX,SampleCount=$SAMPLE_COUNT --timestamp "$TIMESTAMP" --unit 'Percent' --dimensions InstanceId=$INSTANCE_ID,ImageId=$IMAGE_ID,InstanceType=$INSTANCE_TYPE
aws cloudwatch put-metric-data --metric-name GpuMemory --namespace "$PROJECT_NAME" --statistic-values Sum=$GPU_MEM_SUM,Minimum=$GPU_MEM_MIN,Maximum=$GPU_MEM_MAX,SampleCount=$SAMPLE_COUNT --timestamp "$TIMESTAMP" --unit 'Megabytes' --dimensions InstanceId=$INSTANCE_ID,ImageId=$IMAGE_ID,InstanceType=$INSTANCE_TYPE
# remove the temporary sample file
rm -f "$TMP_FILE_NAME"
@jungopro

Hi

Great stuff, I started using it and it looks awesome.

But note that you set AWS_ROLE_NAME in line 27 and then use $ROLE_NAME in lines 28-30. You should modify the script to use the same variable name to make it work.
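
A minimal sketch of one way to make that change, assuming the role name is kept in AWS_ROLE_NAME throughout and the metadata service returns the usual `"Key" : "value"` lines. The helper names `extract_field` and `load_role_credentials` are hypothetical and not part of the original script:

```shell
#!/bin/bash
# Assumed metadata endpoint, same as in the script (no trailing slash here).
BASE_URL='http://169.254.169.254/latest/meta-data'

# extract_field NAME
# Reads the credentials JSON on stdin and prints the value stored under NAME.
# Uses the same grep/awk/cut approach as the script; the quotes around NAME
# keep "AccessKeyId" from also matching "SecretAccessKey".
extract_field() {
  grep "\"$1\"" | awk '{print $3}' | cut -d '"' -f 2
}

# load_role_credentials
# Looks the role name up once, then reuses that same variable for the
# credentials request, so the lookup can no longer hit an empty $ROLE_NAME.
load_role_credentials() {
  local creds
  AWS_ROLE_NAME=$(curl -s "$BASE_URL/iam/security-credentials/")
  creds=$(curl -s "$BASE_URL/iam/security-credentials/$AWS_ROLE_NAME")
  export AWS_ACCESS_KEY_ID=$(printf '%s\n' "$creds" | extract_field AccessKeyId)
  export AWS_SECRET_ACCESS_KEY=$(printf '%s\n' "$creds" | extract_field SecretAccessKey)
  export AWS_SESSION_TOKEN=$(printf '%s\n' "$creds" | extract_field Token)
}
```

On an EC2 instance with an attached role, calling `load_role_credentials` before the `aws cloudwatch` commands should be enough. Fetching the credentials document once also avoids three separate requests for the same data.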