@JustinShenk
Last active January 22, 2024 20:45
Google Cloud Platform (GCP) instance idle shutdown
#!/bin/bash
# Add to instance metadata with `gcloud compute instances add-metadata \
# instance-name --metadata-from-file startup-script=idle-shutdown.sh` and reboot
# NOTE: requires `bc`, e.g., sudo apt-get install bc
# Modified from https://stackoverflow.com/questions/30556920/how-can-i-automatically-kill-idle-gce-instances-based-on-cpu-usage
threshold=0.1
count=0
wait_minutes=60
while true
do
load=$(uptime | sed -e 's/.*load average: //g' | awk '{ print $1 }') # 1-minute average load
load="${load//,}" # remove trailing comma
res=$(echo $load'<'$threshold | bc -l)
if (( $res ))
then
echo "Idling.."
((count+=1))
fi
echo "Idle minutes count = $count"
if (( count>wait_minutes ))
then
echo Shutting down
# wait a little bit more before actually pulling the plug
sleep 300
sudo poweroff
fi
sleep 60
done
@morphcatalyst

Justin, could you elaborate on how to implement this script? I am new to bin/bash/.sh stuff.
I have created a win10 vm in GCP and I want it to auto shut down after 60 min of inactivity.

Thanks,
Amit

@viveksamaga

If you are still looking for the answer by any chance, here it is:
Go to the GCP Cloud Shell, create the above file there, and run the command below.
gcloud compute instances add-metadata nested-vm-image1 --zone=<> --metadata-from-file startup-script=idle-shutdown.sh

@SwiftWinds

If you are still looking for the answer by any chance, here it is:
Go to the GCP Cloud Shell, create the above file there, and run the command below.
gcloud compute instances add-metadata nested-vm-image1 --zone=<> --metadata-from-file startup-script=idle-shutdown.sh

@viveksamaga I think @morphcatalyst was asking about a way to do this on a win10 VM. I'm in a similar situation. I don't think Windows works with bash, so it won't be able to execute that startup script?

@bcli4d

bcli4d commented Apr 7, 2021

Shouldn't count be reset to 0 if res is False?
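
A minimal sketch of the reset being asked about, so that only consecutive idle minutes are counted. `update_idle_count` is a hypothetical helper, not part of the gist, and it uses `awk` for the floating-point comparison instead of `bc`:

```shell
#!/bin/bash
# Sketch only: update_idle_count is a hypothetical helper, not part of the gist.
# Prints the new idle count: count + 1 when load < threshold, otherwise 0.
update_idle_count() {
  local count=$1 load=$2 threshold=$3
  # awk exits with status 0 when load < threshold (numeric comparison)
  if awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l + 0 < t + 0) }'; then
    echo $((count + 1))
  else
    echo 0   # any busy minute restarts the idle streak
  fi
}
```

Inside the polling loop this could be used as `count=$(update_idle_count "$count" "$load" "$threshold")`, with `load` taken from `uptime` as in the script above.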

@Nempickaxe

@jarrodonlo

I'm curious about @bcli4d's comment...

Shouldn't count be reset to 0 if res is False?

...isn't he right, shouldn't count=0?

@jarrodonlo

I'm curious (but not trying to imply your method is "wrong") why are you not running this as a cron job?

@jarrodonlo

I really appreciate the script. Idle shutdown is a feature that I want in a VM. It gives me peace of mind that my bill won't creep up for unused dev box hours. I see that GCP's Datalab has a beta VM Auto Shutdown feature, so maybe we'll get it baked in with other GCP VMs in the future. I believe that Azure already has such a feature baked in... but who has time to get familiar with another cloud and tooling?

@JustinShenk
Author

JustinShenk commented Jan 6, 2022

Good point, no reason not to do a cron job. Feel free to share your working script here as well.
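
For illustration, a cron-driven variant might look like the sketch below. Because each cron run is a fresh process, the idle count has to live in a file between runs. The state file path and the function names are hypothetical, and reading `/proc/loadavg` assumes a Linux guest:

```shell
#!/bin/bash
# Sketch of a cron-driven idle check; the path and names are hypothetical.
STATE_FILE="${STATE_FILE:-/var/tmp/idle-shutdown.count}"
threshold=0.1
wait_minutes=60

# Reads the saved count, updates it for the given load, then saves and prints it.
record_idle_minute() {
  local load=$1 count=0
  if [ -f "$STATE_FILE" ]; then
    count=$(cat "$STATE_FILE")
  fi
  if awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l + 0 < t + 0) }'; then
    count=$((count + 1))
  else
    count=0   # busy minute: restart the idle streak
  fi
  echo "$count" > "$STATE_FILE"
  echo "$count"
}

# Meant to be called once per cron run (assumed to be once a minute).
check_once() {
  local load count
  load=$(cut -d ' ' -f 1 /proc/loadavg)   # 1-minute load average
  count=$(record_idle_minute "$load")
  if (( count > wait_minutes )); then
    rm -f "$STATE_FILE"   # avoid a stale count after the next boot
    sudo poweroff
  fi
}
```

To use it, one would add a `check_once` call as the last line of the file and schedule it with a crontab entry such as `* * * * * /usr/local/bin/idle-check.sh` (a hypothetical install path). If it runs from root's crontab, the `sudo` is unnecessary.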

@jarrodonlo

I'm not as familiar with GCP VM metadata and startup scripts, but looking at it further I definitely see advantages (as opposed to cron) to doing it the way you are.

  1. No need to keep track of state between cron invocations.
  2. Seems more modular for other VMs and groups.

but I'm still curious..

Shouldn't count be reset to 0 if res is False?

@ravwojdyla

ravwojdyla commented Jan 29, 2022

Here's a version that suspends the VM (check the limitations of suspend) if the 15-minute load average stays below the threshold (default: 1 core) for an hour of consecutive checks. One way to run it is: nohup bash auto_shutdown.sh &>/dev/null & (in your startup script or manually).

#!/bin/bash

threshold=${1:-1}
intervals=${2:-60}
sleep_time=${3:-60}

function require() {
  if ! command -v "$1" >/dev/null; then
    echo "This script requires $1, aborting ..." >&2
    exit 1
  fi
}
require gcloud
require python3
require curl

if ! curl -s -i metadata.google.internal | grep "Metadata-Flavor: Google" >/dev/null; then
  echo "This script only works on GCE VMs, aborting ..." >&2
  exit 1
fi

COMPUTE_METADATA_URL="http://metadata.google.internal/computeMetadata/v1"
VM_PROJECT=$(curl -s "${COMPUTE_METADATA_URL}/project/project-id" -H "Metadata-Flavor: Google" || true)
VM_NAME=$(curl -s "${COMPUTE_METADATA_URL}/instance/hostname" -H "Metadata-Flavor: Google" | cut -d '.' -f 1)
VM_ZONE=$(curl -s "${COMPUTE_METADATA_URL}/instance/zone" -H "Metadata-Flavor: Google" | sed 's/.*zones\///')

count=0
while true; do
  load=$(uptime | sed -e 's/.*load average: //g' | awk '{ print $3 }')
  if python3 -c "exit(0) if $load >= $threshold else exit(1)"; then
    echo "Resetting count ..." >&2
    count=0
  else
    ((count+=1))
    echo "Idle #${count} at $load ..." >&2
  fi
  if ((count>intervals)); then
    if who | grep -v tmux 1>&2; then
      echo "Someone is logged in, won't shut down, resetting count ..." >&2
    else
      echo "Suspending ${VM_NAME} ..." >&2
      gcloud beta compute instances suspend "${VM_NAME}" --project "${VM_PROJECT}" --zone "${VM_ZONE}"
    fi
    count=0
  fi
  sleep $sleep_time
done

@inossidabile

inossidabile commented Nov 12, 2022

#!/bin/bash

threshold=0.1
count=0
wait_minutes=60

while true
do

load=$(uptime | sed -e 's/.*load average: //g' | awk '{ print $1 }') # 1-minute average load
load="${load//,}" # remove trailing comma
ssh_flag=$(ss | grep -i ssh | wc -l)
load_flag=$(echo $load'<'$threshold | bc -l)

if (( $load_flag ))
then
    echo "Idling CPU"
    if ! (( $ssh_flag ))
    then
        echo "Idling SSH"
        ((count+=1))
    else
        count=0
    fi
else
    count=0
fi
echo "Idle minutes count = $count"

if (( count>wait_minutes ))
then
    echo Shutting down
    sleep 300
    sudo poweroff
fi

sleep 60

done

@OptogeneticsandNeuralEngineeringCore

Thanks for this! Here is a version that doesn't require bc. It has flags at the top, and the script is disabled by default (useautoshutdown=false), so you have to edit it before it does anything, which ensures some understanding. It also checks for SSH connections, so you can keep using the VM and it will only start counting toward shutdown after the SSH session is closed. It also looks at GPU usage (via nvidia-smi) in case that is your thing.

#!/bin/bash
# Add to instance metadata with `gcloud compute instances add-metadata \
#   instance-name --metadata-from-file startup-script=idle-shutdown.sh` and reboot
# NOTE: This version does not require `bc`.
# Modified from https://stackoverflow.com/questions/30556920/how-can-i-automatically-kill-idle-gce-instances-based-on-cpu-usage
# ONE Core 5.Jan.2023

# User modification settings
## Flags
useautoshutdown=false # Should this script be used at all?? Default to no. Change to true to use
check_ssh=true        # Flag to enable/disable SSH connection check. Defaults to true, so if a SSH is open, it will not shut down
check_gpu=true        # Flag to enable/disable GPU utilization check
check_cpu=true        # Flag to enable/disable CPU utilization check
## Settings
threshold_cpu=10     # Average (over 1 min) of CPU usage. Defaulted as 10%
threshold_gpu=10      # GPU utilization percentage threshold
wait_minutes=10       # Time, in minutes, that the CPU/GPU usage must stay under the threshold before the VM is shut down. Note that the script waits a few more seconds (see the sleep before poweroff) to allow the VM to sort itself out a bit.

# Initialization of variables
count=0
ssh_resolution_flag=false
cpu_resolution_flag=false
gpu_resolution_flag=false

# Code
if [ "$useautoshutdown" == false ]; then # Check if useautoshutdown is false, and if so, exit the script
    echo "Auto shutdown is disabled. Exiting script."
    exit 0
else # Otherwise, loop forever
    while true
    do
    if [ "$useautoshutdown" == true ]; then
        current_time=$(date +"%Y-%m-%d %H:%M:%S")

        if $check_ssh; then
            active_sessions=$(ss | grep -i ssh | wc -l) #who | grep -c "pts/")
            if [ "$active_sessions" == 0 ]; then 
                ssh_resolution_flag=true
                echo "SSH flag set to true." 
                echo "  No SSH detected."
            else # SSH is connected
                ssh_resolution_flag=false
                echo "SSH flag set to true." 
                echo "  Found SSH connections. Number of SSH detected: $active_sessions"
            fi
        else # Skip SSH check if the flag is disabled
            ssh_resolution_flag=true
            echo "SSH flag set to false, will not check"
        fi
        
        if $check_cpu; then
            cpu_utilization=$(uptime | sed -e 's/.*load average: //g' | awk '{ printf("%.0f", $1 * 10) }') # 1-minute load average scaled by 10 (a load of 1.0 reads as 10)
            resolution_cpu=$((cpu_utilization < threshold_cpu)) # Set to 1 if CPU utilization is less than the threshold, and 0 if it's not
            echo "CPU flag set to true."
            echo "  At time: $current_time, cpu load: $cpu_utilization %"
            echo "  CPU threshold set to: $threshold_cpu %"
            if [ "$resolution_cpu" -eq 0 ]; then # CPU above thresh
                cpu_resolution_flag=false
                echo "  CPU found to be above threshold. Not idling"
            else # CPU found to be below threshold
                cpu_resolution_flag=true 
                echo "  CPU found to be below threshold. Considered idling"
            fi
        else # Skip CPU check if the flag is disabled
            cpu_resolution_flag=true
            echo "CPU flag set to false, will not check. Considered idling"
        fi

        if $check_gpu; then
            gpu_utilization=$(nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits)
            resolution_gpu=$((gpu_utilization < threshold_gpu)) # Set to 1 if GPU utilization is less than the threshold, and 0 if it's not
            echo "GPU flag set to true."
            echo "  At time: $current_time, GPU load: $gpu_utilization %"
            echo "  GPU threshold set to: $threshold_gpu %"
            if [ "$resolution_gpu" -eq 0 ]; then
                gpu_resolution_flag=false
                echo "  GPU found to be above threshold. Not idling"
            else # GPU found to be below threshold
                gpu_resolution_flag=true
                echo "  GPU found to be below threshold. Considered idling"
            fi
        else # Skip GPU check if the flag is disabled
            gpu_resolution_flag=true
            echo "GPU flag set to false, will not check"
        fi

        if $ssh_resolution_flag && $cpu_resolution_flag && $gpu_resolution_flag; then
            ((count+=1))
            echo "Because of settings and observed loads and SSH connections, the VM is considered idling. Will increase idle time"
            echo "  Time in minutes in idle: $count"
        else # If ANY flag is found to be false, we reset the counter
            count=0
            echo "Because of settings and observed loads and SSH connections, the VM is considered to be working. Will reset the timer"
        fi

        if [ $count -gt $wait_minutes ]; then
            echo "Shutting down. Peace out"
            sleep 5 # wait a little bit more before actually pulling the plug
            sudo poweroff
        fi

        sleep 60 # Sleep for 1 minute to check CPU usage and SSH connection status every minute
    fi
    done
fi
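
One caveat with the GPU check above: the `nvidia-smi` query prints one line per GPU, so on a multi-GPU VM the arithmetic comparison would likely fail on the multi-line value. A hedged sketch of one way around it is to reduce the output to the busiest GPU first. `max_gpu_utilization` is a hypothetical helper, not part of the script:

```shell
#!/bin/bash
# Sketch only: max_gpu_utilization is a hypothetical helper.
# Reads one utilization value per line on stdin and prints the largest one
# (prints 0 when there is no input).
max_gpu_utilization() {
  awk 'BEGIN { max = 0 } { v = $1 + 0; if (v > max) max = v } END { print max }'
}

# Intended use on a machine with the NVIDIA driver installed:
#   gpu_utilization=$(nvidia-smi --query-gpu=utilization.gpu \
#       --format=csv,noheader,nounits | max_gpu_utilization)
```

Taking the maximum means the VM only counts as idle when every GPU is below the threshold, which seems the safer choice for a shutdown script.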
