@akbertram
Last active January 6, 2024 11:19
Pair of scripts to start and stop GCE VMs when needed using the Command Launcher Plugin

Intro

These scripts are a somewhat complicated way of using a stateful GCE VM as a Jenkins agent.

Rather than creating a fresh instance every time an agent is needed, as the GCE Plugin currently does, an existing VM is simply started and stopped as needed.

This can yield significant speedups in build time, as the VM's persistent disk retains Gradle's dependency caches, previously downloaded Docker images, etc.

If the build system, like Gradle, supports task avoidance, and you are confident enough in your configuration to rebuild without cleaning, then you can gain further speedups without the complications of a distributed cache.

Set up

Copy these scripts to the Jenkins controller in /var/lib/jenkins (or somewhere else convenient).

Create a new "Permanent Agent" node.

Select "Launch agent via execution of command on the controller" as the Launch method.

Enter the script location, together with the instance zone and name as arguments. For example:

/var/lib/jenkins/start-agent.sh europe-west4-b jenkins-agent-small-01
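
The start script reads these two arguments positionally and does not check them. A hedged sketch of a check that could be added near the top of start-agent.sh; `parse_launch_args` is a hypothetical helper, not part of the original script:

```shell
# Hypothetical helper: validate the ZONE and INSTANCE arguments that the
# Jenkins launch command passes to start-agent.sh.
parse_launch_args() {
  if [ "$#" -ne 2 ]
  then
    echo "Usage: start-agent.sh ZONE INSTANCE" >&2
    return 64    # EX_USAGE from sysexits.h
  fi
  ZONE=$1
  INSTANCE=$2
}
```

Inside the script it would be called as `parse_launch_args "$@" || exit $?`, so that a misconfigured launch command fails with a readable message in the agent log.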

Note the instance you specify must already be created, though it can be stopped. You'll need to manually (or with Terraform and Packer) set up the instances with Java 11+ and all the required build tools installed.
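
Before registering the node, it can help to confirm that the instance exists and to see its current state. A minimal sketch, assuming `gcloud` is on the PATH and already authenticated; `instance_status` is a hypothetical helper name, not part of the scripts in this gist:

```shell
# Hypothetical helper: print the status of an existing instance
# (for example RUNNING, or TERMINATED for a stopped instance).
# gcloud exits with a non-zero code if the instance is not found.
instance_status() {
  local zone=$1 instance=$2
  gcloud compute instances describe "$instance" \
    --zone "$zone" --format='value(status)'
}
```

With the zone and name from the launch command above, this would be called as `instance_status europe-west4-b jenkins-agent-small-01`.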

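Then install the following crontab entry on the controller to stop instances whose agents have disconnected:

# Check every minute for instances that can be stopped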
* * * * * /var/lib/jenkins/stop-disconnected-agents.sh

start-agent.sh

#!/bin/bash
ZONE=$1
INSTANCE=$2
JENKINS_HOME=/var/lib/jenkins
# Record our process ID (PID) in a file under the instances/ directory
# in the Jenkins home directory. The stop-disconnected-agents.sh script uses
# this file to clean up afterwards.
# Unfortunately the Jenkins Command Launcher plugin terminates the script rather than
# sending a TERM signal (https://issues.jenkins.io/browse/JENKINS-50842),
# so we can't rely on trapping the signal here. We instead rely on cron to
# periodically clean up.
mkdir -p "$JENKINS_HOME/instances"
echo $$ > "$JENKINS_HOME/instances/$INSTANCE"
echo "Starting instance..."
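gcloud compute instances start --zone "$ZONE" "$INSTANCE"
# Save the exit code immediately: every later command, including the
# [ ] test and echo, overwrites $?
STATUS=$?
if [ $STATUS -ne 0 ]
then
    echo "Failed to start $INSTANCE: exit code $STATUS"
    exit $STATUS
fi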
echo "Copying remoting.jar to agent..."
RETRY_COUNT=0
while true
do
    gcloud compute scp --zone "$ZONE" --internal-ip "$JENKINS_HOME"/%C/jenkins/war/WEB-INF/lib/remoting-*.jar "$INSTANCE:~/remoting.jar"
    if [ $? -eq 0 ]
    then
        echo "Successfully connected and updated remoting.jar"
        break
    elif [ $RETRY_COUNT -gt 5 ]
    then
        echo "Giving up after $RETRY_COUNT retries."
        exit 1
    else
        echo "Failed to connect, retrying after 10 seconds..."
        RETRY_COUNT=$((RETRY_COUNT + 1))
        sleep 10
    fi
done
echo "Connecting..."
gcloud compute ssh --zone $ZONE --internal-ip $INSTANCE -- 'java -jar remoting.jar -workDir /home/jenkins -jar-cache /home/jenkins/remoting/jarCache'
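
The retry loop in start-agent.sh could also be factored into a reusable function. The sketch below is one possible shape, not the author's code: `retry` is a hypothetical helper that runs a command up to a maximum number of attempts, sleeps between failures, and returns the exit code of the last attempt. It stores `$?` in a variable right away, because any later command (including a `[` test) overwrites it.

```shell
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1 status=0
  while true
  do
    "$@"
    status=$?    # save immediately; the next command overwrites $?
    if [ "$status" -eq 0 ]
    then
      return 0
    elif [ "$attempt" -ge "$max_attempts" ]
    then
      echo "Giving up after $attempt attempts (last exit code $status)." >&2
      return "$status"
    fi
    echo "Attempt $attempt failed, retrying after $delay seconds..." >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

It could wrap the `gcloud compute scp` call, for example with six attempts ten seconds apart, which keeps the main script linear and preserves the real exit code for the Jenkins agent log.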

stop-disconnected-agents.sh

#!/bin/bash
set -e
# This is a script which runs periodically to stop any instances
# which have been disconnected from Jenkins
JENKINS_HOME=/var/lib/jenkins
ZONE=europe-west4-b
# Note that the path here needs to be hardcoded because cron does not
# provide the complete PATH defined for the Jenkins user's shell
GCLOUD=/snap/bin/gcloud
for PID_FILE in "$JENKINS_HOME"/instances/*
do
    # Skip the unmatched glob when the directory is empty
    [ -e "$PID_FILE" ] || continue
    INSTANCE=$(basename "$PID_FILE")
    AGENT_PID=$(cat "$PID_FILE")
    echo "$INSTANCE running on pid $AGENT_PID"
    if [ ! -e "/proc/$AGENT_PID" ]
    then
        echo "$INSTANCE no longer running. Stopping..."
        "$GCLOUD" compute instances stop --zone "$ZONE" "$INSTANCE"
        rm "$PID_FILE"
        echo "Stopped"
    fi
done
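
To see which instances the cleanup script would stop without actually stopping anything, the liveness check can be pulled out into a function. A minimal sketch, assuming the PID files are laid out as in start-agent.sh (one file per instance, containing the launcher PID); `list_stale_instances` is a hypothetical name, and `kill -0` is used here as an alternative to testing for `/proc/$AGENT_PID`:

```shell
# Hypothetical helper: print the name of every instance whose recorded
# launcher PID no longer refers to a running process.
list_stale_instances() {
  local dir=$1 pid_file pid
  for pid_file in "$dir"/*
  do
    [ -f "$pid_file" ] || continue    # also skips the unmatched glob
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only checks that the process exists
    # and that the current user is allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null
    then
      basename "$pid_file"
    fi
  done
}
```

Running `list_stale_instances /var/lib/jenkins/instances` as the same user that launches the agents gives a dry-run view of the cron job. Run as a different user, `kill -0` can fail on permissions and report a live launcher as stale.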