@revett
Last active November 7, 2016 11:36
Tutum Zero Downtime Re-deploy



I tweeted Tutum last night asking whether they're planning to implement zero-downtime re-deploys for a given service. I was slightly surprised by their response, as it seems like a critical feature if you want to use the service in a production environment.

"not a top priority, but by Spring :)"

Tutum doesn't currently support graceful termination of containers within a service, so I was seeing a 5-10 second window of 503 errors during a re-deploy. I decided to use the following hack (code below) until the feature is officially implemented.

Solution

  1. Create two identical web app services.
  2. Link HAProxy service to both.
  3. Redeploy the first service and wait until it is running again.
  4. Wait 10 seconds.
  5. Redeploy the second service.

Note - use Tutum's HAProxy image so that it re-configures to new containers, and set the POLLING_PERIOD environment variable to 1.

Example Tutum Setup

tutum_setup.jpg

Notes

  • Could instead have a single service and redeploy the individual containers.
  • The code could be better; it would be cleaner as a Thor app.
  • Tutum is awesome!

require 'tutum'

module TutumHack
  # Redeploys the named services one at a time so that at least one of them
  # is always available behind HAProxy.
  class ZeroDowntime
    def initialize(service_names)
      @service_names = service_names
    end

    def start!
      ids.each do |id|
        redeploy id
        # Poll once per second until Tutum reports the service as running.
        until service_running? id
          print '.'
          sleep 1
        end
        puts 'Finished!'
        # Wait before touching the next service.
        sleep 10
      end
    end

    private

    attr_reader :service_names

    # Placeholder credentials; replace with your own.
    API_KEY = '123456789'
    USERNAME = 'batman'

    # UUIDs of the services whose names were passed to the constructor.
    def ids
      services.map do |service|
        service['uuid'] if known_service? service['name']
      end.compact
    end

    def known_service?(name)
      service_names.include? name
    end

    def redeploy(id)
      print "Re-deploying: #{id}"
      tutum.services.redeploy id
    end

    def service(id)
      tutum.services.get id
    end

    def services
      tutum.services.list['objects']
    end

    def service_running?(id)
      service(id)['state'] == 'Running'
    end

    # Memoized API client.
    def tutum
      @tutum ||= Tutum.new(USERNAME, API_KEY)
    end
  end
end

service_names = ['batcomputer-1', 'batcomputer-2']
TutumHack::ZeroDowntime.new(service_names).start!
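
The `until service_running? id` loop above polls forever if a service never comes back. A minimal sketch of a bounded wait is shown here; `wait_until`, `WaitTimeout`, and the `sleeper` parameter are illustrative names that are not part of the gist or the tutum gem, and the default timeout is an arbitrary assumption.

```ruby
# Hypothetical helper (not part of the tutum gem): poll a block until it
# returns a truthy value, or raise once the deadline has passed.
class WaitTimeout < StandardError; end

def wait_until(timeout: 120, interval: 1, sleeper: Kernel.method(:sleep))
  # Use the monotonic clock so wall-clock adjustments don't affect the deadline.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout}s"
    end
    sleeper.call(interval)
  end
end
```

Inside `start!` this could replace the bare loop with something like `wait_until(timeout: 300) { service_running? id }`, so a stuck redeploy raises instead of hanging before the second service is touched.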
@agonbina

agonbina commented Feb 5, 2015

Nice hack! Do you know if the HAProxy service stops sending transactions to a Service immediately as it enters 'terminating' mode?

@revett

revett commented Feb 5, 2015

Thanks 👍

HAProxy will only remove a service from its config when a POLL fails, so I set the POLLING_PERIOD environment variable to 1. One way around this is to first remove the link to a service from HAProxy, then re-deploy the service, and then re-link it.
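
The unlink, re-deploy, re-link order could be sketched roughly as follows. The `client` object is a hypothetical adapter, not the tutum gem's API: its `unlink`, `redeploy`, `running?`, and `link` methods are assumptions made purely to show the sequencing.

```ruby
# Sketch of the unlink -> redeploy -> relink sequence.
# `client` is a hypothetical adapter (NOT the tutum gem's API). It is assumed
# to respond to: unlink(proxy, service), redeploy(service),
# running?(service) and link(proxy, service).
def redeploy_unlinked(client, proxy, service,
                      poll_interval: 1, sleeper: Kernel.method(:sleep))
  client.unlink(proxy, service)   # stop routing traffic to the service first
  client.redeploy(service)
  # Poll until the service reports that it is running again.
  sleeper.call(poll_interval) until client.running?(service)
  client.link(proxy, service)     # only then put it back behind the proxy
end
```

With two services behind the proxy, calling this for each in turn means the proxy never routes to a service that is in the middle of a re-deploy.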

@borjaburgos

We've changed how redeploy works in Tutum to minimize downtime: containers are now redeployed sequentially (a container doesn't move to the terminated state until another one is starting). I tried it using this script:

#!/bin/bash
# Poll the load balancer in a tight loop and report every request that fails.
COUNTER=0
while :
do
    if curl -s --head http://lb-1.borjaburgos.cont.tutum.io | grep -q "200 OK"
    then
        :  # service is up; nothing to report
    else
        echo "Downtime! on Request $COUNTER"
    fi
    COUNTER=$((COUNTER + 1))
done

with a load-balanced service running 5 containers, and only had 2 requests experience downtime. Not a perfect solution, but a step closer to 0-downtime. Hope that helps!
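
For anyone who would rather run the same check from Ruby, a rough equivalent of the probe is sketched here. Unlike the bash loop it sends a fixed number of requests and returns the indexes that failed; `failed_requests`, `HEAD_STATUS`, and the 2-second timeouts are illustrative choices, not anything taken from Tutum.

```ruby
require 'net/http'
require 'uri'

# Default fetcher: send a HEAD request and return the status code as an
# Integer. Connection errors and timeouts are reported as 0 so that they
# count as downtime.
HEAD_STATUS = lambda do |url|
  uri = URI(url)
  begin
    Net::HTTP.start(uri.host, uri.port, open_timeout: 2, read_timeout: 2) do |http|
      http.head(uri.request_uri).code.to_i
    end
  rescue StandardError
    0
  end
end

# Send `requests` probes and return the 0-based indexes of those that did
# not come back with a 200.
def failed_requests(url, requests:, fetcher: HEAD_STATUS)
  (0...requests).reject { |_index| fetcher.call(url) == 200 }
end
```

Running something like `failed_requests('http://lb-1.borjaburgos.cont.tutum.io', requests: 1000)` while a re-deploy is in progress gives a count comparable to the "2 requests" figure above. The `fetcher` argument exists so the counting logic can be exercised without a network.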

@revett

revett commented Feb 22, 2015

Sweet! Is this set up by default for every multi-container service?

@borjaburgos

Yes, default. In the future we may allow for different redeployment strategies, but for now this is the default. It's also much faster now thanks to our dynamic linking + overlay networking (more info coming soon). We finally kissed goodbye to ambassadors.

Try it out and let us know about any issues. There may be some hiccups if the app (inside the container) takes more than a handful of seconds to be ready after the container starts.

Thanks for the feedback and support during our Beta!
