@mikepea
Last active August 29, 2015 14:08
MongoDB upgrade plan

Mongo Migration Plan

Pre-migration work

This work can be done at any time before the migration, but because it provisions running Skyscape instances, it makes sense to do it only once we have committed to a migration date.

  • Switch the current mongo nodes over to vXXX (TODO) of mongodb-formula and run an update. This will reduce the management of the existing mongodb nodes, but still allow us to effect change on them.

  • Ensure we have graphite visibility of 500 errors, 499 errors, site traffic.

  • Add mongo-04, mongo-05, mongo-06 nodes:

    • via vcloud-launch to provision
    • to project.yaml
    • with 'mongodb', 'mongodbnew' roles initially.
  • Ensure hourly backups are working on all mongo nodes.

  • Test a recovery as per the docs onto mongo-04+

  • Ensure communications from aaa-01 to mongo-04+ are open.

  • Ensure we can scp from mongo-01 to mongo-04

  • Reset the databases on mongo-04+ prior to migration:

sudo stop mongod
sudo rm -rf /data/mongodb
sudo salt-call state.highstate
sudo mongo_initiate_replica_set -u admin -p {password} rs0 mongodb-04.local
  • Lower the TTL for the DNS A records to 5 seconds if possible (otherwise, the lowest permissible value).
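
One way to confirm the lowered TTL is to read it back from DNS. The following is a minimal sketch, assuming `dig` is installed; the function names are illustrative and not part of the existing tooling.

```shell
# Sketch only: report the TTL of the A records for a hostname.
# ASSUMPTION: `dig` (dnsutils / bind-utils) is available on the machine running this.

# Extract the TTL column (2nd field) from `dig +noall +answer` lines on stdin.
# Answer lines have the shape: <name> <ttl> IN A <address>
parse_a_record_ttls() {
  awk '$3 == "IN" && $4 == "A" { print $2 }'
}

# Query the A records for a name and print each TTL, one per line.
a_record_ttls() {
  dig +noall +answer "$1" A | parse_a_record_ttls
}

# Usage:
#   a_record_ttls www.lastingpowerofattorney.service.gov.uk
```

Note that a caching resolver reports the remaining TTL, which counts down between queries. To see the configured value, point `dig` at an authoritative nameserver with `@<nameserver>`.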

Actual migration

08:00 -- Switch to maintenance page; kill site.

  • switch DNS -- as per the support docs.
  • ensure no traffic hitting site, tail LB logs.
# on aaa-01
sudo /etc/init.d/supervisor stop
sudo killall php-fpm

08:05 -- Dump old data; switch pillar data.

  • mongodump on 2.0 master, transfer to 2.6 master
# on mongo-01 (the 2.0 master)
rm -rf dump
mongodump -u admin -p {password}
tar -zcvf dump.tgz dump/
scp dump.tgz mongodb-04.local:
  • In the meantime, update pillar services.sls mongodb.databases.{app}.servers to be 04-06 (rather than 01-03)
  • And update with the new app user passwords
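
Because the restore depends on the archive arriving intact, it may be worth checksumming `dump.tgz` on either side of the `scp`. This is a minimal sketch, not part of the original plan, assuming GNU coreutils `sha256sum` is present on both hosts; the function names are illustrative.

```shell
# Sketch only: record and verify a checksum for the dump archive.
# ASSUMPTION: GNU coreutils `sha256sum` is installed on both hosts.

# Write <archive>.sha256 next to the archive (run on the source host).
write_checksum() {
  local archive="$1"
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Verify the archive against <archive>.sha256 (run on the destination host).
# Exits non-zero if the file is missing or its contents differ.
verify_checksum() {
  local archive="$1"
  ( cd "$(dirname "$archive")" &&
    sha256sum --check --quiet "$(basename "$archive").sha256" )
}

# Usage:
#   write_checksum dump.tgz              # on the 2.0 master, before scp
#   scp dump.tgz dump.tgz.sha256 mongodb-04.local:
#   verify_checksum dump.tgz             # on mongo-04, before untarring
```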

08:10 -- Start restore

  • scp dump files onto 2.6 master; untar
rm -rf dump
tar -zxf dump.tgz
  • Recover the databases:
# NB: Restore should take ~10mins for production, much quicker on staging/scratch.
cd dump
mongo_restore_database -u admin -p {admin-password} -d . $( find . -name \*.bson | grep -v backup | grep -v Clone | grep -v system )
  • Enable the replica set secondary nodes:
sudo mongo_initiate_replica_set -u admin -p {password} -a rs0 mongodb-04.local mongodb-05.local mongodb-06.local
  • Confirm that rs0 has started up successfully:
sudo mongo -u admin -p {admin-password} --eval 'printjson(rs.status())' admin
  • Run salt highstate to add indexes and users
fab zone:{zone} workon:master update
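
The `rs.status()` confirmation above can be turned into a pass/fail check rather than read by eye. The sketch below assumes the output contains a `"stateStr"` field per member, as `rs.status()` reports; the helper names are hypothetical.

```shell
# Sketch only: decide whether rs0 looks healthy from rs.status() output.

# Count members reporting a given state (e.g. PRIMARY, SECONDARY) in the
# rs.status() text read from stdin.
count_members_in_state() {
  grep -oE "\"stateStr\"[[:space:]]*:[[:space:]]*\"$1\"" | wc -l | tr -d ' '
}

# Succeed only if there is exactly one PRIMARY and the expected number of
# SECONDARY members.
replica_set_healthy() {
  local status="$1" expected_secondaries="$2"
  [ "$(printf '%s\n' "$status" | count_members_in_state PRIMARY)" -eq 1 ] &&
  [ "$(printf '%s\n' "$status" | count_members_in_state SECONDARY)" -eq "$expected_secondaries" ]
}

# Usage (on mongo-04; password placeholder as elsewhere in this plan):
#   status="$(sudo mongo -u admin -p {admin-password} --quiet --eval 'printjson(rs.status())' admin)"
#   replica_set_healthy "$status" 2 && echo "rs0 healthy"
```

Newly added secondaries may sit in a startup or recovering state while they sync, so a failing check shortly after initiation is worth retrying before treating it as an error.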

08:25 -- Bring up application on new mongo cluster.

  • Confirm applications are pointed to mongo-04+ in pillar data (services.sls) with NEW passwords

  • Update the php configs for all our applications (by doing a 'fake deploy' of the current release)

fab zone:{zone} workon:master rsync
fab zone:{zone} workon:master set_build:BUILD-2014-11-05-release-32-1 deploy
  • Test with local hosts file override of DNS.
185.40.8.196 www.lastingpowerofattorney.service.gov.uk lastingpowerofattorney.service.gov.uk 

08:40 -- Switch DNS back

  • Confirm site ok with hosts file entry removed

  • Observe traffic:

    • 500 errors
    • 499 errors
    • logins
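
If graphite is slow to update, the error counts can also be read directly from the load balancer logs. This is a rough sketch that assumes the logs use the common/combined access log format, which this plan does not confirm; adjust the field number if the LB format differs.

```shell
# Sketch only: count responses with a given HTTP status code on stdin.
# ASSUMPTION: common/combined access log format, where the status code is
# the 9th whitespace-separated field. The real LB log format may differ.
count_status() {
  awk -v code="$1" '$9 == code { n++ } END { print n + 0 }'
}

# Usage (log path is a placeholder):
#   tail -n 10000 /path/to/lb-access.log | count_status 500
#   tail -n 10000 /path/to/lb-access.log | count_status 499
```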

Rollback plan

In the event of a serious issue discovered before the site is handed back to the users:

# Reset the pillar data back to mongodb-01..03 and the original passwords
fab zone:{zone} workon:master rsync
fab zone:{zone} workon:master set_build:BUILD-2014-11-05-release-32-1 deploy

If the rollback is required after users have been let back on, the procedure is the same, except that the data from mongo-04 will first need to be recovered onto mongo-01, after clearing the mongo-01..03 data.
