Yesterday I upgraded our running elasticsearch cluster on a site which serves a few million search requests a day, with zero downtime. I've been asked to describe the process, hence this blogpost.
To make it more complicated, the cluster was running elasticsearch version 0.17.8 (released 6 Oct 2011) and I upgraded it to the latest 0.19.10. There have been 21 releases between those two versions, with a lot of functional changes, so I needed to be ready to roll back if necessary.
elasticsearch
We run elasticsearch on two biggish boxes: 16 cores plus 32GB of RAM. All indices have 1 replica, so all data is stored on both boxes (about 45GB of data). The primary data for our main indices is also stored in our database. We have a few other indices whose data is stored only in elasticsearch, but these are updated only once a day. Finally, we store our sessions in elasticsearch, but active sessions are cached in memcached.
web servers
Our web servers sit behind a pound load balancer, which means that I can disable each web server one by one, restart it and reenable it, without affecting my users.
ElasticSearch.pm
My app uses the Perl API ElasticSearch.pm to talk to the cluster. ElasticSearch.pm, when it connects to a cluster, sniffs all running nodes and round robins between them. It refreshes its known server list every 10,000 requests. If a node disappears, it refreshes the known server list and retries the request on a live node.
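For context, this is roughly how an ElasticSearch.pm client with that behaviour is constructed - a minimal sketch, not my exact application code; the index name is made up, and `max_requests => 10_000` simply restates the default refresh interval mentioned above:

    use ElasticSearch;

    # Connect via one or more seed nodes; the client sniffs the full list of
    # live nodes in the cluster and round-robins requests between them.
    my $es = ElasticSearch->new(
        servers      => [ 'es1:9200', 'es2:9200' ],
        max_requests => 10_000,    # refresh the known-server list every 10,000 requests
    );

    # If the node handling a request has gone away, the client refreshes the
    # server list and retries the request on a node that is still up.
    my $results = $es->search(
        index => 'my_index',              # hypothetical index name
        query => { match_all => {} },
    );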
Shut down node es2
The app processes all mark es2 as disappeared, and fail over to using just es1.
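A quick way to confirm the failover state from es1's side is to ask it for cluster health - a minimal sketch; the expected output assumes the two-node, one-replica setup described above:

    use ElasticSearch;

    my $es1 = ElasticSearch->new( servers => 'es1:9200' );

    # With es2 gone, only one node is left and every replica shard is
    # unassigned, so the cluster reports "yellow" rather than "green".
    my $health = $es1->cluster_health;
    printf "status: %s, nodes: %d\n", $health->{status}, $health->{number_of_nodes};
    # expected: status: yellow, nodes: 1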
Back up the data dir on es2
Just to be sure that we can roll back if things go badly wrong.
Rename the cluster on es2
I want to start es2 without it talking to es1, so we change the cluster name (from `cluster_0178` to `cluster_01910`). This meant editing the `cluster.name` in the `config.yml` file, and renaming `data/cluster_0178` to `data/cluster_01910`.
Change the HTTP port on es2
I want to start es2 but ensure that my application doesn't try to connect to it (my app is configured to use `es1:9200` and `es2:9200`), so I added `http.port: 9250` in `config.yml`.
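Taken together, this step and the previous one amount to two changed lines in es2's `config.yml` (the data directory rename is a separate, filesystem-level change):

    cluster.name: cluster_01910
    http.port: 9250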
Clear out the transaction log
Version 0.17.8 uses Lucene 3.4. Version 0.19.10 uses Lucene 3.6. Lucene should upgrade indexes correctly, but there is a note on the 0.19 releases that we should clear the transaction log (ie flush all indices) before upgrading. So I start up es2 (which I can now do without es1 or my app trying to talk to it) and run:
curl -XPOST 'http://127.0.0.1:9200/_flush?pretty=1'
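For completeness, the same flush can also be issued through ElasticSearch.pm rather than curl - a minimal sketch; point it at whichever port es2 is actually listening on at this stage:

    use ElasticSearch;

    # es2 only - adjust the port to match the http.port setting in use
    my $es2 = ElasticSearch->new( servers => '127.0.0.1:9250' );

    # Flush every index: buffered changes are committed to disk and the
    # transaction log is cleared, ready for the Lucene upgrade.
    $es2->flush_index();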
Upgrade es2 to 0.19.10
I shut down es2, downloaded 0.19.10 and installed it (keeping the `cluster.name` and `http.port` settings from above). I restarted es2 and elasticsearch automatically upgraded my indices. es1 showed some errors about not being able to talk to es2, but that's to be expected because they are (very!) different versions. No harm done - I didn't want them to talk anyway.
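Before moving any data around, it's worth confirming that es2 really is running the new version - a small sketch, assuming ElasticSearch.pm's `current_server_version()` returns the node's version info:

    use ElasticSearch;

    my $es2 = ElasticSearch->new( servers => 'es2:9250' );

    # Should now report 0.19.10
    my $version = $es2->current_server_version;
    print "es2 is running elasticsearch $version->{number}\n";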
Copy the data from es2 to es1
When a new node connects to a cluster, it will copy the data from the running cluster. That said, we have 45GB of data that needs copying, which is a lot of IO. So I decided to copy it over myself with `rsync` so that the majority of the data would already be there when I start the new version on es1. Start `rsync`. Wait 2 hours...
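The copy itself is just an rsync of the upgraded data directory from es2 to a staging directory on es1 (it gets moved into position during the es1 upgrade step below). Something like this, with purely illustrative paths:

    # run on es1; both paths are hypothetical - adjust to the real install locations
    rsync -av es2:/opt/elasticsearch/data/cluster_01910/ /opt/elasticsearch/data-staging/cluster_01910/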
Update the data in es2
I've been running elasticsearch since version 0.04, when it was much less reliable than it is today. So I have a script which compares the data in my database to the data indexed in elasticsearch. For our main (search only) indices, I could use this to make sure that the data on es2 was up to date.
However, my sessions are only stored in elasticsearch, so I needed to pull any more recent sessions from es1 and store them in es2. This I did with a simple Perl script:
    my $es1 = ElasticSearch->new( servers => 'es1:9200' );
    my $es2 = ElasticSearch->new( servers => 'es2:9250' );

    # Pull every session modified since es2 was taken offline...
    my $source = $es1->scrolled_search(
        index       => 'session',
        search_type => 'scan',
        size        => 500,
        queryb      => {
            -filter => {
                last_modified => { 'gte' => '2012-10-13 09:00:00' }
            }
        }
    );

    # ...and write them to the new cluster on es2
    $es2->reindex( source => $source );
I could repeat the above scripts to keep es2 up to date.
Prepare my app to use es2
I now have two separate clusters: one running on es1 and one on es2. I'm keeping the data synced in both using an external process. I now need to move my app from es1 to es2. I change my app config to talk to `['es1:9200','es2:9250']`, but I don't restart the webservers just yet.
es2 has still not been used, so it has no active caches. We need to warm it up a bit first. I do this by restarting one web server, letting it run for a few requests, then disabling it. Wait a little, reenable it, etc, until es2 has successfully warmed up its caches. Now we're ready to go.
One by one I restart each webserver, until all processes are talking to both es1 and es2. At the same time, I'm running the data-sync scripts to make sure that both nodes have a similar copy of the data. (Note: the session data probably became a bit inconsistent, as I was doing one-way sync only. But because the current sessions are cached in memcached, this was not a problem).
Upgrade es1
I could now shut down es1, and let my app fail over to use es2 only. Then I upgraded es1 to 0.19.10, updated the `cluster.name` and moved the data dir that I had rsynced from es2 into position: `data/cluster_01910`.
I restarted es1; it connected to es2 and copied the changed segments across to bring the data on es1 up to date - because most of the data was already there, this process took just a few minutes. There was a bit of a slowdown as es1 warmed its caches, but not too bad. The site kept on running.
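To know when that recovery has finished, you can block on cluster health until both nodes are present and all shards are assigned - a sketch using the health API's `wait_for_*` parameters via ElasticSearch.pm:

    use ElasticSearch;

    my $es = ElasticSearch->new( servers => 'es2:9250' );   # either node will do

    # Returns once both nodes have joined and every primary and replica
    # shard is assigned (status "green"), or when the timeout expires.
    my $health = $es->cluster_health(
        wait_for_nodes  => 2,
        wait_for_status => 'green',
        timeout         => '10m',
    );
    print "cluster status: $health->{status}\n";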
Finally
I could have left es2 running on HTTP port 9250, but to put everything back to the way it was, I removed the `http.port` line from es2's `config.yml` and restarted es2. My app failed over to use just es1, then refreshed the server list and started using es2 again, but this time on port 9200.
Then I reverted my app config to use `['es1:9200','es2:9200']` and restarted each process one by one.
This is a long and careful upgrade process, but I doubt that any of my users even noticed any change.
The two most difficult parts of the upgrade are:
- keeping data in sync, and
- working around the fact that nodes from different versions can't talk to each other.
There are two changes in the works which should greatly improve the process. First is the "changes" stream, which will make keeping data in sync much easier. This change should also allow us to reindex data to an index with a different configuration, while still pulling changes from the old index, and even altering the old data on the fly before indexing to the new index.
The second is that kimchy has announced that work is being done to allow rolling upgrades on live clusters!