@dustin
Created September 8, 2010 00:16
Speeding up membase startup

Fasten Your Startups

1 Overview

Membase has two startup modes currently:

  1. Start instantly and load data in the background.
  2. Load all of the data and then start.

The downside of the first way is mass confusion: gets for items that should be there return misses, adds succeed when they shouldn’t, incrs initialize values that already exist, etc…

The downside of the second way is that all data must have been read from disk before we can start servicing requests. That could be a while.

We really want something right between #1 and #2.

2 Speeding up by Pausing

The basic concept of these speedups is to run the operations we know are safe and pause all other connections until their operations can safely complete.

2.1 Lowest Hanging Fruit

The first thing we can do is to identify which commands do not absolutely require data to be present.

For example, set (with no CAS identifier) will just overwrite whatever is currently stored. That command is always allowed.

Known commands that can execute unconditionally:

  • SET/SETQ
  • QUIT/QUITQ
  • FLUSH/FLUSHQ
  • SASL*
  • VERSION
  • STAT
  • NOOP
  • Setting vbucket state (but not deleting vbuckets)
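The list above can be sketched as a simple lookup. This is only an illustration: the names `ALWAYS_SAFE` and `can_run_during_warmup` are hypothetical, as is the convention that a CAS value of 0 means "no CAS identifier".

```python
# Hypothetical sketch: command names mirror the list above; nothing here is
# taken from the actual membase source.
ALWAYS_SAFE = frozenset({
    "SET", "SETQ",
    "QUIT", "QUITQ",
    "FLUSH", "FLUSHQ",
    "VERSION", "STAT", "NOOP",
    "SET_VBUCKET_STATE",  # setting vbucket state, but not deleting vbuckets
})


def can_run_during_warmup(command: str, cas: int = 0) -> bool:
    """Return True if the command never needs to see data loaded from disk.

    Assumption: cas == 0 means the request carried no CAS identifier.
    """
    name = command.upper()
    if name in ("SET", "SETQ") and cas != 0:
        # A set with a CAS must compare against the stored item first.
        return False
    return name in ALWAYS_SAFE or name.startswith("SASL")
```

Anything this check rejects falls through to the conditional or deferred handling described in the following sections.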

2.2 Reaching Slightly Higher

There are also commands that can conditionally execute. For example, get can work iff the data is present. It must not return a miss when we’re still loading data from disk because we don’t know whether we have data or not.

When we don’t know whether something exists, we will pause the connection and wait for the load to complete before proceeding. This means that a miss right at the start of the engine will block until the entire warmup completes (to be optimized later).

The following commands should be fairly easy to implement as conditionally executable commands during startup:

  • GET/GETQ/GETK/GETKQ
  • REPLACE/REPLACEQ
  • INCREMENT/INCREMENTQ
  • DECREMENT/DECREMENTQ
  • APPEND/APPENDQ
  • PREPEND/PREPENDQ
  • DELETE/DELETEQ
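A minimal sketch of the "do or wait" decision for get, assuming an in-memory dict of items loaded so far and a flag that flips when warmup ends. `WarmingStore` and `Outcome` are hypothetical names used only for illustration.

```python
from enum import Enum


class Outcome(Enum):
    HIT = "hit"      # data is present, answer immediately
    MISS = "miss"    # warmup finished and the key is absent
    PAUSE = "pause"  # unknown: hold the connection until warmup completes


class WarmingStore:
    """Hypothetical store that may still be loading data from disk."""

    def __init__(self):
        self.items = {}
        self.warmup_complete = False

    def get(self, key):
        if key in self.items:
            return Outcome.HIT, self.items[key]
        if self.warmup_complete:
            return Outcome.MISS, None
        # We cannot tell a real miss from "not loaded yet", so we wait.
        return Outcome.PAUSE, None
```

The other commands in the list would follow the same shape: act when the key is present, pause when its absence cannot yet be trusted.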

2.3 Then the Really Hard Ones

A few remaining commands are possible, but harder still, because they require more infrastructure than just do or wait.

Commands that would behave differently if executed after future data loads must either wait for the data to load or leave behind persistent instructions that take effect as the data arrives.

For example, FLUSH/FLUSHQ requires all data loaded in the future to be invalidated, but not data that arrives over the wire in the future. That’s a bit tricky to get going without some code changes.
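One way to picture such a persistent instruction is a flag that the loader consults for every item it reads. This is a sketch under assumptions, not how membase does it: `FlushAwareStore` and its methods are hypothetical, and the rule that disk data never overwrites a key already set over the wire is an added assumption.

```python
class FlushAwareStore:
    """Hypothetical store that remembers a flush issued during warmup."""

    def __init__(self):
        self.items = {}
        self.flushed_during_warmup = False

    def flush(self):
        self.items.clear()
        # Persistent instruction: everything still on disk predates the flush.
        self.flushed_during_warmup = True

    def set_from_wire(self, key, value):
        # Wire data arriving after the flush is kept as usual.
        self.items[key] = value

    def load_from_disk(self, key, value):
        """Return True if the loaded item was kept."""
        if self.flushed_during_warmup:
            return False  # invalidated by the earlier flush
        # Assumption: an item already set over the wire is newer than disk.
        self.items.setdefault(key, value)
        return True
```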

Similarly, ADD/ADDQ can fail fast, but can’t possibly succeed until the very end (and needs to know the difference).
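The asymmetry for add can be sketched as a three-way result. The function name `add_during_warmup` and the string results are hypothetical, chosen only to show that "exists" can be answered early while "stored" has to wait.

```python
def add_during_warmup(items, warmup_complete, key, value):
    """Hypothetical add: fail fast on a known key, otherwise wait for warmup.

    items is a dict of everything loaded or set so far.
    """
    if key in items:
        return "exists"  # safe to fail immediately
    if not warmup_complete:
        # The key may still arrive from disk, so success is not yet provable.
        return "pause"
    items[key] = value
    return "stored"
```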

2.4 The Too Hard to Bother

Tap related commands, for example, are best left to the end of the data load.

3 Starts Fast, Now Let’s Go Fast

The above cases provide a fast startup for the server, but can appear to be quite slow for individual connections.

3.1 Key Monitoring

Individual key monitoring infrastructure, which is also useful for the sync command, will allow a command to be retried the moment its key arrives from disk.
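A rough sketch of what per-key monitoring could look like: paused commands register a retry callback under their key, and the loader fires those callbacks as each key arrives. `KeyMonitor` and its method names are hypothetical.

```python
from collections import defaultdict


class KeyMonitor:
    """Hypothetical registry of paused commands, indexed by key."""

    def __init__(self):
        self._waiters = defaultdict(list)

    def wait_for(self, key, retry):
        """Register a zero-argument callback that retries a paused command."""
        self._waiters[key].append(retry)

    def key_arrived(self, key):
        """Called by the loader for each key read from disk."""
        retries = self._waiters.pop(key, [])
        for retry in retries:
            retry()
        return len(retries)

    def warmup_finished(self):
        """Release every remaining waiter; their keys never arrived."""
        remaining = [r for rs in self._waiters.values() for r in rs]
        self._waiters.clear()
        for retry in remaining:
            retry()
        return len(remaining)
```

With this in place a paused get waits only for its own key rather than for the whole warmup, and commands whose keys never show up are released at the end.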

3.2 Separate Metadata and Full Data Fills

Grabbing just the key and length should get us to a steady state very quickly, followed by a backfill of hot values (periodically snapshotted as well as primed by incoming requests).
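As an illustration of the two-phase fill, the sketch below loads keys and lengths first and values later. It assumes that once the metadata pass finishes, a missing key can be reported as a definite miss; the class `MetadataFirstStore` and the "fetch" result are hypothetical.

```python
class MetadataFirstStore:
    """Hypothetical store that loads (key, length) pairs before values."""

    def __init__(self):
        self.lengths = {}  # key -> value length, filled by the fast first pass
        self.values = {}   # key -> value, filled by the slower backfill
        self.metadata_complete = False

    def load_metadata(self, entries):
        for key, length in entries:
            self.lengths[key] = length
        self.metadata_complete = True

    def backfill(self, key, value):
        self.values[key] = value

    def get(self, key):
        if key in self.values:
            return "hit"
        if key in self.lengths:
            return "fetch"  # known to exist, value not resident yet
        # Assumption: after the metadata pass, an unknown key is a real miss.
        return "miss" if self.metadata_complete else "pause"
```

A "fetch" result is also a natural place to prime the backfill, since an incoming request marks the value as hot.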

3.3 And More!

There are likely to be more speedups at this point, but we’ll care less about them because this will be fast enough that we’ll find pain elsewhere in the system that requires our focus.
