Membase has two startup modes currently:
- Start instantly and load data in the background.
- Load all of the data and then start.
The downside of the first way is mass confusion: gets for items that should be there return misses, adds succeed when they shouldn’t, incrs initialize values that already exist, and so on.
The downside of the second way is that all data must have been read from disk before we can start servicing requests. That could be a while.
We really want something right between #1 and #2.
The basic concept of these speedups is to run the operations we know are safe and pause any other connection until its operation can be completed safely.
The first thing we can do is to identify which commands do not absolutely require data to be present.
For example, set (with no CAS identifier) will just overwrite whatever is currently stored. That command is always allowed.
Known commands that can execute unconditionally:
- SET/SETQ
- QUIT/QUITQ
- FLUSH/FLUSHQ
- SASL*
- VERSION
- STAT
- NOOP
- Setting vbucket state (but not deleting vbuckets)
There are also commands that can conditionally execute. For example, get can work iff the data is present. It must not return a miss when we’re still loading data from disk because we don’t know whether we have data or not.
When we don’t know whether something exists, we will pause the connection and wait for the load to complete before proceeding. This means that a miss right at the start of the engine will block until the entire warmup completes (to be optimized later).
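A minimal sketch of this pause-on-unknown behaviour, assuming a single in-memory table and a background loader. The class and method names here (WarmingStore, load_from_disk, finish_warmup) are made up for illustration and are not the engine’s real interface.

```python
import threading


class WarmingStore:
    """Toy store: a hit is answered at once, an unknown key waits for warmup."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()
        self._warmup_done = threading.Event()  # set once the disk load completes

    def load_from_disk(self, key, value):
        # Called by the (hypothetical) background loader for each item it reads.
        with self._lock:
            self._items.setdefault(key, value)  # never clobber a newer write

    def finish_warmup(self):
        self._warmup_done.set()

    def get(self, key, timeout=None):
        with self._lock:
            if key in self._items:
                return self._items[key]  # present: safe to answer now
        # Unknown: pause this caller until the whole warmup has finished.
        if not self._warmup_done.wait(timeout):
            raise TimeoutError("warmup still in progress")
        with self._lock:
            return self._items.get(key)  # None is now a genuine miss
```

A caller that asks for a key already in memory returns immediately; any other caller blocks in get until finish_warmup is called, which is the coarse behaviour described above.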
The following commands should be fairly easy to implement as conditionally executable commands during startup:
- GET/GETQ/GETK/GETKQ
- REPLACE/REPLACEQ
- INCREMENT/INCREMENTQ
- DECREMENT/DECREMENTQ
- APPEND/APPENDQ
- PREPEND/PREPENDQ
- DELETE/DELETEQ
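The two lists above amount to a small decision table. The following is a sketch of how a dispatcher might consult it. The names Action and decide are hypothetical, and treating a set that carries a CAS identifier as conditional is an assumption drawn from the "no CAS identifier" remark rather than something stated outright.

```python
from enum import Enum


class Action(Enum):
    EXECUTE = "execute"
    WAIT = "wait"


# Commands listed above as safe to run at any point during warmup.
UNCONDITIONAL = {
    "SET", "SETQ", "QUIT", "QUITQ", "FLUSH", "FLUSHQ",
    "VERSION", "STAT", "NOOP",
}

# Commands listed above as safe only when the key is already in memory.
CONDITIONAL = {
    "GET", "GETQ", "GETK", "GETKQ", "REPLACE", "REPLACEQ",
    "INCREMENT", "INCREMENTQ", "DECREMENT", "DECREMENTQ",
    "APPEND", "APPENDQ", "PREPEND", "PREPENDQ", "DELETE", "DELETEQ",
}


def decide(command, key_loaded, warmup_done, has_cas=False):
    """Return whether a command may run now or must wait for more data."""
    name = command.upper()
    if warmup_done:
        return Action.EXECUTE
    if name in ("SET", "SETQ") and has_cas:
        # Assumption: a CAS check needs the current item, so it is conditional.
        return Action.EXECUTE if key_loaded else Action.WAIT
    if name in UNCONDITIONAL or name.startswith("SASL"):
        return Action.EXECUTE
    if name in CONDITIONAL and key_loaded:
        return Action.EXECUTE
    return Action.WAIT  # anything else is paused until the load completes
```

Anything not in either table, such as ADD or the tap commands, falls through to WAIT in this sketch.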
There are a few remaining commands that are possible to support but harder, because they require more infrastructure than a simple do-or-wait decision.
Commands that would behave differently if executed after future data loads must either wait for data to load or have persistent instructions to take effect.
For example, FLUSH/FLUSHQ requires all future load data to be invalidated, but future wire data not to be. That’s a bit tricky to get going without some code changes.
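One way to picture the "persistent instruction" a flush would need is a flag that outlives the flush itself. This is only a sketch under that assumption. FlushAwareStore and its methods are hypothetical names, and a real implementation would also have to deal with expiry times and vbuckets.

```python
class FlushAwareStore:
    """Toy store where a flush during warmup also cancels later disk loads."""

    def __init__(self):
        self.items = {}
        self.discard_disk_loads = False  # the persistent instruction

    def set_from_wire(self, key, value):
        # Client writes are always accepted, before or after a flush.
        self.items[key] = value

    def flush(self):
        self.items.clear()
        self.discard_disk_loads = True  # everything still on disk is now stale

    def load_from_disk(self, key, value):
        """Return True if the loaded item was kept."""
        if self.discard_disk_loads:
            return False  # item predates the flush
        if key in self.items:
            return False  # a wire write already replaced it
        self.items[key] = value
        return True
```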
Similarly, ADD/ADDQ can fail fast, but can’t possibly succeed until the very end (and needs to know the difference).
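The three outcomes for an add can be sketched as below. The function name and the string results are illustrative only and do not correspond to protocol status codes.

```python
def try_add(items, key, value, warmup_done):
    """Add semantics during warmup: fail fast, wait, or store."""
    if key in items:
        return "EXISTS"  # safe to fail immediately, even mid-warmup
    if not warmup_done:
        return "WAIT"    # the key may still be sitting on disk
    items[key] = value   # only now is absence certain
    return "STORED"
```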
Tap related commands, for example, are best left to the end of the data load.
The above cases provide a fast startup for the server, but can appear to be quite slow for individual connections.
Infrastructure for monitoring individual keys, which is also useful for the sync command, will allow a command to be retried the moment its key arrives from disk.
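A rough sketch of per-key monitoring, assuming operations can be represented as callbacks that receive the value (or None for a miss). KeyWatchers and its method names are invented for this example.

```python
from collections import defaultdict


class KeyWatchers:
    """Park operations per key and rerun them when that key is loaded."""

    def __init__(self):
        self.items = {}
        self._waiting = defaultdict(list)  # key -> parked callbacks

    def run_or_park(self, key, operation):
        """Run operation(value) now if possible; otherwise park it."""
        if key in self.items:
            operation(self.items[key])
            return True
        self._waiting[key].append(operation)
        return False

    def key_arrived(self, key, value):
        # Called by the loader; wakes only the operations waiting on this key.
        self.items.setdefault(key, value)
        for operation in self._waiting.pop(key, []):
            operation(self.items[key])

    def warmup_finished(self):
        # Keys that never arrived do not exist: resolve the rest as misses.
        pending, self._waiting = self._waiting, defaultdict(list)
        for operations in pending.values():
            for operation in operations:
                operation(None)
```

With this in place a connection waits only as long as it takes for its own key to be read, instead of waiting for the entire warmup.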
Grabbing just the key and length should get us to a steady state very quickly, followed by a backfill of hot values (periodically snapshotted as well as primed by incoming requests).
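As a sketch of the key-and-length idea: once every key is known, existence checks are definitive and values can be pulled in lazily or by a backfill pass. MetadataFirstStore and the read_value callback are assumptions made for this example. How hot keys are chosen and snapshotted is left out.

```python
class MetadataFirstStore:
    """Toy store that loads (key, length) pairs first and values later."""

    def __init__(self, read_value):
        self._read_value = read_value  # hypothetical callback: key -> value from disk
        self._lengths = {}
        self._values = {}

    def load_metadata(self, entries):
        # Phase 1: cheap pass over keys and lengths only.
        for key, length in entries:
            self._lengths[key] = length

    def exists(self, key):
        # Definitive once the metadata pass has completed.
        return key in self._lengths

    def get(self, key):
        if key not in self._lengths:
            return None  # real miss, assuming the metadata pass is complete
        if key not in self._values:
            self._values[key] = self._read_value(key)  # primed by the request
        return self._values[key]

    def backfill(self, hot_keys):
        # Phase 2: preload values for keys believed to be hot.
        for key in hot_keys:
            if key in self._lengths and key not in self._values:
                self._values[key] = self._read_value(key)
```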
There are likely to be more speedups at this point, but we’ll care less about them because this will be fast enough that we’ll find pain elsewhere in the system that requires our focus.