dustin/fasten.org Secret

## fasten.org

      
    Fasten Your Startups

1 Overview

Membase has two startup modes currently:

  1. Start instantly and load data in the background.
  2. Load all of the data and then start.

The downside of the first way is mass confusion: gets for items that
  should be there return misses, adds succeed when they shouldn’t,
  incrs initialize values that already exist, and so on.
The downside of the second way is that all data must have been read
  from disk before we can start servicing requests.  That could be a
  while.
We really want something right between #1 and #2.
2 Speeding up by Pausing

The basic concept of these speedups is to run the operations we know
  are safe and pause all other connections until we can safely complete
  them.
2.1 Lowest Hanging Fruit

The first thing we can do is to identify which commands do not
  absolutely require data to be present.
For example, set (with no CAS identifier) will just overwrite
  whatever is currently stored.  That command is always allowed.
Known commands that can execute unconditionally:

  SET/SETQ
  QUIT/QUITQ
  FLUSH/FLUSHQ
  SASL*
  VERSION
  STAT
  NOOP
  Setting vbucket state (but not deleting vbuckets)
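
As a rough illustration of the list above, a warmup-time gate could look like the following Python sketch. The names `ALWAYS_ALLOWED` and `can_run_during_warmup` are hypothetical and not taken from Membase, commands are represented as plain strings, and `SET_VBUCKET_STATE` is an assumed label for "setting vbucket state". A set that carries a CAS identifier is treated as not unconditional, following the note about set above.

```python
# Hypothetical sketch: these names are illustrative, not Membase's API.
ALWAYS_ALLOWED = {
    "SET", "SETQ",
    "QUIT", "QUITQ",
    "FLUSH", "FLUSHQ",
    "VERSION", "STAT", "NOOP",
    "SET_VBUCKET_STATE",  # assumed label for "setting vbucket state"
}


def can_run_during_warmup(command, cas=0):
    """Return True if the command does not need loaded data to run."""
    name = command.upper()
    if name.startswith("SASL"):
        # Covers the whole SASL* family.
        return True
    if name in ("SET", "SETQ"):
        # Only a set without a CAS identifier blindly overwrites.
        return cas == 0
    return name in ALWAYS_ALLOWED
```

Anything this gate rejects is a candidate for pausing rather than for an error reply.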

2.2 Reaching Slightly Higher

There are also commands that can conditionally execute.  For
  example, get can work iff the data is present.  It must not return a
  miss when we’re still loading data from disk because we don’t know
  whether we have data or not.
When we don’t know whether something exists, we will pause the
  connection and wait for the load to complete before proceeding.  This
  means that a miss right at the start of the engine will block until
  the entire warmup completes (to be optimized later).
The following commands should be fairly easy to implement as
  conditionally executable commands during startup:

  GET/GETQ/GETK/GETKQ
  REPLACE/REPLACEQ
  INCREMENT/INCREMENTQ
  DECREMENT/DECREMENTQ
  APPEND/APPENDQ
  PREPEND/PREPENDQ
  DELETE/DELETEQ
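
The pause-or-answer rule for get can be sketched as below. This is a minimal Python model under stated assumptions, not the engine's code: `WarmingStore`, `Outcome`, and every method name are hypothetical, and the choice to let data already in memory win over the same key arriving from disk is an assumption made for the sketch.

```python
import enum


class Outcome(enum.Enum):
    DONE = "done"      # answered from what is already known
    PAUSED = "paused"  # connection must wait for warmup to finish


class WarmingStore:
    """Hypothetical in-memory store that is still loading from disk."""

    def __init__(self):
        self.items = {}
        self.warmup_complete = False
        self.paused = []  # (command, key) pairs waiting for warmup

    def get(self, key):
        if key in self.items:
            return Outcome.DONE, self.items[key]
        if self.warmup_complete:
            return Outcome.DONE, None  # a genuine miss
        # We cannot tell a miss from "not loaded yet", so pause.
        self.paused.append(("GET", key))
        return Outcome.PAUSED, None

    def load_from_disk(self, key, value):
        # Assumption: data already in memory wins over disk data.
        self.items.setdefault(key, value)

    def finish_warmup(self):
        """Mark warmup complete and retry every paused command."""
        self.warmup_complete = True
        retry, self.paused = self.paused, []
        return [(cmd, key, self.get(key)) for cmd, key in retry]
```

The other commands in the list would follow the same shape: act immediately when the key is present, pause when it is not and warmup is still running.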

2.3 Then the Really Hard Ones

There are a few remaining commands that are possible, but harder
  still, because they require more infrastructure than a simple
  do-or-wait decision.
Commands that would behave differently if executed after future data
  loads must either wait for data to load or have persistent
  instructions to take effect.
For example, FLUSH/FLUSHQ requires all future load data to be
  invalidated, but future wire data not to be.  That’s a bit tricky to
  get going without some code changes.
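
One way to picture the "persistent instruction" for flush is a marker that the loader consults for every item it reads. The Python sketch below is only an illustration: `FlushAwareStore` and its methods are hypothetical, and it assumes that everything the warmup reads from disk was written before the flush arrived.

```python
class FlushAwareStore:
    """Hypothetical store that remembers a flush issued during warmup."""

    def __init__(self):
        self.items = {}
        self.flushed_during_warmup = False

    def flush(self):
        self.items.clear()
        # Persistent instruction: the loader must honor this later.
        self.flushed_during_warmup = True

    def set_from_wire(self, key, value):
        # Wire data arriving after the flush is kept as usual.
        self.items[key] = value

    def load_from_disk(self, key, value):
        """Return True if the loaded item was accepted."""
        if self.flushed_during_warmup:
            # Assumption: all disk data predates the flush, so drop it.
            return False
        self.items.setdefault(key, value)
        return True
```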
Similarly, ADD/ADDQ can fail fast, but can’t possibly succeed until
  the very end (and needs to know the difference).
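
The three-way outcome for add can be sketched as follows. `AddResult` and `try_add` are hypothetical names used only for illustration, and a plain dict stands in for the item table.

```python
import enum


class AddResult(enum.Enum):
    EXISTS = "exists"  # fail fast: the key is already present
    WAIT = "wait"      # unknown until warmup completes
    STORED = "stored"  # safe to store: warmup is complete, key is absent


def try_add(items, warmup_complete, key, value):
    """Decide what an add may do given the current warmup state."""
    if key in items:
        return AddResult.EXISTS
    if not warmup_complete:
        # The key may still arrive from disk, so success is not yet provable.
        return AddResult.WAIT
    items[key] = value
    return AddResult.STORED
```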
2.4 The Too Hard to Bother

Tap related commands, for example, are best left to the end of the
  data load.
3 Starts Fast, Now Let’s Go Fast

The above cases provide a fast startup for the server, but can appear
  to be quite slow for individual connections.
3.1 Key Monitoring

Per-key monitoring infrastructure, which is also useful for the sync
  command, will allow a command to be retried the moment its key
  arrives from disk.
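
A minimal sketch of the idea, assuming a simple callback model: `KeyMonitor` and its methods are hypothetical names, not the actual monitoring infrastructure, and a "retry" is just a function called with the key.

```python
from collections import defaultdict


class KeyMonitor:
    """Hypothetical per-key waiter table used while loading from disk."""

    def __init__(self):
        self.items = {}
        self.waiters = defaultdict(list)  # key -> retry callbacks

    def wait_for(self, key, retry):
        """Register a paused command to be retried when `key` shows up."""
        self.waiters[key].append(retry)

    def key_loaded(self, key, value):
        """Called by the loader for each item read from disk."""
        self.items[key] = value
        # Wake only the commands that were waiting on this key.
        for retry in self.waiters.pop(key, []):
            retry(key)

    def warmup_done(self):
        """Retry everything still waiting; those keys were never on disk."""
        pending, self.waiters = self.waiters, defaultdict(list)
        for key, retries in pending.items():
            for retry in retries:
                retry(key)
```

With this in place, a paused get waits only as long as its own key takes to load, rather than for the entire warmup.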
3.2 Separate Metadata and Full Data Fills

Grabbing just the key and length should get us to a steady state very
  quickly, followed by a backfill of hot values (periodically
  snapshotted as well as primed by incoming requests).
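
The two-phase fill might be modeled like this. Everything here is a hypothetical sketch: `TwoPhaseStore` is an invented name, a dict stands in for on-disk storage, and the metadata pass computes lengths from the stored values only because the stand-in has no separate metadata to read.

```python
class TwoPhaseStore:
    """Hypothetical store that loads key metadata before values."""

    def __init__(self, disk):
        self.disk = disk          # dict standing in for on-disk storage
        self.meta = {}            # key -> value length
        self.values = {}          # values currently resident in memory
        self.meta_loaded = False

    def load_metadata(self):
        """Phase 1: learn which keys exist and how long their values are."""
        for key, value in self.disk.items():
            self.meta[key] = len(value)
        self.meta_loaded = True

    def backfill(self, hot_keys):
        """Phase 2: bring values for known-hot keys into memory."""
        for key in hot_keys:
            if key in self.meta and key not in self.values:
                self.values[key] = self.disk[key]

    def get(self, key):
        if not self.meta_loaded:
            raise RuntimeError("metadata still loading; caller should pause")
        if key not in self.meta:
            return None  # a definite miss once metadata is complete
        if key not in self.values:
            # An incoming request primes the value.
            self.values[key] = self.disk[key]
        return self.values[key]
```

Once the metadata pass finishes, misses and existence checks can be answered immediately, which is what makes the steady state arrive quickly.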
3.3 And More!

There are likely to be more speedups at this point, but we’ll care
  less about them because this will be fast enough that we’ll find pain
  elsewhere in the system that requires our focus.