@robwilson1
Last active November 3, 2017 14:21

Storage Engines in MongoDB

  • v3.0 - introduced pluggable storage engines (allowing switching of engines)
  • v3.2 - WiredTiger became default storage engine

The storage engine is directly responsible for the data file format and the format of indexes. The storage engine does not affect the API that the programmer uses, nor does it have anything to do with inter-server communication in replica sets.

MMAPv1

https://docs.mongodb.com/manual/core/mmapv1/#

MMAPv1 memory-maps its data files. When your app needs to access data, the engine checks whether that data is already in memory. If it is, the data is returned from memory; otherwise it is fetched from disk, placed into memory, and then returned to the app.
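The read path described above can be illustrated with Python's standard `mmap` module. This is only a rough sketch of memory-mapped file access in general, not MongoDB's actual code; the file contents, the offsets, and the `read_via_mmap` helper are hypothetical.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset` from a memory-mapped file."""
    with open(path, "rb") as f:
        # Map the whole file read-only; the OS pages data in from disk on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing touches the pages: pages already resident are served from
            # memory, missing ones cause a page fault and a read from disk.
            return mapped[offset:offset + length]


# Hypothetical data file standing in for a database data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header--" + b"document-bytes")
    data_path = tmp.name

record = read_via_mmap(data_path, 8, 14)
os.remove(data_path)
```

The caller never decides whether the bytes come from memory or disk; that choice is left to the operating system's virtual memory manager.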

MMAPv1 offers collection-level concurrency, also known as collection-level locking. If two operations run at the same time on the same collection and at least one of them is a write, one has to wait for the other to finish. MMAPv1 is multiple-reader, single-writer. Note that if you look in /data/db, MMAPv1 data files are laid out per database rather than per collection, so the lock is not tied to a file per collection.

MMAPv1 can perform in-place updates on documents in memory. To make this more likely, instead of having to move a grown document to an area with more space, MMAPv1 uses 'power of two' allocations, where a document's record is padded up to the next power of two. E.g. a 34 byte document will be padded to 64 bytes and a 12KB document will be padded to 16KB. https://docs.mongodb.com/manual/core/mmapv1/#power-of-2-allocation
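The padding arithmetic can be sketched in a few lines. This is a simplified illustration of rounding up to a power of two; it ignores details of the real engine such as record header overhead, the minimum record size, and the special handling of very large documents. The function name is hypothetical.

```python
def power_of_two_allocation(size_in_bytes):
    """Round a record size up to the next power of two (simplified)."""
    if size_in_bytes < 1:
        raise ValueError("size must be a positive number of bytes")
    # (n - 1).bit_length() is the exponent of the smallest power of two >= n.
    return 1 << (size_in_bytes - 1).bit_length()


small = power_of_two_allocation(34)         # the 34 byte example from the text
large = power_of_two_allocation(12 * 1024)  # the 12KB example from the text
padding = small - 34                        # spare bytes available for in-place growth
```

The difference between the allocation and the document size is the room the document has to grow before it must be moved.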

MMAPv1 uses journaling to reduce load on the disk: it records which operations occurred in a special journal file, so the data files do not have to be flushed on every write and recent writes can still be recovered after a crash. By default the journal is written to disk every 100ms and the documents themselves are written to disk every 60 seconds. https://docs.mongodb.com/manual/core/journaling/#journaling-and-the-mmapv1-storage-engine
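The idea of a journal that is persisted often, alongside data files that are persisted rarely, can be modelled with a toy class. This is a sketch under heavy assumptions: the `JournaledStore` class is hypothetical, it treats every journal append as already durable, and it has no timers, so the 100ms and 60 second intervals appear only in comments.

```python
class JournaledStore:
    """Toy model of journaling: frequent journal writes, infrequent data flushes."""

    def __init__(self):
        self.memory = {}     # in-memory view of the documents
        self.journal = []    # operations, persisted frequently (e.g. every 100ms)
        self.data_file = {}  # documents, persisted rarely (e.g. every 60 seconds)

    def write(self, key, value):
        # Apply the write in memory and record the operation in the journal.
        self.memory[key] = value
        self.journal.append((key, value))

    def flush_data_file(self):
        # Persist the documents; journal entries up to this point are now redundant.
        self.data_file = dict(self.memory)
        self.journal.clear()

    def recover(self):
        """Rebuild state after a crash: last data file plus a replay of the journal."""
        recovered = dict(self.data_file)
        for key, value in self.journal:
            recovered[key] = value
        return recovered


store = JournaledStore()
store.write("a", 1)
store.flush_data_file()
store.write("b", 2)
store.write("a", 3)
store.memory.clear()          # simulate a crash that loses the in-memory state
recovered = store.recover()   # writes made after the last flush are replayed
```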

WiredTiger

https://docs.mongodb.com/manual/core/wiredtiger/

WiredTiger offers document-level concurrency. Unlike MMAPv1, writers do not take a lock up front; instead it is assumed that two concurrent writes will not be applied to the same document (optimistic concurrency control). If two writes do hit the same document, one of them is unwound and has to be retried.
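The "assume no conflict, unwind and retry if there is one" approach can be sketched with a version check at commit time. This is a toy model of optimistic concurrency control in general, not WiredTiger's internals; `VersionedStore`, `WriteConflict`, and `update_with_retry` are hypothetical names.

```python
class WriteConflict(Exception):
    """Raised when a document changed between the read and the commit."""


class VersionedStore:
    def __init__(self):
        self.docs = {}  # doc_id -> (version, value)

    def read(self, doc_id):
        # Unknown documents are reported as version 0 with no value.
        return self.docs.get(doc_id, (0, None))

    def commit(self, doc_id, expected_version, new_value):
        current_version, _ = self.read(doc_id)
        if current_version != expected_version:
            # Another writer got there first: reject this write.
            raise WriteConflict(doc_id)
        self.docs[doc_id] = (current_version + 1, new_value)


def update_with_retry(store, doc_id, update_fn, max_attempts=5):
    """Re-read and retry the update until it commits; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        version, value = store.read(doc_id)
        try:
            store.commit(doc_id, version, update_fn(value))
            return attempt
        except WriteConflict:
            continue
    raise WriteConflict(doc_id)


store = VersionedStore()
store.commit("doc", 0, 1)
stale_version, _ = store.read("doc")
store.commit("doc", stale_version, 2)      # a competing writer commits first
try:
    store.commit("doc", stale_version, 3)  # the stale write is rejected
    conflict_detected = False
except WriteConflict:
    conflict_detected = True
attempts = update_with_retry(store, "doc", lambda value: value + 1)
```

Writes to different documents never touch the same version counter, which is why they can proceed without waiting on each other.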

WiredTiger offers compression of both the data and the indexes. The data is not compressed in memory, for performance reasons, but it is compressed before being stored on disk.
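The "uncompressed in memory, compressed on disk" split can be sketched with Python's standard `zlib` module. This is only an illustration: the `to_disk` and `from_disk` helpers and the sample document are hypothetical, and WiredTiger's default block compressor for collections is snappy, with zlib available as an option.

```python
import zlib


def to_disk(document_bytes):
    """Compress a block just before it would be written to disk."""
    return zlib.compress(document_bytes)


def from_disk(block):
    """Decompress a block read from disk back into its in-memory form."""
    return zlib.decompress(block)


# Hypothetical, highly repetitive data, which compresses well.
in_memory = b'{"name": "example", "tags": ["a", "b"]}' * 100
on_disk = to_disk(in_memory)
restored = from_disk(on_disk)
```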

WiredTiger is an append-only storage engine, so unlike MMAPv1 it does not offer in-place updates: an updated document is written out as a new version rather than overwritten. This is what makes the lock-free environment possible, and it typically makes WiredTiger faster than MMAPv1.

WiredTiger also uses journaling, in combination with checkpoints. More detail is available here: https://docs.mongodb.com/manual/core/journaling/#journaling-and-the-wiredtiger-storage-engine
