PouchDB 3.0.0 will be up to 77% faster than 2.2.3 for secondary index creation. Gains were made across all adapters, with the most improved being IndexedDB in Chrome.
Since the release of PouchDB 2.2.0, a common gripe from developers has been that view creation is too slow. This is an unfortunate side effect of the fact that we've reimplemented CouchDB map/reduce in a cross-platform way, without taking advantage of the native secondary indexes in either IndexedDB or WebSQL. (Secondary indexes in LevelDB would look pretty much the same as they do now, since it's just a key-value store.)
That doesn't mean we can't make the current algorithm faster, though. Through performance profiling and pure gumption, we've managed to make some pretty serious gains in all three adapters.
All tests were performed on a 2013 MacBook Air running OS X 10.9.3 Mavericks. For Safari, databases were pre-emptively initialized with 500MB of storage ({size: 500}) to avoid storage popups that would otherwise throw off the measurements.
First off, we made a big change to the underlying data model: _local documents are now stored in a separate store (a.k.a. sublevel/table/objectStore). This obviates the need to skip them during allDocs() queries and also has the added benefit of fixing some bugs in replicate()/changes() where the database's update_seq would be advanced too much relative to the seq of the most recently posted document revision (since _local docs would also cause it to increment).
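To make the effect of the separate store concrete, here is a minimal in-memory sketch. It is a toy model, not PouchDB's actual storage code: the ToyDB class and its docStore, localStore, and updateSeq fields are hypothetical names chosen for illustration. It assumes only what is described above, namely that _local docs go to their own store and no longer advance the sequence counter.

```javascript
// Toy in-memory model (NOT PouchDB's real implementation).
// Regular docs and _local docs are kept in separate stores.
class ToyDB {
  constructor() {
    this.docStore = new Map();   // regular documents
    this.localStore = new Map(); // _local documents, stored separately
    this.updateSeq = 0;          // only advanced by regular documents
  }

  put(doc) {
    if (doc._id.startsWith('_local/')) {
      // _local docs never touch the main store or the sequence counter.
      this.localStore.set(doc._id, doc);
      return { id: doc._id, seq: null };
    }
    this.updateSeq += 1;
    this.docStore.set(doc._id, { doc, seq: this.updateSeq });
    return { id: doc._id, seq: this.updateSeq };
  }

  allDocs() {
    // No filtering step needed: the main store only holds regular docs.
    return [...this.docStore.keys()].sort();
  }
}

const db = new ToyDB();
db.put({ _id: 'a' });
db.put({ _id: '_local/checkpoint' }); // does not bump updateSeq
const last = db.put({ _id: 'b' });
console.log(db.updateSeq, last.seq, db.allDocs());
```

In this model the database's sequence counter always matches the seq of the most recently written regular revision, which is the invariant the replicate()/changes() fixes rely on.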
Since _local docs are heavily used in persisted map/reduce, though, this also prompted the following improvements in the temp-views test, which is a nice test because it tests many things at once (bulkDocs/allDocs/mapreduce) and because it's slow enough that we get a nice granularity when making comparisons.
Node (LevelDB) - tested 3 times
Before: 91751ms, 85674ms, 83599ms
After: 65897ms, 65397ms, 74604ms
Chrome (IndexedDB): 613964ms -> 597363ms
Safari 7: 186786ms -> 124856ms
This amounts to a 21.11% improvement in LevelDB, 2.7% in Chrome IndexedDB, and 33.15% in Safari WebSQL.
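For readers who want to check the arithmetic, the percentages can be reproduced from the raw timings with a few lines of JavaScript. This sketch assumes the LevelDB figure compares the mean of the three "before" runs against the mean of the three "after" runs; the function names are hypothetical. Depending on whether the final digit is rounded or truncated, results may differ from the quoted figures by 0.01.

```javascript
// Mean of an array of timings (in ms).
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Percentage reduction in mean run time from `before` to `after`.
function percentImprovement(before, after) {
  const b = mean(before);
  const a = mean(after);
  return ((b - a) / b) * 100;
}

const leveldb = percentImprovement([91751, 85674, 83599], [65897, 65397, 74604]);
const chrome = percentImprovement([613964], [597363]);
const safari = percentImprovement([186786], [124856]);

console.log(leveldb.toFixed(2), chrome.toFixed(2), safari.toFixed(2));
```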
Next up, the following changes were introduced:
- A single transaction is used per batch update in map/reduce (allDocs() queries with keys), in both WebSQL and IndexedDB. (pouchdb#2394)
- During batch updates in map/reduce, nothing is fetched from the main store if we're putting a new document (i.e. metaDoc.keys is empty). (mapreduce#194)
- During put()/bulkDocs() operations, document revisions are optimistically inserted and then only updated if the database signals a constraint violation. (This is an extremely rare edge case that was prompting us, before, to always fetch the document revision in advance in order to check.) (pouchdb#2391 and pouchdb#2392)
- Rather than storing the entire key/value/id as an object in the database, we only store the value if it's specified, and otherwise we store an empty object, since we can just parse the indexableString when we pull the document out in order to get the original doc ID and key. This has the added benefit of using up less space on disk. (mapreduce#191)
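The optimistic-insert idea from the third item can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull requests: ToyRevStore is a hypothetical stand-in for a WebSQL table or IndexedDB object store with a uniqueness constraint on (id, rev), and writeRevision is an invented helper name.

```javascript
// Hypothetical store enforcing uniqueness on (id, rev).
class ToyRevStore {
  constructor() {
    this.rows = new Map();
  }

  insert(id, rev, body) {
    const key = id + '::' + rev;
    if (this.rows.has(key)) {
      // Simulate the database signalling a constraint violation.
      const err = new Error('constraint violation');
      err.name = 'ConstraintError';
      throw err;
    }
    this.rows.set(key, body);
  }

  update(id, rev, body) {
    this.rows.set(id + '::' + rev, body);
  }
}

// Insert first; fall back to an update only if the constraint fires.
function writeRevision(store, id, rev, body) {
  try {
    store.insert(id, rev, body); // common case: one write, no prior read
    return 'inserted';
  } catch (err) {
    if (err.name !== 'ConstraintError') throw err; // unrelated failure
    store.update(id, rev, body); // rare case: revision already existed
    return 'updated';
  }
}

const store = new ToyRevStore();
const first = writeRevision(store, 'doc1', '1-abc', { n: 1 });
const second = writeRevision(store, 'doc1', '1-abc', { n: 2 });
console.log(first, second);
```

The point of the pattern is that the common path costs a single write, whereas checking in advance costs a read plus a write for every revision.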
Many outstanding pull requests were combined together in order to test these changes. The branches compared are here: before (this, this, and this) and after (this, this, and this).
Results are as follows:
Node (LevelDB): 65742ms -> 63161ms (4% improvement)
Chrome 35 (IndexedDB): 863745ms -> 198207ms (77% improvement)
Safari 7 (WebSQL): 119757ms -> 91749ms (23.3% improvement)
Firefox 30: 687842ms -> 541505ms (21.3% improvement)
Clearly IndexedDB on Chrome is most improved, which is nice given that Chrome didn't benefit much from the previous changes. Firefox was not tested before, but it's definitely getting faster with the second set of changes.
We're getting there. Once we've picked off low-hanging fruit like this, the remaining big gains will probably only come from something like native secondary indexes.