sagivf/RethinkDB count issue and solutions.md

## RethinkDB count issue and solutions.md

      
Problem

count() is O(n): it scans every document in the table.
This can send a new developer running for the hills, because counting looks like a trivial problem. It is not.
While we hope this gets addressed in the future (even in a non-ideal way), there are workarounds.
Relevant Issues:

rethinkdb/rethinkdb#5894
rethinkdb/rethinkdb#2411
rethinkdb/rethinkdb#3949
rethinkdb/rethinkdb#3384
rethinkdb/rethinkdb#1271

Solutions


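Use the table's info command if an estimate is enough. doc_count_estimates holds one estimate per shard, so sum the array (on a single-shard table this is the same as .nth(0)):

r.db('DB').table('TABLE').info()('doc_count_estimates').sum()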


Upgrade your cluster: a sharded cluster with strong servers (SSDs, more memory, etc.) helps a lot.
You can also increase --cache-size (given in MB).


Add a table that saves your counts. You can:


increment the count on every insert (and decrement it on every delete)
use a changefeed, preferably with squash
just save the count() result now and again.
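The changefeed variant can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `countDelta` and `trackCount` are hypothetical helper names, `r` is assumed to be the official `rethinkdb` JavaScript driver module, `conn` an open connection, and `counts` an assumed table holding one document per tracked table, shaped like `{ id: 'TABLE', count: 0 }`.

```javascript
// Net change in document count implied by one changefeed notification.
// Inserts arrive with old_val === null, deletes with new_val === null.
function countDelta(change) {
  const hadDoc = change.old_val != null;
  const hasDoc = change.new_val != null;
  if (!hadDoc && hasDoc) return 1;   // insert
  if (hadDoc && !hasDoc) return -1;  // delete
  return 0;                          // update, or nothing to count
}

// Hypothetical wiring: follow a table's changefeed and keep a counter
// document in the assumed 'counts' table up to date.
// The counter document must already exist, otherwise update() skips it.
function trackCount(r, conn, tableName) {
  return r.table(tableName)
    .changes({ squash: true })
    .run(conn)
    .then((cursor) => {
      cursor.each((err, change) => {
        if (err) {
          console.error(err);
          return;
        }
        const delta = countDelta(change);
        if (delta === 0) return;
        r.table('counts')
          .get(tableName)
          .update({ count: r.row('count').add(delta) })
          .run(conn)
          .catch(console.error);
      });
    });
}
```

Reading the count is then a primary-key lookup, `r.table('counts').get('TABLE')('count')`, instead of a full scan. Seeding the counter once with a real `count()` before starting the feed is assumed.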


Add a "position/i/inserted" field to the table and maintain its latest value in memory on inserts. That way the last record, sorted by an index on that field, has the count as its "position/i/inserted" property. This only holds as long as documents are never deleted.
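A minimal sketch of the position-field idea, assuming a single writer process, no deletes, and a secondary index named `position` (created beforehand with `indexCreate('position')`). `PositionStamper` and `countQuery` are hypothetical names, and `r` is assumed to be the `rethinkdb` JavaScript driver module.

```javascript
// Keeps the last assigned position in memory and stamps each new document.
// Assumption: only one process inserts, and documents are never deleted.
class PositionStamper {
  constructor(lastPosition = 0) {
    this.last = lastPosition; // seed with the current count at startup
  }

  // Returns a copy of `doc` carrying the next position.
  stamp(doc) {
    this.last += 1;
    return Object.assign({}, doc, { position: this.last });
  }
}

// Hypothetical read side: the highest position in the index is the count.
// default(0) covers the empty-table case, where max() raises an error.
function countQuery(r, tableName) {
  return r.table(tableName).max({ index: 'position' })('position').default(0);
}
```

Usage would look like `r.table('TABLE').insert(stamper.stamp(doc)).run(conn)`, with `countQuery(r, 'TABLE').run(conn)` replacing `count()`. Because the lookup goes through the index, it avoids scanning the table.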

Comment

To the best of my knowledge, if the bulk of your work is processing and analyzing tables with millions of rows, RethinkDB is probably not your best solution. You could also combine it with another DB.