@mblair
Last active September 2, 2015 22:25
An Operator's Look at RethinkDB

Matt Blair - @mattyblair - Flipboard

The Getting Started Experience

  • It's in Homebrew and just runs (relative to, say, HBase): awesome!
  • Having packages for popular Linux distros (I use Ubuntu personally and at work): awesome!
  • Serving the GPG key over HTTP: not so awesome :-/
  • apt-get -y install rethinkdb
  • It didn't start by default: great! I hate when services start themselves on install. I then looked at the start script to see how it knew (usually daemons use /etc/default/blah, but it has custom logic to see if there's anything in /etc/rethinkdb/instances.d, which is cool).
  • I then wondered if it would restart upon upgrade, so...
  • I looked at the default conf; it's just comments, sweet that it starts without having to configure it!
  • Also, it's SHORT. I'm sure it'll grow over time, but compared to, say, Cassandra (~900 lines, including comments and blanks), this is awesome!
  • It's ini-style; does the last value win for duplicate keys? It's often handy to just append and keep it moving, instead of having to write sed invocations.
  • cd /etc/rethinkdb && cp default.conf.sample instances.d/default.conf
  • sudo service rethinkdb start
  • Oh cool, it mentioned where it's listening.
  • echo "bind=all" >> /etc/rethinkdb/instances.d/default.conf
  • apt-get -y install rethinkdb=2.0.1
  • Couldn't be found :-(
  • apt-cache show rethinkdb | head -n25
  • Ah, the version string needs the ~0trusty suffix.
  • apt-get -y --force-yes install rethinkdb=2.0.1~0trusty
  • Oh, it does restart upon upgrades :-( That's less than ideal, because I often stage upgrades by installing binaries everywhere, then doing rolling restarts.
  • The source compilation steps say to run ./configure --allow-fetch; what is it fetching? Some shops don't allow machines with compilers to access the internet, and vice versa. It'd be good to enumerate which dependencies can be staged ahead of time.
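
The version hunt above (plain 2.0.1 not found, then spotting the ~0trusty suffix in apt-cache show) can be scripted. This is only a sketch: match_versions is a made-up helper name, it does a simple prefix match on the Version: field, and the 1.16.3 entry in the canned sample is invented so the snippet runs without apt.

```shell
#!/bin/sh
# Print every full package version from `apt-cache show`-style output on stdin
# whose version string starts with the given prefix.
# Note: this is a plain prefix match, so 2.0.1 would also match 2.0.10.
match_versions() {
    prefix="$1"
    awk -v p="$prefix" '$1 == "Version:" && index($2, p) == 1 { print $2 }'
}

# On a real machine (needs apt):
#   apt-cache show rethinkdb | match_versions 2.0.1
# Canned sample so the sketch runs anywhere; the second entry is hypothetical.
printf 'Package: rethinkdb\nVersion: 2.0.1~0trusty\n\nPackage: rethinkdb\nVersion: 1.16.3~0trusty\n' \
    | match_versions 2.0.1
```

The printed string is what goes after the = in apt-get install rethinkdb=...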

Ops Documentation Questions

  • The memory usage page talks about expecting "each query and background process to use 1-20MB of memory"; what about connections? MySQL has memory overhead for each open connection, which can become a bottleneck. Is this an issue for RethinkDB?
  • Hot backup: "it will use some cluster resources, but it will not lock out any of the clients, so you can safely run it on a live cluster." A little more detail would be great here; HBase 1.1 has added request throttling, so folks doing scripted backups or analytical queries (for example) can throttle by table or user. I don't think request throttling is needed, but some more information about how the hot backup job affects interactive queries would be great.
  • Monitoring: system info is just a RethinkDB table, no reporters :-(
    • This matters for integration into existing tools (Nagios, OpenTSDB, Graphite, Riemann...): to write an integration, you need a RethinkDB client.
    • JSON over HTTP would be cool!
    • What to monitor? If I wanted to throw together a RethinkDB dashboard, what metrics would I choose? Riak has a 'Riak Metrics to Graph' section of their docs, for example.
    • Latency numbers, not just throughput! (Riak has median/95/99/max)
  • Version migration: can it be done in a rolling fashion? The docs don't say either way.
  • systemd support appears to be in progress. Ubuntu 16.04 will (probably) use it by default, so having an answer here would be cool.
  • Log syntax docs?
  • Log rotation: is it built in? (No is just fine; ops people know how to use logrotate.)
  • Log levels? Are they configurable? Without restarting?
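
To make the latency ask concrete: the summary I'd want on a dashboard is the same median/95/99/max that Riak reports. Below is a minimal sketch of that calculation using the nearest-rank method over one sample per line. The percentile helper and the sample numbers are made up for illustration; nothing here reads from RethinkDB.

```shell
#!/bin/sh
# Nearest-rank percentile: reads one numeric sample per line on stdin,
# prints the value at the given integer percentile (1-100).
percentile() {
    pct="$1"
    sort -n | awk -v p="$pct" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            rank = int((p * NR + 99) / 100)   # ceil(p/100 * NR) for integer p
            if (rank < 1) rank = 1
            print v[rank]
        }'
}

# Hypothetical latency samples in milliseconds.
samples='12 3 7 5 40 9 8 6 4 11'
for p in 50 95 99 100; do
    # $samples is intentionally unquoted so each number lands on its own line.
    printf 'p%s=%s\n' "$p" "$(printf '%s\n' $samples | percentile "$p")"
done
```

p100 is the max. With only ten samples the 95th and 99th percentiles collapse onto the largest value, which is a decent reminder of why throughput and an average alone hide the slow requests.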

Clustering

  • I haven't tested it because I'm waiting for the Raft stuff to land in 2.1 to really give it a workout.

Other random thoughts

  • 'Writing RethinkDB drivers'- very well written...now I might have to write one :-)
  • Mike mentioned deletes during his talk- are they soft deletes, or real deletes? Since it's log-structured, I'm guessing they're soft and compaction needs to occur for the space to be freed?
  • Is there a preferred filesystem? EXT4/XFS/ZFS?
  • What do latency and throughput look like when the data size is larger than the amount of memory you've allocated to RethinkDB? I get that it's not a memory-first DB like MongoDB, but a quick answer here would alleviate the inevitable "is this thing web scale?" questions.
  • Official benchmarks, for both bare metal and AWS. I know that Scientific Benchmarking™ is hard, so even just code users can run themselves to do their own stress tests would be great.
  • AWS instance recommendations! Surely it'll run well on i2s, but what about r3s? c3s? EBS with PIOPS?
  • Spreading over multiple disk volumes? "Note: it is possible to attach more specialized EBS volumes and have RethinkDB store your data on them, but this option is not yet available out of the box." (http://rethinkdb.com/docs/paas/) Cassandra has a JBOD mode where they'll spread the data across volumes; striping (or RAID10) is fine for now.
  • For backup/restore, if the restored node has a different FQDN, do you need to mark the old node as "down" before bringing the new one up? Does DNS matter at all (please say no)?
  • What happens when disks die, or fill up? Can reads still occur when disks are full? Cassandra has a blog post about this (http://www.datastax.com/dev/blog/handling-disk-failures-in-cassandra-1-2), some answers here would be great.
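
Until the full-disk behavior is documented, the defensive move is to alert well before the data volume fills. A rough sketch using POSIX df -P follows; the helper names and the 90% threshold are my own picks, and DATA_DIR defaults to / only so the snippet runs anywhere. Point it at wherever your RethinkDB data directory actually lives.

```shell
#!/bin/sh
# Print the used-space percentage (integer, no % sign) of the filesystem
# that holds the given path, using the portable `df -P` output format.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) when usage is at or above the threshold percentage.
disk_over_threshold() {
    [ "$(disk_used_pct "$1")" -ge "$2" ]
}

DATA_DIR="${DATA_DIR:-/}"   # assumption: set this to the real data directory
THRESHOLD=90                # arbitrary example threshold
if disk_over_threshold "$DATA_DIR" "$THRESHOLD"; then
    echo "WARN: $DATA_DIR is at or above ${THRESHOLD}% full"
else
    echo "OK: $DATA_DIR is below ${THRESHOLD}% full"
fi
```

This is easy to wire into Nagios or cron, and it doesn't need a RethinkDB client.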