@timblair
Created November 23, 2012 12:56
Notes from All Your Base 2012, 2012-11-23

AYB12

Alvin Richards: MongoDB

  • Trade-off: scale vs. functionality. MongoDB tries to have good functionality and good scalability.
  • Auto-sharding to maintain equilibrium between shards
  • Scalable datastore != scalable application: use of datastore may still be non-scalable (e.g. many queries across all shards)
  • Get low latency by ensuring shard data is always in memory: datastore then becomes a cache with persistence
  • Replica sets: auto-election of new primary node on failure, plus automatic recovery once failed node is back online
  • Async replication between nodes in a replica set (eventual consistency)
  • Auto TTL for messages, and can update on read operations
  • Tunable data consistency before a write is "complete", from "none" (fire and forget: assume it will get there eventually) to "full" (includes remote replication to other geographies)
  • Data model of an RDBMS enforces the relational model, which can limit the ability to scale that system. Data locality ("which server is my record on?") becomes an issue
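
The auto-sharding point above can be sketched as range-based chunk routing: each chunk covers a range of shard-key values and lives on one shard. The chunk boundaries and shard names below are invented for illustration (MongoDB's balancer would normally split and migrate chunks to keep shards in equilibrium):

```python
import bisect

# Hypothetical chunk layout: (upper-bound-exclusive, shard name) pairs,
# sorted by upper bound. Not MongoDB's actual metadata format.
CHUNKS = [
    ("g", "shard0"),       # keys < "g"
    ("n", "shard1"),       # "g" <= keys < "n"
    ("\uffff", "shard2"),  # everything else
]

def route(shard_key):
    """Return the shard whose chunk range contains shard_key."""
    bounds = [upper for upper, _ in CHUNKS]
    return CHUNKS[bisect.bisect_right(bounds, shard_key)][1]
```

This is also why "scalable datastore != scalable application": a query that cannot be routed by shard key has to fan out to every shard.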

Luca Garulli: OrientDB

  • Biggest issue with switching from RDBMS: what about the data model?
  • KV, column-based, document DBs ... and graph DBs
  • Property graph model: vertices and edges can have properties, edges are directional, edges connect vertices, vertices can have one or more incoming + outgoing edges
  • In RDBMS, every time you traverse a relationship, you perform an expensive JOIN. Indexes can speed up reads, but slow down writes
  • Index lookups are generally based on balanced trees. More entries == more lookup steps == slower JOIN
  • "A graph DB is any storage system that provides index-free adjacency"
  • A graph DB treats relationships as physical links assigned to the record when the edge is created; RDBMS computes the same relationship every time you perform a JOIN
  • Lookup time moves from O(log N) to new O(1), and does not increase with DB size
  • NuvolaBase.com: REST-based graph DB service
  • Difficult to create distributed graph DBs. Scaling is basically a case of using client-side hashing.
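
The "index-free adjacency" idea above can be shown in a few lines: the edge is stored as a direct reference on the vertex when it is created, so traversal is a pointer dereference rather than a B-tree lookup. Class and attribute names here are invented for illustration:

```python
# Minimal model of index-free adjacency: relationships are materialised
# once, at write time, instead of being recomputed by a JOIN on every read.
class Vertex:
    def __init__(self, name):
        self.name = name
        self.out = []  # outgoing edges: (label, target vertex)

    def link(self, label, other):
        # store the edge as a physical link on the record
        self.out.append((label, other))

    def neighbours(self):
        # O(1) per edge regardless of database size: no index lookup
        return [v for _, v in self.out]

alice, bob = Vertex("alice"), Vertex("bob")
alice.link("knows", bob)
```

This is the move from O(log N) index lookups to O(1) traversal mentioned above: the cost no longer depends on the number of records in the database.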

Dale Harvey: PouchDB

  • CouchDB for JavaScript environments, mainly for browsers (but also works in Node.js)
  • Multi-master replication, supports disconnected sync
  • "Ground computing" -- like cloud computing, but provides offline behaviour with on-demand sync
  • Designed for building applications that need to work well offline, and that need to sync data
  • Would simplify something like multi-app SimpleNote-type system?
  • Offline is a fact: the more mobile devices, the more people are offline. No reception data limits, slow / unstable connections etc
  • Sync is hard: the Things app took 2 years to develop sync
  • Bad connections + retries, transfer overhead and moving deltas (mobile access might not want total sync), master-master scenarios, conflict resolution
  • [CP]ouchDB has good, simple conflict resolution, but sometimes you need to tell it what to resolve (based on your app usage)
  • Requires CouchDB on the server for sync
  • Safari + Opera support in progress, so not production-ready yet
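
The "good, simple conflict resolution" above rests on CouchDB's deterministic-winner rule: revisions look like "N-hash", the longest edit history wins, and ties are broken by comparing the hash, so every replica picks the same winner without coordination. A rough sketch of that rule (a simplification, not PouchDB's actual implementation):

```python
# Pick the winning revision among conflicting siblings, CouchDB-style.
# "2-def" beats "2-abc" (same history length, higher hash), and both
# beat "1-zzz" (shorter history). Every replica computes the same answer.
def winning_rev(revs):
    def key(rev):
        n, _, h = rev.partition("-")
        return (int(n), h)
    return max(revs, key=key)
```

The losing revisions are kept, which is why you can still "tell it what to resolve" based on your app's usage.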

Matt Heitzenroder: Eventual Consistency

  • Brewer's Conjecture (2000): CAP -- you can only have two
  • "Life is full of tradeoffs" as is engineering
  • Amazon's Dynamo paper: tradeoff between C & A -- they chose A
  • Financial systems already dealing with eventual consistency: trading banks closing and reconciling, network partitions between cash point and centralised bank etc
  • Riak uses vnodes in a ring topology (ketama-style)
  • Writes go to hashed node + the next two (i.e. three copies on separate nodes)
  • Read Repair: handle out-of-date copies of data on vnodes automatically on read, and update out-of-date nodes to logical descendants (e.g. v1 -> v2)
  • Read Repair etc. means internally three objects are requested and checked for consistency. This can be tuned via quorum, single-read for speed etc.
  • There can be divergent object versions, a.k.a. siblings: after a network partition, two operations can have altered object state at the same time. Riak returns both versions
  • Per-application, can define a "conflict resolver": as part of the Riak client to define how to handle sibling resolution
  • Common use-cases are: pick one based on some property, or perform a set union of the data
  • Probabilistically Bounded Staleness
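
The ring topology and replica placement above can be sketched as a toy consistent-hash ring: each node owns a position on the ring, a key hashes to a point, and the write goes to the owning node plus the next two, giving three replicas. Real Riak uses many vnodes per physical node; one position per node keeps this short, and all names are invented:

```python
import hashlib
from bisect import bisect

class Ring:
    """Toy ketama-style ring: a key's preference list is the node at the
    hashed position plus the next n_replicas - 1 positions clockwise."""
    def __init__(self, nodes, n_replicas=3):
        self.n_replicas = n_replicas
        self.ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def preference_list(self, key):
        idx = bisect(self.ring, (self._hash(key),))
        return [self.ring[(idx + i) % len(self.ring)][1]
                for i in range(self.n_replicas)]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
replicas = ring.preference_list("user:42")
```

The quorum tuning mentioned above then falls out naturally: a read can wait for one, a quorum, or all of these replicas to respond.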

Monty Widenius: MySQL-MariaDB

  • MySQL named after Monty's daughter, My (MaxDB released later, named after his son, Max)
  • Original MySQL devs started focussing on MariaDB in 2009 with the impending purchase of Sun by Oracle
  • Chose to use dual-license to be able to work full-time on MySQL: took 2 months to become profitable
  • Don't go to investors when you need their money. Wait for them to come to you when you don't need their money, and you won't have to give up so much of your company
  • Monty Program Ab: new company (using Hacker Business Model) to focus on MariaDB, with most of the original MySQL developers
  • Aim to keep MySQL dev talent together, always have an open-source version of MySQL. More important after Oracle purchase of Sun
  • MariaDB is a drop-in replacement for MySQL. "No reason to use MySQL anymore: MariaDB is better in all cases"
  • Big JOIN and subquery performance is an order of magnitude (or more) faster than MySQL
  • "SQL doesn't solve all common problems" e.g. arbitrary attributes (shop item sizes, colours etc). Dynamic columns introduced in MariaDB 5.3. As a POC, created a storage engine for Cassandra with MariaDB 10
  • Any close-sourced features that Oracle has added to MySQL have been added to MariaDB as open-source features
  • 5.5 introduces a new thread pool (instead of thread-per-connection)
  • Full merge of MySQL 5.6 into MariaDB 5.6 is a year-long project due to broken features and new bugs, over-complicated code, lack of understanding of existing code etc
  • Did such a good job of getting the MySQL name out there, changing everyone over to MariaDB is going to be a tough job!
  • Though creating a dev community is easier as Oracle is not working with the community
  • Aim of MariaDB: make MySQL obsolete
  • Free MariaDB + MySQL knowledgebase available at askmonty.org

Brandon Keepers: Git: the NoSQL DB

  • Let's start with "Git's amazing ... what else can we do with it?"
  • "NoSQL is marketing bollocks" -- people mean non-relational and schemaless, and anything else gets lumped in to NoSQL
  • git calls itself "the stupid content tracker" (see the man page)
  • git has three "object types": blobs, trees and commits, plus symbolic "references" on top, all managed by the git command line tool
  • There are libraries to work with this (Grit, libgit2), plus ORMs built on top, such as ToyStore
  • NoSQL allows us to question RDBMS design, including big design up-front: schemaless allows us to be much more agile with our data model
  • git can handle transactions in both short-lived (one commit with multiple changes) and long-lived (branches) forms
  • Replication handled by the fact that all git repos are full clones
  • git doesn't have any of the features that make a great DB: querying, concurrency (it's filesystem based), merge conflict resolution, scale
  • Scale: filesystem based, and problems with git at scale. Someone tested with a very large repo: 4m commits, 1.3m files, 15GB repo ... git-add took 7 seconds etc...
  • Think about how you can abuse your tools to get more out of them
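
The blob part of git's object model is simple enough to reproduce exactly: a blob is stored as the header "blob <size>\0" followed by the raw bytes, and its name is the SHA-1 of that whole buffer:

```python
import hashlib

def git_blob_sha(content: bytes) -> str:
    # git prefixes the content with "blob <byte length>\0" and names
    # the object by the SHA-1 of header + content
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()
```

Running `git hash-object` on the same content produces the same digest, which is what lets every full clone act as a verifiable replica.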

Peter Cooper: Redis, Steady, Go!

  • Peter's a Rubyist, and wants his languages and tools to be "beautiful," which he considers Redis to be
  • Redis: remote [data structure] server -- no tables, no SQL, no enforced relationships, lots of working with primitives. The Redis manifesto calls it a DSL for abstract data types
  • Like memcached but with more commands, more persistence, more data types
  • Three big use cases: database, messaging (pub/sub, queueing), or as a cache. Also: fast live stats logging (why Redis was created in the first place), rate limiting (using automatic key expiry), scoreboarding (using sorted sets), IPC, session storage
  • YouP*rn.com uses Redis as their primary datastore (~100 Alexa ranking)
  • Redis is single-threaded and event-driven (apart from background saving etc). Single-threading means individual operations are atomic
  • Python library redis_wrap means you can use normal Python data types, backed by Redis
  • Recent additions: scripting with Lua, plus PostgreSQL data wrapper
  • 5 data types: strings, lists, sets, sorted sets, hashes
  • Abstract data type example: queueing using a list, with LPOP and RPUSH. Priority queues implemented by using a BLPOP with multiple list names
  • Set operations are available such as intersection, union, difference. Also provides the ability to store intermediary results in new keys.
  • Hashes don't allow storage of other data types: strings only
  • Supports transactions using MULTI ... EXEC to run all queued commands in one go
  • Master/slave replication is simple with the SLAVEOF command
  • Other updates and versions include Redis Sentinel (in development to provide automated failover), Redis Cluster (in development for fault tolerance of a subset of Redis commands) and a Windows version
  • Have a play with a "live demo" within the redis.io documentation
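
The list-based queueing pattern above can be illustrated without a server: the sketch below is an in-memory stand-in for the Redis semantics only (RPUSH appends at the tail, LPOP takes from the head, and BLPOP with several keys pops from the first non-empty list in the order the keys are given, which is what makes the priority trick work). Key names are invented:

```python
from collections import deque

lists = {"jobs:high": deque(), "jobs:low": deque()}

def rpush(key, value):
    # producers push onto the tail of the list
    lists[key].append(value)

def blpop(*keys):
    # consumers pop from the head of the first non-empty list;
    # listing keys in priority order yields a priority queue
    for key in keys:
        if lists[key]:
            return key, lists[key].popleft()
    return None  # a real BLPOP would block here until an item arrives

rpush("jobs:low", "send-digest")
rpush("jobs:high", "charge-card")
```

With a real Redis client the calls look the same, but the blocking form means idle workers consume no CPU while waiting.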

Lisa Phillips: MySQL @Twitter

  • MySQL plus friends has enabled Twitter to still use MySQL (5.0 and 5.5) as its primary datastore, with an average of 400 million tweets per day: 4,629/s on average, with a peak at 25,088/s (during a Japanese anime film!)
  • 8 full-time DBAs (recently up from 6) managing thousands of MySQL instances, supporting 100s of developers. All DBAs have at-scale experience, and most developers are familiar with MySQL
  • The Twitter DBAs manage from the bare-metal up, including operating system, software, monitoring etc
  • Engineering in Twitter is about pragmatism: use commodity hardware and software, queues and async processing, eventual consistency, some delay tolerance (measured in seconds)
  • "Build new awesome tools (and open source them) if you need to"
  • They use "deciders": feature flags to enable roll-out to small volumes of people to gauge impact on the DB servers (plus other parts of the infrastructure)
  • Twitter don't roll back, either code or DB changes: they roll out slowly and iterate on any fixes
  • Replication (usually) works. Have seen replication break in so many different ways, so many times, that they can now quickly fix any problems
  • Bad points of MySQL: at-scale ID generation, graphs, replication inefficiencies and lag
  • "If you're using replication, make sure you can tolerate lag in your code. If you can't tolerate lag, don't use MySQL"
  • MySQL great for HA, "smaller" datasets (<1.5TB)
  • Challenges: MySQL version diversity, single DBA, upgrades without HA solution, no load-balancing for reads
  • In 2012, they used a sharded master-slave setup using temporal sharding. New shards were hot, old shards not. New DB clusters being built every week, and DBA time became limiting factor
  • Snowflake used for unique ID generation
  • Gizzard created for sharding as a replacement for the temporal sharding (stores and replicates tweets, interest, social graph) and replaces native MySQL replication (disabling native replication improves performance)
  • Gizzard handles 6m SELECTs per second at peak, and creates more than 3b records per day
  • Other apps built on top of Gizzard: Flock, TBird, TFlock -- all of these are backed by MySQL
  • Still using traditional master-slave clusters (3-100 machines in a cluster) for non-tweet data such as user metadata, old Rails models
  • One Twitter employee is an ex-MySQL developer who now just works on MySQL for Twitter
  • Working on better logging and auditing support, real-time monitoring, performance and response metrics, row-based replication pre-fetching
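
Snowflake, mentioned above for at-scale ID generation, packs a timestamp, a worker id and a sequence number into one 64-bit integer so independent machines can mint unique, roughly time-ordered IDs without coordinating. The field widths below match the published Snowflake layout (41 bits of milliseconds, 10 bits of worker, 12 bits of sequence); everything else is a simplified sketch, not Twitter's code:

```python
import threading
import time

class Snowflake:
    EPOCH = 1288834974657  # an arbitrary fixed point in time, in ms

    def __init__(self, worker_id):
        assert 0 <= worker_id < 1024  # must fit in 10 bits
        self.worker_id = worker_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF
                if self.seq == 0:  # 4096 ids this millisecond: spin to next
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.seq = 0
            self.last_ms = now
            # 41 bits timestamp | 10 bits worker | 12 bits sequence
            return ((now - self.EPOCH) << 22) | (self.worker_id << 12) | self.seq
```

Because the timestamp occupies the high bits, IDs sort by creation time, which plays well with the temporal sharding described above.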

Brian LeRoux: Mobile Web Persistence

  • Lawnchair: like CouchDB but smaller and outside
  • "Xcode is Eclipse that looks like iTunes, and it's just as slow"
  • Cookies: need to be online, 4Kb storage, but there's a handy hack for serving up responsive images
  • Can I use Web SQL Database? (oh, and JS is dumb)
  • SQL in the browser: SQLite (probably). Started off as Google Gears, but now improved. http://caniuse.com/sql-storage SQLite is an implementation, not a standard, and Mozilla had issues with it. Isn't everywhere and doesn't necessarily work.
  • LocalStorage: quite nice, can store up to 5Mb, but has a synchronous (blocking) API, plus misses complex types, and you can't query it. Supported almost everywhere (even Opera Mini)
  • WebSimpleDB: solve ALL the problems! Renamed to IndexedDB. Has querying etc, but is heavy on the code required because it's a versioned DB. Not supported in lots of places (yet), but could be polyfilled.
  • Lawnchair wraps up all the above in one sane API
  • Hack: store unlimited data on all browsers, accessible from any domain, using window.name
  • WebSockets mean we could have a web page open a database connection
  • WebRTC PeerChannel and DataConnection APIs are also around
  • Strong indication of first-class File APIs coming to browsers. Currently split into two specs: File API and Directories and System API. filer.js tries to make this saner
  • Mozilla working on Archive API in Firefox OS

Craig Kerstiens: Postgres Demystified

  • postgresapp.com -- simplified running of Postgres in OS X
  • "It's the emacs of databases": more of an OS for your data
  • psql is powerful command-line client
  • 30+ datatypes including IPs, MAC addresses, geospatial, arrays
  • Native arrays give the power of custom fields without a join
  • Loads of extensions such as hstore: a KVP store inside a column, with queryability
  • Simple native JSON type recently added, but PLV8 allows embedding of a JS engine (can open up JS-injection attacks!)
  • Range types include from+to within a single column, and can have checks on those (e.g. ensuring no two entries overlap in time)
  • Light geospatial stuff is built-in; PostGIS provides full geospatial capabilities
  • Sequential scans are bad (most of the time). Indexes are good (most of the time)
  • Postgres has multiple types of indexes. B-Tree is the default, and you usually want it; GIN used with multiple values in one column (arrays, hstore); GiST for full-text search and GIS
  • Aim for all queries being <= 10ms
  • Can create indexes concurrently without locking the whole table
  • Create indexes on certain conditions (e.g. active things only)
  • PG internal metrics can provide things like cache and index hit ratios
  • Window functions permit partitioning (sub-grouping) data while querying
  • Fuzzy string matching using soundex()
  • Move data around using \copy or dblink, not SELECT + INSERT
  • Foreign storage adapters such as Redis: in this case can JOIN across PG and Redis
  • Common table expressions allow naming of common queries, which can then be reused in subsequent queries
  • Extras: Listen/notify (pub/sub within the DB), per-transaction synchronous replication, SELECT for UPDATE
  • Replication introduced in 9.0, multi-master expected for 9.4
  • References: Postgres Guide and a presentation
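
The soundex() mentioned above (exposed in Postgres via the fuzzystrmatch extension) is the classic algorithm: keep the first letter, map the rest to digit classes, collapse runs of the same code (with "h"/"w" transparent and vowels breaking a run), and pad to four characters. A pure-Python version of that same algorithm, as a sketch:

```python
# digit classes of the classic soundex algorithm
_CODES = {c: d for d, letters in
          [("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
           ("4", "l"), ("5", "mn"), ("6", "r")]
          for c in letters}

def soundex(name):
    name = name.lower()
    out, last = [], _CODES.get(name[0])
    for ch in name[1:]:
        if ch in "hw":
            continue           # transparent: previous code stays "adjacent"
        code = _CODES.get(ch)
        if code is None:
            last = None        # vowel: breaks the run of repeated codes
            continue
        if code != last:
            out.append(code)
        last = code
    return (name[0].upper() + "".join(out) + "000")[:4]
```

So "Robert" and "Rupert" both map to R163, which is what makes it useful for fuzzy matching of names.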

Tim Moreton: Apache Cassandra and BASE

  • Facebook took bits of BigTable data model and Dynamo distribution to create Cassandra, used to power their inbox search. Open sourced it in 2008, top-level Apache project as of 2010. Now pretty prevalent
  • Multi-master (no SPOF), tunable consistency (multi-DC aware), optimised for writes (do more up front to gain on select time), atomic counters
  • Data model is a set of nested, sorted dictionaries. Columns are effectively just labels, and can be very wide
  • Reads are fast within a single row (across columns) but much slower between rows, because rows are spread around the cluster
  • Uses timestamp-based reconciliation for conflict resolution across the cluster
  • Tunable consistency for both writes and reads: one, quorum, all
  • Use case: session store. Read dominated, updates to existing items, probably fits in RAM, distribute for availability, challenge: atomicity
  • Use case: real-time analytics. Write dominated, updates rare, read "results" mostly, distribute for availability + performance + capacity, challenge: complex querying
  • Twitter's promoted tweets dashboard just used Cassandra counters, denormalising into buckets on writes, so the grouping etc is already done for reading (no need for separate counting, grouping etc)
  • Relies on up-front knowledge of the use of the data to be able to optimise for reading
  • Acunu Analytics: materialised views of data to provide better queryability on top of Cassandra data
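
The write-time denormalisation described above can be sketched in a few lines: every event increments a pre-aggregated counter for each time bucket the dashboard will want, so a read is a single lookup with no grouping. The key shapes and bucket sizes here are invented for illustration, not Cassandra's API:

```python
from collections import defaultdict

# stand-in for Cassandra's atomic counters: one counter per
# (campaign, bucket-granularity, bucket-start) cell
counters = defaultdict(int)

def record_impression(campaign, ts):
    # do more work up front: bump every bucket this event belongs to
    counters[(campaign, "hour", ts - ts % 3600)] += 1
    counters[(campaign, "day", ts - ts % 86400)] += 1

record_impression("promo-1", 18100)
record_impression("promo-1", 18200)
```

This is the trade the section describes: it relies on knowing the read patterns up front, in exchange for reads that never aggregate.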