Michael Wallace mikewallace1979

@mikewallace1979
mikewallace1979 / stats.sh
Created October 29, 2012 11:30
Basic stats from stdin
#!/bin/bash
# Quick hacky script that computes basic statistics from a single
# column of numbers from stdin - inspired by the CouchDB _stats reducer
input=$(cat)
sum=$(echo "$input" | awk '{sum+=$1} END {print sum}')
count=$(echo "$input" | awk 'END {print NR}')
min=$(echo "$input" | awk '{if (NR == 1 || $1 < min) min=$1} END {print min}')
mikewallace1979 / gist:4028144
Created November 6, 2012 22:37
Erlang view for base64-encoded zipped data
fun({Doc}) ->
    case couch_util:get_value(<<"type">>, Doc) of
        undefined ->
            ok;
        Type when Type == <<"results">> ->
            TaskId = couch_util:get_value(<<"task_id">>, Doc),
            JobId = couch_util:get_value(<<"job_id">>, Doc),
            EmitCellData = fun(Data, [Key | Rest], CB) ->
                case Key of
                    <<"soil_type">> ->
mikewallace1979 / gist:5965951
Created July 10, 2013 12:38
CouchDB replication topology visualisation one-liner
curl -X GET http://localhost:5984/_active_tasks | jq -r '. | map(select(.type=="replication")) | map("\"",.source,"\"","->","\"",.target,"\"",";") | "digraph G { ranksep=10;ratio=auto;",.[],"}"'| twopi -Tpdf > topology.pdf
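For readers less familiar with jq, the transformation in the one-liner above can be sketched in Python. This is an illustrative rewrite, not the author's code: the function name `replication_dot` and the sample task list are hypothetical, and it assumes each replication entry in `_active_tasks` carries the `type`, `source`, and `target` fields that the jq filter reads.

```python
import json


def replication_dot(active_tasks):
    """Build a Graphviz digraph string from CouchDB _active_tasks entries.

    Only entries whose "type" is "replication" become edges, mirroring the
    select(.type=="replication") step in the jq filter.
    """
    edges = [
        '"{}" -> "{}";'.format(task["source"], task["target"])
        for task in active_tasks
        if task.get("type") == "replication"
    ]
    # Same graph attributes as the one-liner. Quotes inside source/target
    # are not escaped here, which is a simplification.
    return "digraph G { ranksep=10; ratio=auto; " + " ".join(edges) + " }"


if __name__ == "__main__":
    # Hypothetical sample standing in for the JSON returned by /_active_tasks.
    sample = json.loads(
        '[{"type": "replication", "source": "db_a", "target": "db_b"},'
        ' {"type": "indexer"}]'
    )
    print(replication_dot(sample))
```

The printed string can be piped to `twopi -Tpdf` in the same way as the output of the jq version.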
#include <Keypad.h>
const byte rows = 7;
const byte cols = 14;
char keys[rows][cols] = {
{'a', 'b', 'c', 'd', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14'},
{'e', 'f', 'g', 'h', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28'},
{'i', 'j', 'k', 'l', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42'},
{'m', 'n', 'o', 'p', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56'},

Seven questions to ask next time you're considering a "fault tolerant" database

It's 2015, the NoSQL hype is a distant memory, and most of us now assess database systems on their own merits, measuring them against technical requirements and sometimes even using actual engineering techniques. So why are we still talking about "fault tolerance" as though the only possible fault is the complete failure of one or more nodes? The real world is rarely so simple: failure is often fuzzy and transient, with nodes dropping in and out, timing out intermittently, and returning unexpected responses. When it comes to databases, the impact of these failures can be catastrophic if they are not handled appropriately.

Drawing on my experience operating Cloudant's database-as-a-service platform (based on Apache CouchDB), I argue that the phrase "fault tolerant" is meaningless without further qualification, and I present seven key questions that can help determine the real-world fault tolerance properties of a given data