@neunhoef
Last active February 26, 2016 09:26
Accompanying material for my talk at ThoughtWorks, 25 February 2016

Additional material for my talk "Deep Dive On ArangoDB"

Abstract

We will take a deep dive into [ArangoDB](https://www.arangodb.com/) together with [Max](https://www.linkedin.com/in/maxneunhoeffer), one of the core developers of the product.

ArangoDB is a multi-model database, which means that it is a document store, a key/value store and a graph database, all in one engine and with a query language that supports all three data models, as well as joins and transactions. Queries can use a single data model or can even mix them.

ArangoDB scales out horizontally with convenient cluster deployment using Apache Mesos. Furthermore, the HTTP API can easily be extended by server-side JavaScript code using high performance access to the C++ database core.

During the talk I will show all these features using several different cloud deployments, since in most projects one will not deploy an ArangoDB monolith, but rather multiple instances, each either a possibly replicated single server or a cluster. This demonstrates that all these properties together make ArangoDB a very useful and valuable tool in modern microservice-oriented architectures.

Overview of the material in this gist

There are a few JSON files which I used to deploy four instances of ArangoDB (three clusters and one single server) on an Apache Mesos cluster.

I also include some information as to how I set up the data.

Furthermore, there are a few queries I conducted on the various collections.

Deployment

I simply used the following bash script:

#!/bin/bash
curl -X POST -H "Content-Type: application/json" http://10.240.0.2:8080/v2/apps -d "@$1" >/dev/null
echo
echo Cluster start ordered...

This script posts the JSON file given as its first argument to Marathon on my Mesos cluster running on GCE. The JSON files I used for the deployment are:

keyvalue.json:

{
  "id": "arangokv",
  "cpus": 0.125,
  "mem": 1024.0,
  "ports": [0, 0],
  "instances": 1,
  "args": [
    "framework",
    "--framework_name=arangokv",
    "--master=zk://10.240.0.2:2181/mesos",
    "--zk=zk://10.240.0.2:2181/arangodb",
    "--user=",
    "--principal=prikv",
    "--role=arangokv",
    "--mode=cluster",
    "--async_replication=true",
    "--minimal_resources_agent=mem(*):512;cpus(*):0.125;disk(*):512",
    "--minimal_resources_dbserver=mem(*):4096;cpus(*):3;disk(*):4096",
    "--minimal_resources_secondary=mem(*):4096;cpus(*):1.5;disk(*):4096",
    "--minimal_resources_coordinator=mem(*):4096;cpus(*):3;disk(*):4096",
    "--nr_agents=1",
    "--nr_dbservers=6",
    "--nr_coordinators=6",
    "--failover_timeout=604800",
    "--secondaries_with_dbservers=true",
    "--coordinators_with_dbservers=true",
    "--arangodb_privileged_image=false",
    "--arangodb_image=arangodb/arangodb-mesos:2.8.3"
  ],
  "env": {
    "ARANGODB_WEBUI_HOST": "",
    "ARANGODB_WEBUI_PORT": "0",
    "MESOS_AUTHENTICATE": "",
    "ARANGODB_SECRET": ""
  },
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "arangodb/arangodb-mesos-framework:mesosphere-V2",
      "forcePullImage": true,
      "network": "HOST"
    }
  },
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/v1/health.json",
      "gracePeriodSeconds": 3,
      "intervalSeconds": 10,
      "portIndex": 0,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 0
    }
  ]
}

microservice.json:

{
  "id": "arangoms",
  "cpus": 0.125,
  "mem": 1024.0,
  "ports": [0, 0],
  "instances": 1,
  "args": [
    "framework",
    "--framework_name=arangoms",
    "--master=zk://10.240.0.2:2181/mesos",
    "--zk=zk://10.240.0.2:2181/arangodb",
    "--user=",
    "--principal=prims",
    "--role=arangoms",
    "--mode=cluster",
    "--async_replication=true",
    "--minimal_resources_agent=mem(*):512;cpus(*):0.125;disk(*):512",
    "--minimal_resources_dbserver=mem(*):4096;cpus(*):3;disk(*):4096",
    "--minimal_resources_secondary=mem(*):4096;cpus(*):1.5;disk(*):4096",
    "--minimal_resources_coordinator=mem(*):4096;cpus(*):3;disk(*):4096",
    "--nr_agents=1",
    "--nr_dbservers=2",
    "--nr_coordinators=2",
    "--failover_timeout=604800",
    "--secondaries_with_dbservers=true",
    "--coordinators_with_dbservers=true",
    "--arangodb_privileged_image=false",
    "--arangodb_image=arangodb/arangodb-mesos:2.8.3"
  ],
  "env": {
    "ARANGODB_WEBUI_HOST": "",
    "ARANGODB_WEBUI_PORT": "0",
    "MESOS_AUTHENTICATE": "",
    "ARANGODB_SECRET": ""
  },
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "arangodb/arangodb-mesos-framework:mesosphere-V2",
      "forcePullImage": true,
      "network": "HOST"
    }
  },
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/v1/health.json",
      "gracePeriodSeconds": 3,
      "intervalSeconds": 10,
      "portIndex": 0,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 0
    }
  ]
}

multi.json:

{
  "id": "arangomu",
  "cpus": 0.125,
  "mem": 1024.0,
  "ports": [0, 0],
  "instances": 1,
  "args": [
    "framework",
    "--framework_name=arangomu",
    "--master=zk://10.240.0.2:2181/mesos",
    "--zk=zk://10.240.0.2:2181/arangodb",
    "--user=",
    "--principal=primu",
    "--role=arangomu",
    "--mode=cluster",
    "--async_replication=true",
    "--minimal_resources_agent=mem(*):512;cpus(*):0.125;disk(*):512",
    "--minimal_resources_dbserver=mem(*):4096;cpus(*):3;disk(*):4096",
    "--minimal_resources_secondary=mem(*):4096;cpus(*):1.5;disk(*):4096",
    "--minimal_resources_coordinator=mem(*):4096;cpus(*):3;disk(*):4096",
    "--nr_agents=1",
    "--nr_dbservers=3",
    "--nr_coordinators=3",
    "--failover_timeout=604800",
    "--secondaries_with_dbservers=true",
    "--coordinators_with_dbservers=true",
    "--arangodb_privileged_image=false",
    "--arangodb_image=arangodb/arangodb-mesos:2.8.3"
  ],
  "env": {
    "ARANGODB_WEBUI_HOST": "",
    "ARANGODB_WEBUI_PORT": "0",
    "MESOS_AUTHENTICATE": "",
    "ARANGODB_SECRET": ""
  },
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "arangodb/arangodb-mesos-framework:mesosphere-V2",
      "forcePullImage": true,
      "network": "HOST"
    }
  },
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/v1/health.json",
      "gracePeriodSeconds": 3,
      "intervalSeconds": 10,
      "portIndex": 0,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 0
    }
  ]
}

graph.json:

{
  "id": "arangogr",
  "cpus": 7,
  "mem": 8192,
  "disk": 8192,
  "ports": [0],
  "instances": 1,
  "command": "arangod",
  "args": [
  ],
  "env": {
    "ARANGO_NO_AUTH": "true"
  },
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "arangodb",
      "forcePullImage": false,
      "network": "BRIDGE",
      "portMappings": [
         { "containerPort": 8529, "hostPort": 0 }
      ]
    }
  },
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/_api/version",
      "gracePeriodSeconds": 3,
      "intervalSeconds": 10,
      "portIndex": 0,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 0
    }
  ]
}

The last one starts a single instance; the first three start ArangoDB clusters.
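
The three cluster definitions above are identical except for the name suffix (kv, ms, mu) and the number of DBservers and coordinators (6, 2, 3). As a minimal sketch, and not part of the talk material, they could be generated from one template in plain JavaScript (Node.js). The helper name `clusterAppDefinition` is hypothetical and not part of the ArangoDB Mesos framework; all values are copied from the files above.

```javascript
// Hypothetical helper: builds a Marathon app definition equivalent to
// keyvalue.json, microservice.json and multi.json shown above.
function clusterAppDefinition(suffix, size) {
  const name = "arango" + suffix;
  const zk = "zk://10.240.0.2:2181";
  return {
    id: name,
    cpus: 0.125,
    mem: 1024.0,
    ports: [0, 0],
    instances: 1,
    args: [
      "framework",
      "--framework_name=" + name,
      "--master=" + zk + "/mesos",
      "--zk=" + zk + "/arangodb",
      "--user=",
      "--principal=pri" + suffix,
      "--role=" + name,
      "--mode=cluster",
      "--async_replication=true",
      "--minimal_resources_agent=mem(*):512;cpus(*):0.125;disk(*):512",
      "--minimal_resources_dbserver=mem(*):4096;cpus(*):3;disk(*):4096",
      "--minimal_resources_secondary=mem(*):4096;cpus(*):1.5;disk(*):4096",
      "--minimal_resources_coordinator=mem(*):4096;cpus(*):3;disk(*):4096",
      "--nr_agents=1",
      "--nr_dbservers=" + size,
      "--nr_coordinators=" + size,
      "--failover_timeout=604800",
      "--secondaries_with_dbservers=true",
      "--coordinators_with_dbservers=true",
      "--arangodb_privileged_image=false",
      "--arangodb_image=arangodb/arangodb-mesos:2.8.3"
    ],
    env: {
      ARANGODB_WEBUI_HOST: "",
      ARANGODB_WEBUI_PORT: "0",
      MESOS_AUTHENTICATE: "",
      ARANGODB_SECRET: ""
    },
    container: {
      type: "DOCKER",
      docker: {
        image: "arangodb/arangodb-mesos-framework:mesosphere-V2",
        forcePullImage: true,
        network: "HOST"
      }
    },
    healthChecks: [
      {
        protocol: "HTTP",
        path: "/v1/health.json",
        gracePeriodSeconds: 3,
        intervalSeconds: 10,
        portIndex: 0,
        timeoutSeconds: 10,
        maxConsecutiveFailures: 0
      }
    ]
  };
}

// The three cluster deployments from this gist: suffix and cluster size.
const apps = [["kv", 6], ["ms", 2], ["mu", 3]].map(
  ([suffix, size]) => clusterAppDefinition(suffix, size));

// Print a one-line summary per deployment.
for (const app of apps) {
  const dbservers = app.args.find((a) => a.startsWith("--nr_dbservers="));
  console.log(app.id + " " + dbservers);
}
```

Writing `JSON.stringify(apps[0], null, 2)` to a file should give a definition that can be passed to the bash script above in place of keyvalue.json.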

For the deployment of the Apache Mesos Cluster I used the script GoogleComputeEngine_Mesos_Cluster.sh from this repository.

Setting up the data

In the key/value deployment, I created the data with the following JavaScript code (directly on an ArangoDB server):

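// Create a collection with 6 shards (the key/value cluster has 6 DBservers).
var c = db._create("data", {numberOfShards: 6});
// Insert 10 million documents; "value" cycles through 0..9999.
for (var i = 0; i < 10000000; i++) {
  c.insert({value: i % 10000,
            name: "N" + (i*17*17 % 1234567889)});
  // Print progress every 10000 documents.
  if (i % 10000 === 0) {
    require("internal").print(i);
  }
}
// Create the indexes after the bulk load.
c.ensureIndex({type: "skiplist", fields: ["value"]});
c.ensureIndex({type: "hash", fields: ["name"]});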
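
To see what this loop produces without a running server, here is a plain Node.js sketch of the same arithmetic. The function name `makeDoc` is introduced only for illustration. It also shows that each of the 10000 distinct values of `value` occurs 1000 times, since 10000000 / 10000 = 1000.

```javascript
// Same document shape as in the loop above, as a pure function of i.
function makeDoc(i) {
  return { value: i % 10000, name: "N" + (i * 17 * 17 % 1234567889) };
}

console.log(makeDoc(0));      // { value: 0, name: 'N0' }
console.log(makeDoc(1));      // { value: 1, name: 'N289' }
console.log(makeDoc(12345));  // { value: 2345, name: 'N3567705' }

// Count how often one particular value occurs among all 10000000 documents.
let count = 0;
for (let i = 0; i < 10000000; i++) {
  if (makeDoc(i).value === 1234) {
    count++;
  }
}
console.log(count);  // 1000
```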

This can be dumped with this command:

arangodump --output-directory data --server.endpoint tcp://10.240.0.9:31002/

and restored to a different deployment with:

arangorestore --input-directory data --server.endpoint tcp://10.240.0.9:31002/

For the graph deployment, I used one of our shipped sample graphs, by executing:

example = require("@arangodb/graph-examples/example-graph");
example.loadGraph("worldCountry");

As the Foxx app, I used the aye-aye app from our app store.

Some AQL queries I showed

For the key/value deployment:

INSERT { _key: "K1234", value: 1234, name: "Hugo" } IN data
INSERT { value: 1777, name: "Max" } IN data

UPDATE "K1234" WITH { value: 1235 } IN data

REPLACE "K1234" WITH { value: 1234, name: "Phil" } IN data

REMOVE "K1234" IN data

FOR d IN data
  FILTER d.value >= 1000 && d.value < 2000
  SORT d.value
  LIMIT 100
  RETURN d
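
For readers without a cluster at hand, the last query can be emulated on plain arrays. This is only a sketch under the assumption that the collection contains exactly the 10000000 generated documents (none of the manual inserts above); it does a full scan, whereas on the server the sorted index on `value` can presumably serve both the range filter and the sort.

```javascript
// Plain-array emulation of the AQL range query above over the generated data.
const matches = [];
for (let i = 0; i < 10000000; i++) {
  const value = i % 10000;
  // FILTER d.value >= 1000 && d.value < 2000
  if (value >= 1000 && value < 2000) {
    matches.push({ value: value, name: "N" + (i * 17 * 17 % 1234567889) });
  }
}

// SORT d.value, then LIMIT 100
const result = matches.sort((a, b) => a.value - b.value).slice(0, 100);

console.log(matches.length);  // 1000000
console.log(result.length, result[0].value, result[99].value);  // 100 1000 1000
```

One million documents match the filter, and because 1000 of them share the smallest value, all 100 returned documents have `value` 1000. Which 100 of those come back is not determined by the query.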

For the graph deployment:

FOR v IN 1..1 INBOUND "worldVertices/world" GRAPH "worldCountry"
  RETURN v

FOR v IN 2..2 INBOUND "worldVertices/world" GRAPH "worldCountry"
  RETURN v

FOR v IN 1..2 INBOUND "worldVertices/world" GRAPH "worldCountry"
  RETURN v

FOR v, e, p IN 1..2 INBOUND "worldVertices/world" GRAPH "worldCountry"
  RETURN {vertex:v, edge: e, path: p}

FOR v,e,p IN 2 INBOUND "worldVertices/world" GRAPH "worldCountry"
  FILTER p.vertices[1].name == "Europe" 
  RETURN [v,e,p]

FOR v IN 2 INBOUND "worldVertices/continent-europe" GRAPH "worldCountry"
  FILTER v.type == "capital"
  RETURN v
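
The depth ranges in these traversals (1..1, 2..2, 1..2, and the shorthand 2, which means 2..2) can be illustrated with a small sketch in plain JavaScript. The edge list is a made-up excerpt and not the real sample data; apart from `world` and `continent-europe`, the vertex names are assumptions. Every edge points from the contained place to its container, so INBOUND from the world walks down to continents and countries.

```javascript
// Made-up tiny graph; each edge points from a place to the place containing it.
const edges = [
  { from: "continent-europe", to: "world" },
  { from: "continent-asia", to: "world" },
  { from: "country-germany", to: "continent-europe" },
  { from: "country-japan", to: "continent-asia" },
  { from: "capital-berlin", to: "country-germany" }
];

// Emulates FOR v IN min..max INBOUND start: follows edges against their
// direction (depth first) and collects the vertices reached at depths
// min..max.
function inbound(start, min, max) {
  const result = [];
  function visit(vertex, depth) {
    if (depth >= min) {
      result.push(vertex);
    }
    if (depth === max) {
      return;
    }
    for (const e of edges) {
      if (e.to === vertex) {
        visit(e.from, depth + 1);
      }
    }
  }
  visit(start, 0);
  return result;
}

console.log(inbound("world", 1, 1));  // the continents
console.log(inbound("world", 2, 2));  // the countries
console.log(inbound("world", 1, 2));  // continents and countries
console.log(inbound("continent-europe", 2, 2));  // [ 'capital-berlin' ]
```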