Max Neunhöffer neunhoef

@neunhoef
neunhoef / ThoughtWorksTalkMaterial.md
Last active February 26, 2016 09:26
Accompanying material for my talk at ThoughtWorks, 25 February 2016
neunhoef / graphs_emperor.md
Last active August 29, 2015 14:16
Graphs in data modeling - is the emperor naked?

This text is about graphs in data modeling, the possibilities for implementing them, and the different data stores that use different data models and query languages. The fundamental question I would like to answer is:

Are graphs and graph databases useful in data modeling, and if so, for what and under which circumstances?

The purpose of this document is to sort out some things in my brain. If others like the ideas, find them enlightening or disgusting, or do not care, then I do not really care myself.

Theoretical approach

Mathematically, a graph (directed, unlabelled, without multiple edges) is nothing but a relation. It consists of a set V of vertices and a subset E (the edges) of the Cartesian product V × V. There is an edge from v to w if and only if the pair (v,w) is contained in E. Similarly, a bipartite graph is just a subset of a Cartesian product A × B for two disjoint sets A and B.
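
To make the definition concrete, here is a minimal Python sketch of a graph stored as a plain relation, that is, as a set of pairs. The example data and the helper names (`is_graph`, `has_edge`, `is_bipartite_relation`) are assumptions made for illustration and do not come from any particular database or library.

```python
from itertools import product

# Hypothetical example data: a vertex set V and an edge set E.
V = {"a", "b", "c"}
E = {("a", "b"), ("b", "c"), ("a", "c")}


def is_graph(vertices, edges):
    """Return True if edges is a subset of the Cartesian product vertices x vertices."""
    return edges <= set(product(vertices, repeat=2))


def has_edge(edges, v, w):
    """There is an edge from v to w if and only if the pair (v, w) is in edges."""
    return (v, w) in edges


def is_bipartite_relation(a, b, edges):
    """Return True if a and b are disjoint and edges is a subset of a x b."""
    return a.isdisjoint(b) and edges <= set(product(a, b))
```

Because E is a set of ordered pairs, the graph is automatically directed and cannot contain multiple edges between the same two vertices, which matches the restrictions in the definition.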

neunhoef / schema_less.md
Last active August 29, 2015 14:16
Agile development vs. schema enforcement - a paradox resolved

Fans of modern, agile software development usually propose using schemaless database engines to allow for greater flexibility, in particular during the early rapid prototyping phase of IT projects. The more traditionally minded insist that a strict schema, enforced by the persistence layer throughout the lifetime of a project, is necessary to ensure quality and security.

In this post I would like to explain briefly why I believe that both groups are completely right and why this is not as paradoxical as it sounds at first glance.

I am one of the developers of ArangoDB, which is a multi-model NoSQL database, by which I mean an engine that is a document store, a key/value store, and a graph database, with a query language that allows one to use and indeed mix all three data models in queries.

As a document store, ArangoDB is schemaless, which is usually very convenient in the beginning of a software project, where the actual schema is not yet completely clear