@mkoertgen
Last active November 5, 2018 09:31
Fun with neo4j intersections

Suppose you are hosting an event and want to invite customers. The event is so exclusive that you only want to invite customers that have at least one department in Berlin and at least one in Munich (modeled below as LIVES_IN relationships to Address nodes).

Setup

First, set up some example customers (Tom, Dick and Harry) and two addresses:

create (tom:Customer {name:'Tom'})
create (dick:Customer {name:'Dick'})
create (harry:Customer {name:'Harry'})
create (berlin:Address {name:'Berlin'})
create (munich:Address {name:'Munich'})
create (tom)-[:LIVES_IN]->(berlin)
create (tom)-[:LIVES_IN]->(munich)
create (dick)-[:LIVES_IN]->(berlin)
create (harry)-[:LIVES_IN]->(munich)
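
As a quick sanity check, the following sketch lists each customer with the cities they are linked to. It assumes the setup statements were run once against an otherwise empty database; the aliases customer and cities are only illustrative.

```cypher
// List every customer together with the names of their addresses.
MATCH (c:Customer)-[:LIVES_IN]->(a:Address)
RETURN c.name AS customer, collect(a.name) AS cities
ORDER BY customer
```

With the example data, Tom should be the only customer with two entries in cities.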

[Figure: All customers]

How to get it wrong

Why is this tricky? A MATCH pattern is evaluated one path at a time: each match binds a to a single Address node, so no single binding can satisfy both city constraints at once.

Here are some basic queries that do not solve the problem:

Single address cannot match both cities

MATCH (c:Customer)-[:LIVES_IN]->(a:Address)
WHERE a.name = 'Berlin' AND a.name = 'Munich'
RETURN c

Multi match

MATCH (c:Customer)-[:LIVES_IN]->(berlin:Address {name:'Berlin'})
MATCH (c:Customer)-[:LIVES_IN]->(munich:Address {name:'Munich'})
RETURN c

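Each MATCH binds a single path at a time, much like a visitor walking the graph in a depth-first search (DFS). Chaining one MATCH clause per city on the same c does restrict c to customers connected to both cities, but it hard-codes one clause per city and cannot be driven by a list of cities of arbitrary length, so it does not generalize.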

Customers living in any of the cities

WITH ['Berlin', 'Munich'] as cities
MATCH (c:Customer)-[:LIVES_IN]->(a:Address)
WHERE a.name in cities
RETURN DISTINCT c

This returns every customer living in at least one of the cities, not only those living in all of them.
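
To see what distinguishes Tom from the others, one option is to count the matched cities per customer. The sketch below assumes the example data from the setup and that Address nodes store the city in their name property; the aliases customer and matched are illustrative.

```cypher
WITH ['Berlin', 'Munich'] AS cities
MATCH (c:Customer)-[:LIVES_IN]->(a:Address)
WHERE a.name IN cities
// The aggregate groups the rows by customer name.
RETURN c.name AS customer, count(DISTINCT a) AS matched
ORDER BY matched DESC
```

With the example data, Tom should have a count of 2, while Dick and Harry each have 1.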

Solution

Because a MATCH only sees one path at a time, we need to aggregate the per-path matches for each customer and then filter on the aggregate.

This is basically a two-pass approach, which Cypher supports through the WITH clause:

WITH ['Berlin', 'Munich'] as cities
MATCH (c:Customer)-[:LIVES_IN]->(a:Address)
WHERE a.name in cities
WITH c, collect(a) as b, size(cities) as inputCnt, count(DISTINCT a) as cnt
WHERE cnt = inputCnt
RETURN c, b, cnt

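In the first pass we find all customers c living in any of the cities. In the second pass, WITH groups the rows by c: collect(a) gathers the matched addresses, count(DISTINCT a) counts them, and the final WHERE keeps only customers whose count equals the number of requested cities. This relies on each city being represented by exactly one Address node, as in the setup above.

This should return Tom as the only customer living in both Berlin and Munich.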
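
The same idea can be sketched with a query parameter instead of an inline list. This variant is an assumption layered on top of the query above, not part of the original solution: it expects a parameter named $cities (a hypothetical name) and counts distinct city names rather than nodes.

```cypher
// Assumes a query parameter $cities, e.g. set in Neo4j Browser with
//   :param cities => ['Berlin', 'Munich']
MATCH (c:Customer)-[:LIVES_IN]->(a:Address)
WHERE a.name IN $cities
// Count distinct city names rather than nodes, so several Address
// nodes for the same city are counted once.
WITH c, count(DISTINCT a.name) AS matched
WHERE matched = size($cities)
RETURN c.name AS customer
```

The comparison against size($cities) only holds if the list itself contains no duplicate city names.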

[Figure: Customers to invite]
