@rvanbruggen
Last active August 18, 2022 16:11
Contact tracing example #cypher #neo4j
//environment: Neo4j Desktop 1.2.7, Neo4j Enterprise 3.5.17, apoc 3.5.0.9, gds 1.1.0
//or: Neo4j Enterprise 4.0.3, apoc 4.0.0.6 (NOT later! a bug in apoc.coll.max/apoc.coll.min needs to be resolved)
//contact tracing data import
//full spreadsheet with synthetic data
//https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/edit#gid=0
// person sheet
// https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=0
// Person: PersonId PersonName Healthstatus ConfirmedTime
// place sheet
// https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=205425553
// Place: PlaceId PlaceName PlaceType Lat Long
// visits sheet
// https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=1261126668
// VisitId PersonId PlaceId StartTime EndTime
//import the persons
load csv with headers from
"https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=0" as csv
create (p:Person {id: csv.PersonId, name:csv.PersonName, healthstatus:csv.Healthstatus, confirmedtime:datetime(csv.ConfirmedTime), addresslocation:point({x: toFloat(csv.AddressLat), y: toFloat(csv.AddressLong)})});
//import the places
load csv with headers from
"https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=205425553" as csv
create (p:Place {id: csv.PlaceId, name:csv.PlaceName, type:csv.PlaceType, location:point({x: toFloat(csv.Lat), y: toFloat(csv.Long)})});
create index on :Place(id);
create index on :Place(location);
create index on :Place(name);
create index on :Person(id);
create index on :Person(name);
create index on :Person(healthstatus);
create index on :Person(confirmedtime);
//import the VISITS
//loading duplicate info here: both as a Visit node and as a VISITS relationship
load csv with headers from
"https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=1261126668" as csv
match (p:Person {id:csv.PersonId}), (pl:Place {id:csv.PlaceId})
create (p)-[:PERFORMS_VISIT]->(v:Visit {id:csv.VisitId, starttime:datetime(csv.StartTime), endtime:datetime(csv.EndTime)})-[:LOCATED_AT]->(pl)
create (p)-[vi:VISITS {id:csv.VisitId, starttime:datetime(csv.StartTime), endtime:datetime(csv.EndTime)}]->(pl)
set v.duration=duration.inSeconds(v.starttime,v.endtime)
set vi.duration=duration.inSeconds(vi.starttime,vi.endtime);
//connect places to Region
create (r:Region {name:"Antwerp"})-[:PART_OF]->(c:Country {name:"Belgium"})-[:PART_OF]->(co:Continent {name:"Europe"});
match (r:Region {name:"Antwerp"}), (pl:Place)
create (pl)-[:PART_OF]->(r);
//environment: Neo4j Desktop 1.2.7, Neo4j Enterprise 3.5.17, apoc 3.5.0.9, gds 1.1.0
//or: Neo4j Enterprise 4.0.3, apoc 4.0.0.6 (NOT later! a bug in apoc.coll.max/apoc.coll.min needs to be resolved)
//contact tracing queries
//who has a sick person potentially infected
match (p:Person {healthstatus:"Sick"})
with p
limit 1
match (p)--(v1:Visit)--(pl:Place)--(v2:Visit)--(p2:Person {healthstatus:"Healthy"})
return p.name as Spreader, v1.starttime as SpreaderStarttime, v1.endtime as SpreaderEndtime, pl.name as PlaceVisited, p2.name as Target, v2.starttime as TargetStarttime, v2.endtime as TargetEndtime;
//who has a sick person potentially infected - VISUAL
match (p:Person {healthstatus:"Sick"})
with p
limit 1
match path = (p)-->(v1:Visit)-->(pl:Place)<--(v2:Visit)<--(p2:Person {healthstatus:"Healthy"})
return path;
//simplifying the query by using the VISITS relationship
match (p:Person {healthstatus:"Sick"})
with p
limit 1
match path = (p)-[:VISITS]->(pl:Place)<-[:VISITS]-(p2:Person {healthstatus:"Healthy"})
return path;
//who has a sick person infected - with time overlap
//Two ranges overlap when the latest start time is no later than the earliest end time.
match (p:Person {healthstatus:"Sick"})-->(v1:Visit)-->(pl:Place)
with p,v1,pl
limit 10
match path = (p)-->(v1)-->(pl)<--(v2:Visit)<--(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
return path;
//who has a sick person infected - with time overlap AND SIMPLIFIED
//Two ranges overlap when the latest start time is no later than the earliest end time.
match (p:Person {healthstatus:"Sick"})-[v1:VISITS]->(pl:Place)
with p,v1,pl
limit 10
match path = (p)-[v1]->(pl)<-[v2:VISITS]-(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
return path;
//who has a sick person infected - with time overlap +/- 2hrs
//Each visit is widened by 2 hours (7200000 ms) on both sides before applying the same overlap test.
match (p:Person {healthstatus:"Sick"})-->(v1:Visit)-->(pl:Place)
with p,v1,pl
limit 10
match path = (p)-->(v1)-->(pl)<--(v2:Visit)<--(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart-7200000 <= minEnd+7200000
return path;
//find sick person that has visited places since being infected
match (p:Person {healthstatus:"Sick"})-[visited]->(pl:Place)
where p.confirmedtime < visited.starttime
return p, visited, pl
limit 10;
//find sick person that has visited a place more than once, and
match (pl:Place)<-[v2]-(p:Person {healthstatus:"Sick"})-[v1]->(pl:Place)
where p.confirmedtime > v1.starttime
or p.confirmedtime > v2.starttime
return *;
//find connections between sick people
match (p1:Person {healthstatus:"Sick"}),(p2:Person {healthstatus:"Sick"})
where id(p1)<id(p2)
with p1, p2
match path = allshortestpaths ((p1)-[*]-(p2))
return path
limit 10;
//how many sick and healthy people
match (p:Person)
return distinct p.healthstatus, count(*);
//which healthy person has the highest risk - based on the amount of overlap time with sick people
match (hp:Person {healthstatus:"Healthy"})-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(sp:Person {healthstatus:"Sick"})
with hp, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
return hp.name, hp.healthstatus, sum(minEnd-maxStart) as overlaptime
order by overlaptime desc;
//which healthy person has the highest risk - based on the amount of overlap time with sick people - VISUAL
match (hp:Person {healthstatus:"Healthy"})-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(sp:Person {healthstatus:"Sick"})
with hp, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
with hp, sum(minEnd-maxStart) as overlaptime
order by overlaptime desc
limit 10
match (hp)-[v]-(pl:Place)
return hp,v,pl;
//places with most sick visits
match (p:Person {healthstatus:"Sick"})-[v:VISITS]->(pl:Place)
with distinct pl.name as placename, count(v) as nrofsickvisits, apoc.node.degree.in(pl,'VISITS') as totalnrofvisits
order by nrofsickvisits desc
limit 10
return placename, nrofsickvisits, totalnrofvisits, round(toFloat(nrofsickvisits)/toFloat(totalnrofvisits)*10000)/100 as percentageofsickvisits;
//places with most sick visits - VISUAL
match (p:Person {healthstatus:"Sick"})-[v:VISITS]->(pl:Place)
with distinct pl.name as placename, count(v) as nrofsickvisits, pl
order by nrofsickvisits desc
limit 10
match (pl)<-[v]-(p:Person)
return pl,p,v;
//environment: Neo4j Desktop 1.2.7, Neo4j Enterprise 3.5.17, apoc 3.5.0.9, gds 1.1.0
//graph analytics on contact tracing database
// see https://gist.github.com/rvanbruggen/a916e640336f91767be23ea9288fccc3#file-3-contracttracing-analytics-cql
//REQUIREMENT: create the MEETS relationship based on OVERLAPTIME
//This is a relationship between two PERSON nodes
match (p1:Person)-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(p2:Person)
where id(p1)<id(p2)
with p1, p2, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
with p1, p2, sum(minEnd-maxStart) as meetTime
create (p1)-[:MEETS {meettime: duration({seconds: meetTime/1000})}]->(p2);
//calculating pagerank of Persons
:param limit => (10);
:param config => ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
type: 'MEETS',
orientation: 'NATURAL',
properties: {}
}
},
relationshipWeightProperty: null,
dampingFactor: 0.85,
maxIterations: 20,
writeProperty: 'pagerank'
});
CALL gds.pageRank.write($config);
//Person pagerank table results
MATCH (node)
WHERE not(node[$config.writeProperty] is null)
RETURN node.name as name, node[$config.writeProperty] AS pagerank, node.betweenness as betweenness
ORDER BY pagerank DESC
LIMIT 10;
//Person pagerank graph VISUAL results
MATCH (node)
WHERE not(node[$config.writeProperty] is null)
with node, node[$config.writeProperty] AS score
ORDER BY score DESC
LIMIT 10
match (node)-[r]-(conn)
return node, r, conn;
//BETWEENNESS of Person nodes
:param limit => (20);
:param config => ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
type: 'MEETS',
orientation: 'NATURAL',
properties: {}
}
},
writeProperty: 'betweenness'
});
CALL gds.alpha.betweenness.write($config);
//Person betweenness results table
MATCH (node)
WHERE not(node[$config.writeProperty] is null)
RETURN node.name as name, node.pagerank as pagerank, node[$config.writeProperty] AS betweenness
ORDER BY betweenness DESC
LIMIT 10;
//Person betweenness results VISUAL
MATCH (node)
WHERE not(node[$config.writeProperty] is null)
with node, node[$config.writeProperty] AS score
ORDER BY score DESC
LIMIT 10
match (node)-[r]-(conn)
return node, r, conn;
//LOUVAIN Community detection
//preparation for relationship weight property: needs integer, is currently set up as a duration!
MATCH p=()-[r:MEETS]->()
set r.meettimeinseconds=r.meettime.seconds;
//Calculate communities using Louvain
:param limit => ( 50);
:param config => ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
type: 'MEETS',
orientation: 'NATURAL',
properties: {
meettimeinseconds: {
property: 'meettimeinseconds',
defaultValue: 1
}
}
}
},
relationshipWeightProperty: 'meettimeinseconds',
includeIntermediateCommunities: false,
seedProperty: '',
writeProperty: 'louvain'
});
CALL gds.louvain.write($config);
// MATCH (node)
// WHERE not(node[$config.writeProperty] is null)
// WITH node, node[$config.writeProperty] AS community
// RETURN node,
// CASE WHEN apoc.meta.type(community) = "long[]" THEN community[-1] ELSE community END AS community,
// CASE WHEN apoc.meta.type(community) = "long[]" THEN community ELSE null END as communities
// LIMIT $limit
//what are the different communities
match (p:Person)
return distinct p.louvain, count(p)
order by count(p) desc;
//explore community 457
match (p1:Person {louvain: 457})-[v:VISITS]->(pl:Place), (p1)-[m:MEETS]->(p2:Person)
return p1, p2, pl, v, m;
//explore community 489
match (p1:Person {louvain: 489})-[v:VISITS]->(pl:Place), (p1)-[m:MEETS]->(p2:Person)
return p1, p2, pl, v, m;
//revisiting PAGERANK: using meettimeinseconds as weight
:param limit => (10);
:param config => ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
type: 'MEETS',
orientation: 'NATURAL',
properties: 'meettimeinseconds'
}
},
dampingFactor: 0.85,
maxIterations: 20,
writeProperty: 'weightedpagerank',
relationshipWeightProperty: 'meettimeinseconds'
});
CALL gds.pageRank.write($config);
//environment: Neo4j Desktop 1.2.7, Neo4j Enterprise 4.0.3, apoc 4.0.0.6 (NOT later! a bug in apoc.coll.max/apoc.coll.min needs to be resolved)
//implement security on Contact Tracing Graph in Neo4j 4.0
//create the patient user
:use system;
CREATE USER patient SET PASSWORD "changeme" CHANGE NOT REQUIRED;
//create the patientrole based on the reader role
:use system;
CREATE ROLE patientrole AS COPY OF reader;
//show the roles
:use system;
SHOW ROLES;
//put the patient user into the patientrole
:use system;
GRANT ROLE patientrole TO patient;
//Add read restriction on person names for patients
:use system;
DENY READ {name} ON GRAPH `neo4j` NODES Person TO patientrole;
//run pathfinding query between two patients, healthy and sick
:use neo4j;
MATCH (p1:Person {healthstatus:"Healthy"}), (p2:Person {healthstatus:"Sick"}),
path = allshortestpaths ((p1)-[*]-(p2))
RETURN path
limit 10;
//Additional restriction on security: traversal over MEETS relationship
:use system;
DENY TRAVERSE ON GRAPH `neo4j` RELATIONSHIPS MEETS to patientrole;
//show the privileges
SHOW USER patient PRIVILEGES;
//in case you want to remove the restrictions and start over
:use system;
REVOKE DENY READ {name} ON GRAPH `neo4j` NODES Person from patientrole;
REVOKE DENY TRAVERSE ON GRAPH `neo4j` RELATIONSHIPS MEETS from patientrole;
{"k98w3676.2b":{"name":"Contact Tracing","layerType":"cypher","latitudeProperty":{"value":"location","label":"location"},"longitudeProperty":{"value":"location","label":"location"},"tooltipProperty":{"value":"location","label":"location"},"nodeLabel":[{"value":"Place","label":"Place"}],"propertyNames":[{"value":"id","label":"id"},{"value":"location","label":"location"},{"value":"name","label":"name"},{"value":"type","label":"type"},{"value":"","label":""}],"spatialLayers":[],"data":[{"pos":[51.21478674,4.411022709],"tooltip":"Park"},{"pos":[51.20972408,4.392873698],"tooltip":"Park"},{"pos":[51.21526687,4.405958494],"tooltip":"Park"},{"pos":[51.20728622,4.394641133],"tooltip":"Hospital"},{"pos":[51.2131025,4.406564972],"tooltip":"Grocery shop"},{"pos":[51.2061557,4.397092639],"tooltip":"Grocery shop"},{"pos":[51.21773313,4.408502139],"tooltip":"Park"},{"pos":[51.20447327,4.401567516],"tooltip":"Grocery shop"},{"pos":[51.21229956,4.409084133],"tooltip":"Hospital"},{"pos":[51.20706673,4.396476941],"tooltip":"Theater"},{"pos":[51.21344951,4.411334813],"tooltip":"Theater"},{"pos":[51.20396134,4.39294067],"tooltip":"School"},{"pos":[51.22012718,4.406482841],"tooltip":"Theater"},{"pos":[51.20724581,4.399652538],"tooltip":"Park"},{"pos":[51.21474766,4.412749307],"tooltip":"Hospital"},{"pos":[51.20127547,4.398005587],"tooltip":"School"},{"pos":[51.21986474,4.410060749],"tooltip":"Mall"},{"pos":[51.20563464,4.40182581],"tooltip":"Theater"},{"pos":[51.21991332,4.409563035],"tooltip":"School"},{"pos":[51.2033219,4.392870992],"tooltip":"Grocery shop"},{"pos":[51.21304193,4.409229802],"tooltip":"Park"},{"pos":[51.20364669,4.393538288],"tooltip":"Hospital"},{"pos":[51.22011303,4.412384706],"tooltip":"Hospital"},{"pos":[51.20399473,4.401512383],"tooltip":"Hospital"},{"pos":[51.21227323,4.41123999],"tooltip":"Mall"},{"pos":[51.20637691,4.393928175],"tooltip":"Bar"},{"pos":[51.2111867,4.408463929],"tooltip":"Mall"},{"pos":[51.20624607,4.402101752],"tooltip":"Grocery 
shop"},{"pos":[51.2161537,4.403914519],"tooltip":"Theater"},{"pos":[51.20595447,4.395116441],"tooltip":"Grocery shop"},{"pos":[51.2144452,4.402822928],"tooltip":"Hospital"},{"pos":[51.2052184,4.394183092],"tooltip":"Park"},{"pos":[51.21778169,4.404403141],"tooltip":"Theater"},{"pos":[51.20362446,4.399403572],"tooltip":"Bar"},{"pos":[51.21194987,4.403848022],"tooltip":"Bar"},{"pos":[51.20640312,4.4005624],"tooltip":"Bar"},{"pos":[51.2140311,4.405409624],"tooltip":"Mall"},{"pos":[51.20604117,4.400093935],"tooltip":"Restaurant"},{"pos":[51.2140264,4.403434565],"tooltip":"Grocery shop"},{"pos":[51.20871693,4.395193011],"tooltip":"Grocery shop"},{"pos":[51.21418988,4.412234132],"tooltip":"Hospital"},{"pos":[51.20171423,4.399958068],"tooltip":"Grocery shop"},{"pos":[51.2199784,4.404132558],"tooltip":"Mall"},{"pos":[51.2048779,4.394811666],"tooltip":"Park"},{"pos":[51.21401282,4.409293532],"tooltip":"Mall"},{"pos":[51.20557071,4.398599086],"tooltip":"Mall"},{"pos":[51.21711614,4.406531141],"tooltip":"Grocery shop"},{"pos":[51.20051589,4.397617897],"tooltip":"Mall"},{"pos":[51.21460631,4.403542422],"tooltip":"Park"},{"pos":[51.20318011,4.401307714],"tooltip":"School"},{"pos":[51.21830424,4.409348713],"tooltip":"Park"},{"pos":[51.20906724,4.396888844],"tooltip":"School"},{"pos":[51.21486899,4.404414938],"tooltip":"Bar"},{"pos":[51.20133553,4.397371355],"tooltip":"Park"},{"pos":[51.21425705,4.412361412],"tooltip":"Bar"},{"pos":[51.20168117,4.401083779],"tooltip":"Theater"},{"pos":[51.21145652,4.405542157],"tooltip":"Park"},{"pos":[51.20839356,4.39524594],"tooltip":"Restaurant"},{"pos":[51.21377373,4.408406256],"tooltip":"Mall"},{"pos":[51.21010598,4.394717212],"tooltip":"Park"},{"pos":[51.21142025,4.412572186],"tooltip":"Bar"},{"pos":[51.20604674,4.395505177],"tooltip":"Bar"},{"pos":[51.21300511,4.405236187],"tooltip":"Bar"},{"pos":[51.20582531,4.396066545],"tooltip":"Grocery 
shop"},{"pos":[51.21978896,4.403929789],"tooltip":"Mall"},{"pos":[51.20403647,4.400050163],"tooltip":"Grocery shop"},{"pos":[51.21368252,4.410982846],"tooltip":"School"},{"pos":[51.20058052,4.398629876],"tooltip":"Grocery shop"},{"pos":[51.21741902,4.40729968],"tooltip":"Grocery shop"},{"pos":[51.20071063,4.397693194],"tooltip":"Theater"},{"pos":[51.22001408,4.408398888],"tooltip":"Grocery shop"},{"pos":[51.20184264,4.393161634],"tooltip":"Mall"},{"pos":[51.21343301,4.404002345],"tooltip":"Park"},{"pos":[51.20528942,4.401922961],"tooltip":"Bar"},{"pos":[51.21466685,4.402905148],"tooltip":"Hospital"},{"pos":[51.20178759,4.396088374],"tooltip":"Park"},{"pos":[51.21149793,4.406253562],"tooltip":"Park"},{"pos":[51.20222415,4.396414183],"tooltip":"Bar"},{"pos":[51.21590363,4.40775903],"tooltip":"Mall"},{"pos":[51.20988124,4.398144126],"tooltip":"Park"},{"pos":[51.22006385,4.408558532],"tooltip":"Theater"},{"pos":[51.20411776,4.397663087],"tooltip":"School"},{"pos":[51.21053106,4.406388958],"tooltip":"Park"},{"pos":[51.20729616,4.394164337],"tooltip":"Park"},{"pos":[51.21348985,4.404188793],"tooltip":"Park"},{"pos":[51.20732246,4.402010998],"tooltip":"Hospital"},{"pos":[51.21084109,4.407105784],"tooltip":"Park"},{"pos":[51.20605925,4.401139547],"tooltip":"Theater"},{"pos":[51.21372889,4.41082487],"tooltip":"Hospital"},{"pos":[51.20353869,4.396261669],"tooltip":"Theater"},{"pos":[51.21819087,4.409005526],"tooltip":"Hospital"},{"pos":[51.20707028,4.39685452],"tooltip":"Theater"},{"pos":[51.21092435,4.406019875],"tooltip":"Restaurant"},{"pos":[51.20151118,4.397103842],"tooltip":"Theater"},{"pos":[51.21126627,4.409918855],"tooltip":"Hospital"},{"pos":[51.2089336,4.395725713],"tooltip":"Theater"},{"pos":[51.21589044,4.403149031],"tooltip":"Mall"},{"pos":[51.2052519,4.394618881],"tooltip":"Restaurant"},{"pos":[51.21341232,4.407741261],"tooltip":"School"},{"pos":[51.20391064,4.401799525],"tooltip":"Bar"},{"pos":[51.20420746,4.394962752],"tooltip":"Grocery 
shop"}],"position":[51.209943396138634,4.4023929313168315],"color":{"r":208,"g":2,"b":27,"a":1},"limit":10000,"rendering":"markers","radius":30,"cypher":"MATCH (n:Place)\nRETURN n.location.x as latitude, n.location.y as longitude, n.type as tooltip\nLIMIT 10000","hasSpatialPlugin":false,"spatialLayer":{"value":"","label":""},"ukey":"k98w3676.2b","nodes":[{"value":"Continent","label":"Continent"},{"value":"Country","label":"Country"},{"value":"Person","label":"Person"},{"value":"Place","label":"Place"},{"value":"Region","label":"Region"},{"value":"Visit","label":"Visit"}]},"k98w76z3.ic":{"name":"Contact Tracing Heatmap","layerType":"cypher","latitudeProperty":{"value":"latitude","label":"latitude"},"longitudeProperty":{"value":"longitude","label":"longitude"},"tooltipProperty":{"value":"","label":""},"nodeLabel":[],"propertyNames":[{"value":"betweenness","label":"betweenness"},{"value":"confirmedtime","label":"confirmedtime"},{"value":"duration","label":"duration"},{"value":"endtime","label":"endtime"},{"value":"healthstatus","label":"healthstatus"},{"value":"id","label":"id"},{"value":"location","label":"location"},{"value":"louvain","label":"louvain"},{"value":"meettime","label":"meettime"},{"value":"meettimeinseconds","label":"meettimeinseconds"},{"value":"name","label":"name"},{"value":"pagerank","label":"pagerank"},{"value":"starttime","label":"starttime"},{"value":"type","label":"type"},{"value":"","label":""}],"spatialLayers":[],"data":[{"pos":[51.21478674,4.411022709]},{"pos":[51.20972408,4.392873698]},{"pos":[51.21526687,4.405958494]},{"pos":[51.20728622,4.394641133]},{"pos":[51.2131025,4.406564972]},{"pos":[51.2061557,4.397092639]},{"pos":[51.21773313,4.408502139]},{"pos":[51.20447327,4.401567516]},{"pos":[51.21229956,4.409084133]},{"pos":[51.20706673,4.396476941]},{"pos":[51.21344951,4.411334813]},{"pos":[51.20396134,4.39294067]},{"pos":[51.22012718,4.406482841]},{"pos":[51.20724581,4.399652538]},{"pos":[51.21474766,4.412749307]},{"pos":[51.20127547,4.3980
05587]},{"pos":[51.21986474,4.410060749]},{"pos":[51.20563464,4.40182581]},{"pos":[51.21991332,4.409563035]},{"pos":[51.2033219,4.392870992]},{"pos":[51.21304193,4.409229802]},{"pos":[51.20364669,4.393538288]},{"pos":[51.22011303,4.412384706]},{"pos":[51.20399473,4.401512383]},{"pos":[51.21227323,4.41123999]},{"pos":[51.20637691,4.393928175]},{"pos":[51.2111867,4.408463929]},{"pos":[51.20624607,4.402101752]},{"pos":[51.2161537,4.403914519]},{"pos":[51.20595447,4.395116441]},{"pos":[51.2144452,4.402822928]},{"pos":[51.2052184,4.394183092]},{"pos":[51.21778169,4.404403141]},{"pos":[51.20362446,4.399403572]},{"pos":[51.21194987,4.403848022]},{"pos":[51.20640312,4.4005624]},{"pos":[51.2140311,4.405409624]},{"pos":[51.20604117,4.400093935]},{"pos":[51.2140264,4.403434565]},{"pos":[51.20871693,4.395193011]},{"pos":[51.21418988,4.412234132]},{"pos":[51.20171423,4.399958068]},{"pos":[51.2199784,4.404132558]},{"pos":[51.2048779,4.394811666]},{"pos":[51.21401282,4.409293532]},{"pos":[51.20557071,4.398599086]},{"pos":[51.21711614,4.406531141]},{"pos":[51.20051589,4.397617897]},{"pos":[51.21460631,4.403542422]},{"pos":[51.20318011,4.401307714]},{"pos":[51.21830424,4.409348713]},{"pos":[51.20906724,4.396888844]},{"pos":[51.21486899,4.404414938]},{"pos":[51.20133553,4.397371355]},{"pos":[51.21425705,4.412361412]},{"pos":[51.20168117,4.401083779]},{"pos":[51.21145652,4.405542157]},{"pos":[51.20839356,4.39524594]},{"pos":[51.21377373,4.408406256]},{"pos":[51.21010598,4.394717212]},{"pos":[51.21142025,4.412572186]},{"pos":[51.20604674,4.395505177]},{"pos":[51.21300511,4.405236187]},{"pos":[51.20582531,4.396066545]},{"pos":[51.21978896,4.403929789]},{"pos":[51.20403647,4.400050163]},{"pos":[51.21368252,4.410982846]},{"pos":[51.20058052,4.398629876]},{"pos":[51.21741902,4.40729968]},{"pos":[51.20071063,4.397693194]},{"pos":[51.22001408,4.408398888]},{"pos":[51.20184264,4.393161634]},{"pos":[51.21343301,4.404002345]},{"pos":[51.20528942,4.401922961]},{"pos":[51.21466685,4.402905148]},{
"pos":[51.20178759,4.396088374]},{"pos":[51.21149793,4.406253562]},{"pos":[51.20222415,4.396414183]},{"pos":[51.21590363,4.40775903]},{"pos":[51.20988124,4.398144126]},{"pos":[51.22006385,4.408558532]},{"pos":[51.20411776,4.397663087]},{"pos":[51.21053106,4.406388958]},{"pos":[51.20729616,4.394164337]},{"pos":[51.21348985,4.404188793]},{"pos":[51.20732246,4.402010998]},{"pos":[51.21084109,4.407105784]},{"pos":[51.20605925,4.401139547]},{"pos":[51.21372889,4.41082487]},{"pos":[51.20353869,4.396261669]},{"pos":[51.21819087,4.409005526]},{"pos":[51.20707028,4.39685452]},{"pos":[51.21092435,4.406019875]},{"pos":[51.20151118,4.397103842]},{"pos":[51.21126627,4.409918855]},{"pos":[51.2089336,4.395725713]},{"pos":[51.21589044,4.403149031]},{"pos":[51.2052519,4.394618881]},{"pos":[51.21341232,4.407741261]},{"pos":[51.20391064,4.401799525]},{"pos":[51.20420746,4.394962752]}],"position":[51.209943396138634,4.4023929313168315],"color":{"r":0,"g":0,"b":255,"a":1},"limit":10000,"rendering":"heatmap","radius":30,"cypher":"MATCH (n:Place)\nRETURN n.location.x as latitude, n.location.y as longitude\nLIMIT 10000","hasSpatialPlugin":false,"spatialLayer":{"value":"","label":""},"ukey":"k98w76z3.ic","nodes":[{"value":"Continent","label":"Continent"},{"value":"Country","label":"Country"},{"value":"Person","label":"Person"},{"value":"Place","label":"Place"},{"value":"Region","label":"Region"},{"value":"Visit","label":"Visit"}]}}
{"name":"CovidTracingGraph Perspective 1","id":"cc537200-7fcf-11ea-998a-95525519ce77","categories":[{"id":0,"name":"Other","color":"#6B6B6B","size":1,"icon":"no-icon","labels":[],"properties":[],"hiddenLabels":[],"caption":[""]},{"id":1,"name":"Person","color":"#FFE081","size":1,"icon":"7704F5DB-A361-4067-9F2B-5EEDD2E9B9F4","labels":["Person"],"properties":[{"name":"name","exclude":false,"isCaption":true,"dataType":"string"},{"name":"id","exclude":false,"isCaption":false,"dataType":"string"},{"name":"confirmedtime","exclude":false,"isCaption":false,"dataType":"DateTime"},{"name":"healthstatus","exclude":false,"isCaption":false,"dataType":"string"},{"name":"betweenness","exclude":false,"isCaption":false,"dataType":"number"},{"name":"pagerank","exclude":false,"isCaption":false,"dataType":"number"},{"name":"louvain","exclude":false,"isCaption":false,"dataType":"bigint"}],"hiddenLabels":[],"caption":[""],"createdAt":"Thu Apr 16 2020","lastEditedAt":"Fri Apr 17 2020","styleRules":[{"type":"single","size":2,"minSize":1,"maxSize":2,"minColor":"#D5EEE2","midColor":"#81CCA8","maxColor":"#428C6A","color":"#F16667","basedOn":"healthstatus_string","condition":"equals","conditionValue":"Sick","applyColor":true,"applySize":true,"minSizeValue":"0","maxSizeValue":"5"},{"type":"single","size":1,"minSize":1,"maxSize":2,"minColor":"#D5EEE2","midColor":"#81CCA8","maxColor":"#428C6A","color":"#8DCC93","basedOn":"betweenness_number","condition":"greater-than","conditionValue":"0","applyColor":true}]},{"id":2,"name":"Place","color":"#C990C0","size":1,"icon":"8C0F002B-5A9E-416C-A642-0FA4AC4C0AF8","labels":["Place"],"properties":[{"name":"name","exclude":false,"isCaption":true,"dataType":"string"},{"name":"id","exclude":false,"isCaption":false,"dataType":"string"},{"name":"type","exclude":false,"isCaption":false,"dataType":"string"},{"name":"location","exclude":false,"isCaption":false,"dataType":"Point"}],"hiddenLabels":[],"caption":[""],"createdAt":"Thu Apr 16 2020","lastEditedAt":"Thu 
Apr 16 2020"},{"id":3,"name":"Stay","color":"#F79767","size":1,"icon":"DCC09BD2-9D48-4CD3-8704-6A1003F4D103","labels":["Stay"],"properties":[{"name":"starttime","exclude":false,"isCaption":true,"dataType":"DateTime"},{"name":"id","exclude":false,"isCaption":false,"dataType":"string"},{"name":"endtime","exclude":false,"isCaption":false,"dataType":"DateTime"},{"name":"duration","exclude":false,"isCaption":false,"dataType":"Duration"}],"hiddenLabels":[],"caption":[""],"createdAt":"Thu Apr 16 2020","lastEditedAt":"Thu Apr 16 2020"}],"categoryIndex":3,"relationshipTypes":[{"name":"STAYED_AT","id":"STAYED_AT"},{"name":"LOCATED_AT","id":"LOCATED_AT"},{"name":"HAS_STAY","id":"HAS_STAY"},{"name":"PERFORMS_VISIT","id":"PERFORMS_VISIT"},{"properties":[{"propertyKey":"duration","type":"VISITS","dataType":"Duration"},{"propertyKey":"starttime","type":"VISITS","dataType":"DateTime"},{"propertyKey":"id","type":"VISITS","dataType":"string"},{"propertyKey":"endtime","type":"VISITS","dataType":"DateTime"}],"name":"VISITS","id":"VISITS","color":"#8DCC93"},{"name":"PART_OF","id":"PART_OF"},{"properties":[{"propertyKey":"meettime","type":"MEETS","dataType":"Duration"}],"name":"MEETS","id":"MEETS","color":"#F16667","styleRules":[{"type":"single","size":1,"minSize":1,"maxSize":2,"minColor":"#D5EEE2","midColor":"#81CCA8","maxColor":"#428C6A","color":"#FFE081","basedOn":"_other","condition":"has-property"}]}],"palette":{"colors":["#FFE081","#C990C0","#F79767","#57C7E3","#F16667","#D9C8AE","#8DCC93","#ECB5C9","#4C8EDA","#FFC454","#DA7194","#569480","#848484","#D9D9D9"],"currentIndex":3},"createdAt":"Thu Apr 16 2020","lastEditedAt":"Thu Apr 16 2020","templates":[{"name":"Find community","id":"tmpl:1587412499541","createdAt":"Mon Apr 20 2020","text":"Find community nr $param","cypher":"Match (p:Person {louvain:$param})-[r]-(conn)\nreturn p, r, 
conn","params":[{"name":"$param","dataType":"Integer","suggestionLabel":"Person","suggestionProp":"louvain","cypher":null}],"hasCypherErrors":false},{"name":"Pagerank","id":"tmpl:1587532355143","createdAt":"Wed Apr 22 2020","text":"Find most important persons (pagerank)","cypher":"Match (p:Person)\nwith p\norder by p.pagerank\nlimit 10\nmatch path = (p)-[r]-(conn)\nreturn path;","params":[],"hasCypherErrors":false},{"name":"Betweenness","id":"tmpl:1587532557214","createdAt":"Wed Apr 22 2020","text":"Find most between","cypher":"Match (p:Person)\nwith p\norder by p.betweenness\nlimit 10\nmatch path = (p)-[r]-(conn)\nreturn path;","params":[],"hasCypherErrors":false}],"hiddenRelationshipTypes":[],"hiddenCategories":[],"hideUncategorisedData":false,"version":"1.2.1"}

A Neo4j Browser Guide to explore a Contact Tracing database

Background


I wrote a series of four blog posts about this topic. For more detail, please refer to:

  • Part 1: how I go about creating a synthetic dataset, and import that into Neo4j

  • Part 2: how I can start running some interesting queries on the dataset, making me understand some of the interesting data points in there and questions that one might ask

  • Part 3: how I can use graph data science on this dataset, and understand some of the predictive metrics like pagerank, betweenness and use community detection to direct policies

  • Part 4: a number of loose ends that I touched on during my exploration - but surely did not exhaust.

There’s so much potential in this dataset, and in this problem domain in general. I feel like I have gone down the rabbit hole and have just resurfaced for some air. But who knows, maybe I will dive back in and do some more digging - after all, this is interesting stuff, and I love working on interesting topics.

In this guide I will show you the statements in an orderly sequence. Let’s start.

Creating the data in a Google Sheet

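The full spreadsheet with synthetic data is at https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/edit#gid=0. The CSV exports of its three sheets are to be found here:

  • Person sheet: https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=0

  • Place sheet: https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=205425553

  • Visits sheet: https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=1261126668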

Now we can use these scripts to load the data.

All of these scripts are also on GitHub.

Importing the Google Sheet data into Neo4j

System requirements: these queries have been tested on Neo4j Enterprise 3.5.17 and 4.0.3, apoc 3.5.0.9 and 4.0.0.6 respectively, and gds 1.1.

Import the persons from the Person sheet:

load csv with headers from
"https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=0" as csv
create (p:Person {id: csv.PersonId, name:csv.PersonName, healthstatus:csv.Healthstatus, confirmedtime:datetime(csv.ConfirmedTime), addresslocation:point({x: toFloat(csv.AddressLat), y: toFloat(csv.AddressLong)})});

Import the places from the places worksheet:

load csv with headers from
"https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=205425553" as csv
create (p:Place {id: csv.PlaceId, name:csv.PlaceName, type:csv.PlaceType, location:point({x: toFloat(csv.Lat), y: toFloat(csv.Long)})});
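
A side note on the point() calls above: with x and y keys, Neo4j creates Cartesian points, so distances between them are not expressed in meters. If you want geographic distances, a WGS-84 point built from latitude and longitude is one option. A minimal sketch, with made-up coordinates that are not taken from the spreadsheet, that shows the coordinate reference system of both kinds of points:

// made-up coordinates, only used to compare the two coordinate reference systems
with point({x: 51.2194, y: 4.4025}) as cartesian,
     point({latitude: 51.2194, longitude: 4.4025}) as geographic
return cartesian.crs as cartesiancrs, geographic.crs as geographiccrs;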

Create a couple of indexes to make it easier and faster to create the Visit nodes and relationships:

create index on :Place(id);
create index on :Place(location);
create index on :Place(name);
create index on :Person(id);
create index on :Person(name);
create index on :Person(healthstatus);
create index on :Person(confirmedtime);
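
The create index on syntax above is the 3.5 style. It still works on Neo4j 4.0, but 4.0 also offers a newer form that lets you name the index. A sketch of that alternative for one of the indexes (the name place_id is an arbitrary choice of mine, and you would run this instead of the matching statement above, not in addition to it), followed by a call that lists the indexes so that you can check that they are online:

// Neo4j 4.x only; place_id is a hypothetical index name
create index place_id for (pl:Place) on (pl.id);
// works on both 3.5 and 4.0
call db.indexes();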

Import the visits from the Visits sheet. Note that we are loading duplicate info here: every visit is stored both as a Visit node and as a VISITS relationship. They can both be useful.

load csv with headers from
"https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&gid=1261126668" as csv
match (p:Person {id:csv.PersonId}), (pl:Place {id:csv.PlaceId})
create (p)-[:PERFORMS_VISIT]->(v:Visit {id:csv.VisitId, starttime:datetime(csv.StartTime), endtime:datetime(csv.EndTime)})-[:LOCATED_AT]->(pl)
create (p)-[vi:VISITS {id:csv.VisitId, starttime:datetime(csv.StartTime), endtime:datetime(csv.EndTime)}]->(pl)
set v.duration=duration.inSeconds(v.starttime,v.endtime)
set vi.duration=duration.inSeconds(vi.starttime,vi.endtime);
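
Since every row of the Visits sheet should produce exactly one Visit node and one VISITS relationship, a quick sanity check is to compare both counts. A minimal sketch, assuming the import above ran exactly once (the aliases visitnodes and visitsrelationships are mine):

// count the Visit nodes that are wired up to both a Person and a Place
match (:Person)-[:PERFORMS_VISIT]->(v:Visit)-[:LOCATED_AT]->(:Place)
with count(v) as visitnodes
// count the direct VISITS relationships; both numbers should be equal
match (:Person)-[vi:VISITS]->(:Place)
return visitnodes, count(vi) as visitsrelationships;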

OPTIONAL: connect places to a Region, Country, Continent

create (r:Region {name:"Antwerp"})-[:PART_OF]->(c:Country {name:"Belgium"})-[:PART_OF]->(co:Continent {name:"Europe"});
match (r:Region {name:"Antwerp"}), (pl:Place)
create (pl)-[:PART_OF]->(r);

Querying Data

System requirements: these queries have been tested on Neo4j Enterprise 3.5.17 and 4.0.3, apoc 3.5.0.9 and 4.0.0.6 respectively, and gds 1.1.

The queries are also on github.

Who has a sick person potentially infected

match (p:Person {healthstatus:"Sick"})
with p
limit 1
match (p)--(v1:Visit)--(pl:Place)--(v2:Visit)--(p2:Person {healthstatus:"Healthy"})
return p.name as Spreader, v1.starttime as SpreaderStarttime, v1.endtime as SpreaderEndtime, pl.name as PlaceVisited, p2.name as Target, v2.starttime as TargetStarttime, v2.endtime as TargetEndtime;

Who has a sick person potentially infected - VISUAL

match (p:Person {healthstatus:"Sick"})
with p
limit 1
match path = (p)-->(v1:Visit)-->(pl:Place)<--(v2:Visit)<--(p2:Person {healthstatus:"Healthy"})
return path;

Simplifying the query by using the VISITS relationship

match (p:Person {healthstatus:"Sick"})
with p
limit 1
match path = (p)-[:VISITS]->(pl:Place)<-[:VISITS]-(p2:Person {healthstatus:"Healthy"})
return path;

Who has a sick person infected - with time overlap

For two time ranges to overlap, the latest of the start times must occur before (or at the same time as) the earliest of the end times.

Note that at the time of writing, apoc.coll.min and apoc.coll.max do not work on apoc 4.0.0.7 or later. Please use apoc version 4.0.0.6, which you can find here: https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/tag/4.0.0.6

match (p:Person {healthstatus:"Sick"})-->(v1:Visit)-->(pl:Place)
with p,v1,pl
limit 10
match path = (p)-->(v1)-->(pl)<--(v2:Visit)<--(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
     apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
return path;
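
If you cannot use an APOC version in which apoc.coll.max and apoc.coll.min work, the same overlap test can be written in plain Cypher: two ranges overlap when each one starts before (or at the same time as) the other one ends. A small self-contained sketch with made-up visit times, so that it runs without any data:

// made-up times: visit 1 runs 10:00-12:00, visit 2 runs 11:30-13:00
with datetime('2020-04-20T10:00:00') as start1, datetime('2020-04-20T12:00:00') as end1,
     datetime('2020-04-20T11:30:00') as start2, datetime('2020-04-20T13:00:00') as end2
// equivalent to: latest start <= earliest end
return start1 <= end2 and start2 <= end1 as overlaps;

In the query above, that would come down to replacing the WITH ... where part by where v1.starttime <= v2.endtime and v2.starttime <= v1.endtime.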

Who has a sick person infected - with time overlap AND SIMPLIFIED with the VISITS relationship. Again, the latest of the start times must occur before (or at the same time as) the earliest of the end times for the ranges to overlap.

match (p:Person {healthstatus:"Sick"})-[v1:VISITS]->(pl:Place)
with p,v1,pl
limit 10
match path = (p)-[v1]->(pl)<-[v2:VISITS]-(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
     apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
return path;

Who has a sick person infected - with time overlap +/- 2hrs. Here every range is widened by 2 hours (7200000 milliseconds) on both sides before checking that the latest of the start times occurs before (or at the same time as) the earliest of the end times.

match (p:Person {healthstatus:"Sick"})-->(v1:Visit)-->(pl:Place)
with p,v1,pl
limit 10
match path = (p)-->(v1)-->(pl)<--(v2:Visit)<--(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
     apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
// 2 hours = 2 * 60 * 60 * 1000 = 7200000 milliseconds
where maxStart-7200000 <= minEnd+7200000
return path;
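
To make the tolerance easier to read, it can also be expressed as a duration instead of a number of milliseconds (two hours is 2 * 60 * 60 * 1000 = 7200000 milliseconds). A sketch with made-up times, where the second visit starts 1.5 hours after the first one ends: the visits do not overlap strictly, but they do once each range is widened by two hours on both sides:

// made-up times: visit 1 runs 10:00-11:00, visit 2 runs 12:30-13:00
with datetime('2020-04-20T10:00:00') as start1, datetime('2020-04-20T11:00:00') as end1,
     datetime('2020-04-20T12:30:00') as start2, datetime('2020-04-20T13:00:00') as end2,
     duration({hours: 2}) as tolerance
return start1 <= end2 and start2 <= end1 as strictoverlap,
       start1 - tolerance <= end2 + tolerance and start2 - tolerance <= end1 + tolerance as overlapwithtolerance;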

Find sick person that has visited places since being infected

match (p:Person {healthstatus:"Sick"})-[visited]->(pl:Place)
where p.confirmedtime < visited.starttime
return p, visited, pl
limit 10;

Find connections between sick people

match (p1:Person {healthstatus:"Sick"}),(p2:Person {healthstatus:"Sick"})
where id(p1)<id(p2)
with p1, p2
match path = allshortestpaths ((p1)-[*]-(p2))
return path
limit 10;
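
The unbounded [*] pattern lets the shortest paths run over any relationship type. That can get slow, and once the optional PART_OF relationships exist, everybody is connected through the same Region node. A hedged variation that only follows VISITS relationships and caps the path length (the bound of 4 hops is an arbitrary choice of mine):

match (p1:Person {healthstatus:"Sick"}),(p2:Person {healthstatus:"Sick"})
where id(p1)<id(p2)
with p1, p2
// only VISITS relationships, at most 4 hops (arbitrary bound)
match path = allshortestpaths ((p1)-[:VISITS*..4]-(p2))
return path
limit 10;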

How many sick and healthy people

match (p:Person)
return distinct p.healthstatus, count(*);

Which healthy person has the highest risk - based on the amount of overlap time with sick people

match (hp:Person {healthstatus:"Healthy"})-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(sp:Person {healthstatus:"Sick"})
with hp, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
     apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
return hp.name, hp.healthstatus, sum(minEnd-maxStart) as overlaptime
order by overlaptime desc;
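
The overlaptime above is a sum of milliseconds, which is hard to read. A sketch of the same ranking expressed in minutes, written without APOC by picking the later start and the earlier end with CASE (the aliases overlapstart, overlapend and overlapminutes are mine):

match (hp:Person {healthstatus:"Healthy"})-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(sp:Person {healthstatus:"Sick"})
// keep only the visits that overlap in time
where v1.starttime <= v2.endtime and v2.starttime <= v1.endtime
with hp,
     case when v1.starttime > v2.starttime then v1.starttime else v2.starttime end as overlapstart,
     case when v1.endtime < v2.endtime then v1.endtime else v2.endtime end as overlapend
// 60000 milliseconds per minute
return hp.name, sum(overlapend.epochMillis - overlapstart.epochMillis) / 60000.0 as overlapminutes
order by overlapminutes desc
limit 10;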

Which healthy person has the highest risk - based on the amount of overlap time with sick people - VISUAL

match (hp:Person {healthstatus:"Healthy"})-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(sp:Person {healthstatus:"Sick"})
with hp, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
     apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
with hp, sum(minEnd-maxStart) as overlaptime
order by overlaptime desc
limit 10
match (hp)-[v]-(pl:Place)
return hp,v,pl;

Places with most sick visits

match (p:Person {healthstatus:"Sick"})-[v:VISITS]->(pl:Place)
with distinct pl.name as placename, count(v) as nrofsickvisits, apoc.node.degree.in(pl,'VISITS') as totalnrofvisits
order by nrofsickvisits desc
limit 10
return placename, nrofsickvisits, totalnrofvisits, round(toFloat(nrofsickvisits)/toFloat(totalnrofvisits)*10000)/100 as percentageofsickvisits;
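
The round(... * 10000) / 100 expression is a small trick to get a percentage with two decimals, since round() with a single argument rounds to a whole number. A worked example with made-up counts (3 sick visits out of 7 visits in total):

// made-up numbers: 3 / 7 = 0.428571..., times 10000 = 4285.71..., rounded = 4286.0, divided by 100 = 42.86
return round(toFloat(3)/toFloat(7)*10000)/100 as percentageofsickvisits;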

Places with most sick visits - VISUAL

match (p:Person {healthstatus:"Sick"})-[v:VISITS]->(pl:Place)
with distinct pl.name as placename, count(v) as nrofsickvisits, pl
order by nrofsickvisits desc
limit 10
match (pl)<-[v]-(p:Person)
return pl,p,v;

Graph Data Science on the Contact Tracing Graph

Note that at the time of writing, these queries have been tested on Neo4j 3.5.17. Neo4j 4.0.3 does not yet support the GDS plugin.

All the scripts are of course also on github

REQUIREMENT: create the MEETS relationship based on OVERLAPTIME

This is a relationship between two PERSON nodes that we will need for our graph data science exercises.

match (p1:Person)-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(p2:Person)
where id(p1)<id(p2)
with p1, p2, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
    apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
with p1, p2, sum(minEnd-maxStart) as meetTime
create (p1)-[:MEETS {meettime: duration({seconds: meetTime/1000})}]->(p2);
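
One thing to watch out for: the statement above uses create, so running it a second time would add a second set of MEETS relationships. If you expect to re-run it, a variation with merge is one way around that. A sketch, assuming you want at most one MEETS relationship per pair of persons:

match (p1:Person)-[v1:VISITS]->(pl:Place)<-[v2:VISITS]-(p2:Person)
where id(p1)<id(p2)
with p1, p2, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
    apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart <= minEnd
with p1, p2, sum(minEnd-maxStart) as meetTime
// merge reuses an existing MEETS relationship between p1 and p2 instead of creating a duplicate
merge (p1)-[m:MEETS]->(p2)
set m.meettime = duration({seconds: meetTime/1000});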

Graph Algo nr 1: calculating pagerank of Persons

:param limit => (10);
:param config => ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
    type: 'MEETS',
    orientation: 'NATURAL',
    properties: {}
}
},
relationshipWeightProperty: null,
dampingFactor: 0.85,
maxIterations: 20,
writeProperty: 'pagerank'
});
CALL gds.pageRank.write($config);
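
Because the MEETS relationships were only created in one direction (from the person with the lower internal id to the one with the higher id), the NATURAL orientation lets that arbitrary direction influence the pagerank scores. A hedged alternative is to project MEETS as UNDIRECTED and to stream the scores first, so that nothing is written to the graph. This sketch assumes the anonymous graph syntax of gds 1.1:

// UNDIRECTED treats every meeting as going both ways; stream mode does not write any property
CALL gds.pageRank.stream({
  nodeProjection: 'Person',
  relationshipProjection: {
    MEETS: {
      type: 'MEETS',
      orientation: 'UNDIRECTED'
    }
  },
  dampingFactor: 0.85,
  maxIterations: 20
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
LIMIT 10;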

Look at the Person pagerank table results:

MATCH (node)
WHERE not(node[$config.writeProperty] is null)
RETURN node.name as name, node[$config.writeProperty] AS pagerank, node.betweenness as betweenness
ORDER BY pagerank DESC
LIMIT 10;

Look at the Person pagerank graph results VISUALLY:

MATCH (node)
WHERE not(node[$config.writeProperty] is null)
with node, node[$config.writeProperty] AS score
ORDER BY score DESC
LIMIT 10
match (node)-[r]-(conn)
return node, r, conn

Graph Algo nr 2: calculating BETWEENNESS of Person nodes

:param limit => (20);
:param config => ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
    type: 'MEETS',
    orientation: 'NATURAL',
    properties: {}
}
},
writeProperty: 'betweenness'
});
CALL gds.alpha.betweenness.write($config);

Look at the Person betweenness results table:

MATCH (node)
WHERE not(node[$config.writeProperty] is null)
RETURN node.name as name, node.pagerank as pagerank, node[$config.writeProperty] AS betweenness
ORDER BY betweenness DESC
LIMIT 10;

Look at the Person betweenness results VISUALLY:

MATCH (node)
WHERE not(node[$config.writeProperty] is null)
with node, node[$config.writeProperty] AS score
ORDER BY score DESC
LIMIT 10
match (node)-[r]-(conn)
return node, r, conn;

Graph Algo nr 3: LOUVAIN Community detection

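Preparation for the relationship weight property: the algorithm needs a number, but meettime is currently stored as a duration, so we copy it into a new property that holds the number of seconds.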

MATCH p=()-[r:MEETS]->()
set r.meettimeinseconds=r.meettime.seconds;
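
The .seconds accessor returns the duration as a whole number of seconds, which is what we need as a weight. A tiny self-contained check with a made-up duration of one and a half hours, which should come back as 5400:

// made-up duration: 1 hour and 30 minutes
with duration({hours: 1, minutes: 30}) as meettime
return meettime.seconds as meettimeinseconds;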

Now we can calculate communities using Louvain:

:param limit => ( 50);
:param config => ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
    type: 'MEETS',
    orientation: 'NATURAL',
    properties: {
    meettimeinseconds: {
        property: 'meettimeinseconds',
        defaultValue: 1
    }
    }
}
},
relationshipWeightProperty: 'meettimeinseconds',
includeIntermediateCommunities: false,
seedProperty: '',
writeProperty: 'louvain'
});
CALL gds.louvain.write($config);

What are the different communities?

match (p:Person)
return distinct p.louvain, count(p)
order by count(p) desc;

Explore community 489:

match (p1:Person {louvain: 489})-[v:VISITS]->(pl:Place), (p1)-[m:MEETS]->(p2:Person)
return p1, p2, pl, v, m;
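
The community id 489 is specific to my run: the Louvain ids may well be different on your database. A sketch that picks the largest community dynamically instead of hard-coding the id (the aliases community and members are mine):

match (p:Person)
with p.louvain as community, count(p) as members
order by members desc
limit 1
// show how the members of that community meet each other
match (p1:Person {louvain: community})-[m:MEETS]-(p2:Person {louvain: community})
return p1, m, p2;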
<style type="text/css" media="screen">
/*
.nodes-image {
margin:-100;
}
*/
@import url("//maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css");
.imageblock .content img, .image img {max-width: 900px;max-height: 300px;}
.deck h3, .deck h4 {display: block !important;margin-bottom:8px;margin-top:5px;}
.listingblock {margin:8px;}
.pull-bottom {position:relative;bottom:1em;}
.admonitionblock td.icon [class^="fa icon-"]{font-size:2.5em;text-shadow:1px 1px 2px rgba(0,0,0,.5);cursor:default}
.admonitionblock td.icon .icon-note:before{content:"\f05a";color:#19407c}
.admonitionblock td.icon .icon-tip:before{content:"\f0eb";text-shadow:1px 1px 2px rgba(155,155,0,.8);color:#111}
.admonitionblock td.icon .icon-warning:before{content:"\f071";color:#bf6900}
.admonitionblock td.icon .icon-caution:before{content:"\f06d";color:#bf3400}
.admonitionblock td.icon .icon-important:before{content:"\f06a";color:#bf0000}
.admonitionblock.note.speaker { display:none; }
</style>
<style type="text/css" media="screen">
/* #editor.maximize-editor .CodeMirror-code { font-size:24px; line-height:26px; } */
</style>
<article class="guide" ng-controller="AdLibDataController">
<carousel class="deck container-fluid">
<!--slide class="row-fluid">
<div class="col-sm-3">
<h3>A Neo4j Browser Guide to explore a Contact Tracing database</h3>
<p class="lead">Information</p>
<!dl>
</dl>
</div>
<div class="col-sm-9">
<figure>
<img style="width:300px" src=""/>
</figure>
</div>
</slide-->
<slide class="row-fluid">
<div class="col-sm-12">
<h3>Background</h3>
<br/>
<div>
<div class="imageblock" style="float: right;">
<div class="content">
<img src="https://gcm.rmnet.be/imgcontrol/c750-d511/clients/rmnet/content/medias/tracking750_shutterstock_1687286332.jpg" alt="tracking750 shutterstock 1687286332" width="200">
</div>
</div>
<div class="paragraph">
<p>I wrote a series of 4 blogposts about this topic. For more detail please refer to:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="https://blog.bruggen.com/2020/04/covid-19-contact-tracing-blogpost-part.html">Part 1</a>: how I go about creating a synthetic dataset, and import that into Neo4j</p>
</li>
<li>
<p><a href="https://blog.bruggen.com/2020/04/covid-19-contact-tracing-blogpost-part_21.html">Part 2</a>: how I can start running some interesting queries on the dataset, making me understand some of the interesting data points in there and questions that one might ask</p>
</li>
<li>
<p><a href="https://blog.bruggen.com/2020/04/covid-19-contact-tracing-blogpost-part_61.html">Part 3</a>: how I can use graph data science on this dataset, and understand some of the predictive metrics like pagerank, betweenness and use community detection to direct policies</p>
</li>
<li>
<p><a href="https://blog.bruggen.com/2020/04/covid-19-contact-tracing-blogpost-part_0.html">Part 4</a>: a number of loose ends that I touched on during my exploration - but surely did not exhaust.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>There&#8217;s so much potential in this dataset, and in this problem domain in general. I feel like I have gone into the rabbit hole and have just resurfaced for some air. But who knows, maybe I will dive back in and do some more digging - after all, this is interesting stuff, and I love working on interesting topics.</p>
</div>
<div class="paragraph">
<p>In this guide I will show you the statements in an orderly sequence. Let&#8217;s start.</p>
</div>
</div>
</div>
</slide>
<slide class="row-fluid">
<div class="col-sm-12">
<h3>Creating the data: in a google sheet.</h3>
<br/>
<div>
<div class="paragraph">
<p>Here&#8217;s the <a href="https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/edit#gid=0">full spreadsheet with synthetic data</a>.
The csv files are to be found here:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&amp;id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&amp;gid=0">person sheet</a></p>
</li>
<li>
<p><a href="https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&amp;id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&amp;gid=205425553">place sheet</a></p>
</li>
<li>
<p><a href="https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&amp;id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&amp;gid=1261126668">visits sheet</a></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Now we can use these scripts to load the data.</p>
</div>
<div class="paragraph">
<p>All of these scripts are also on <a href="https://gist.github.com/rvanbruggen/a916e640336f91767be23ea9288fccc3#file-1-contacttracing-import-cql">github</a>.</p>
</div>
</div>
</div>
</slide>
<slide class="row-fluid">
<div class="col-sm-12">
<h3>Importing the google sheet data into Neo4j</h3>
<br/>
<div>
<div class="paragraph">
<p><em>System requirements: these queries have been tested on Neo4j Enterprise 3.5.17 and 4.0.3, apoc 3.5.0.9 and 4.0.0.6 respectively, and gds 1.1.</em></p>
</div>
<div class="paragraph">
<p>Import the persons from the Person sheet:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->load csv with headers from
"https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&amp;id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&amp;gid=0" as csv
create (p:Person {id: csv.PersonId, name:csv.PersonName, healthstatus:csv.Healthstatus, confirmedtime:datetime(csv.ConfirmedTime), addresslocation:point({x: toFloat(csv.AddressLat), y: toFloat(csv.AddressLong)})});<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Import the places from the places worksheet:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->load csv with headers from
"https://docs.google.com/spreadsheets/u/0/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&amp;id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&amp;gid=205425553" as csv
create (p:Place {id: csv.PlaceId, name:csv.PlaceName, type:csv.PlaceType, location:point({x: toFloat(csv.Lat), y: toFloat(csv.Long)})});<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Create a couple of indexes to make it easier and faster to create the Visit nodes and relationships:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->create index on :Place(id);
create index on :Place(location);
create index on :Place(name);
create index on :Person(id);
create index on :Person(name);
create index on :Person(healthstatus);
create index on :Person(confirmedtime);<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Import the visits from the Visits sheet. Note that we are loading duplicate info here: every visit is stored both as a Visit node and as a VISITS relationship. They can both be useful.</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->load csv with headers from
"https://docs.google.com/spreadsheets/d/1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE/export?format=csv&amp;id=1R-XVuynPsOWcXSderLpq3DacZdk10PZ8v6FiYGTncIE&amp;gid=1261126668" as csv
match (p:Person {id:csv.PersonId}), (pl:Place {id:csv.PlaceId})
create (p)-[:PERFORMS_VISIT]-&gt;(v:Visit {id:csv.VisitId, starttime:datetime(csv.StartTime), endtime:datetime(csv.EndTime)})-[:LOCATED_AT]-&gt;(pl)
create (p)-[vi:VISITS {id:csv.VisitId, starttime:datetime(csv.StartTime), endtime:datetime(csv.EndTime)}]-&gt;(pl)
set v.duration=duration.inSeconds(v.starttime,v.endtime)
set vi.duration=duration.inSeconds(vi.starttime,vi.endtime);<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>OPTIONAL: connect places to a Region, Country, Continent</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->create (r:Region {name:"Antwerp"})-[:PART_OF]-&gt;(c:Country {name:"Belgium"})-[:PART_OF]-&gt;(co:Continent {name:"Europe"});
match (r:Region {name:"Antwerp"}), (pl:Place)
create (pl)-[:PART_OF]-&gt;(r);<!--/code--></pre>
</div>
</div>
</div>
</div>
</slide>
<slide class="row-fluid">
<div class="col-sm-12">
<h3>Querying Data</h3>
<br/>
<div>
<div class="paragraph">
<p><em>System requirements: these queries have been tested on Neo4j Enterprise 3.5.17 and 4.0.3, apoc 3.5.0.9 and 4.0.0.6 respectively, and gds 1.1.</em></p>
</div>
<div class="paragraph">
<p>The queries are also on <a href="https://gist.github.com/rvanbruggen/a916e640336f91767be23ea9288fccc3#file-2-contracttracing-queries-cql">github</a>.</p>
</div>
<h4>Who has a sick person potentially infected</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})
with p
limit 1
match (p)--(v1:Visit)--(pl:Place)--(v2:Visit)--(p2:Person {healthstatus:"Healthy"})
return p.name as Spreader, v1.starttime as SpreaderStarttime, v1.endtime as SpreaderEndtime, pl.name as PlaceVisited, p2.name as Target, v2.starttime as TargetStarttime, v2.endtime as TargetEndtime;<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Who has a sick person potentially infected - VISUAL</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})
with p
limit 1
match path = (p)--&gt;(v1:Visit)--&gt;(pl:Place)&lt;--(v2:Visit)&lt;--(p2:Person {healthstatus:"Healthy"})
return path;<!--/code--></pre>
</div>
</div>
<h4>Simplifying the query by using the VISITS relationship</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})
with p
limit 1
match path = (p)-[:VISITS]-&gt;(pl:Place)&lt;-[:VISITS]-(p2:Person {healthstatus:"Healthy"})
return path;<!--/code--></pre>
</div>
</div>
<h4>Who has a sick person infected - with time overlap</h4>
<div class="paragraph">
<p>For two time ranges to overlap, the latest of the start times must occur before (or at the same time as) the earliest of the end times.</p>
</div>
<div class="paragraph">
<p><em>Note that at the time of writing, apoc.coll.min and apoc.coll.max do not work on apoc 4.0.0.7 or later. Please use apoc version 4.0.0.6, which you can find <a href="https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/tag/4.0.0.6">over here</a>.</em></p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})--&gt;(v1:Visit)--&gt;(pl:Place)
with p,v1,pl
limit 10
match path = (p)--&gt;(v1)--&gt;(pl)&lt;--(v2:Visit)&lt;--(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart &lt;= minEnd
return path;<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Who has a sick person infected - with time overlap AND SIMPLIFIED with the VISITS relationship. Again, the latest of the start times must occur before (or at the same time as) the earliest of the end times for the ranges to overlap.</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})-[v1:VISITS]-&gt;(pl:Place)
with p,v1,pl
limit 10
match path = (p)-[v1]-&gt;(pl)&lt;-[v2:VISITS]-(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart &lt;= minEnd
return path;<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Who has a sick person infected - with time overlap +/- 2hrs. Here every range is widened by 2 hours on both sides before checking that the latest of the start times occurs before (or at the same time as) the earliest of the end times.</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})--&gt;(v1:Visit)--&gt;(pl:Place)
with p,v1,pl
limit 10
match path = (p)--&gt;(v1)--&gt;(pl)&lt;--(v2:Visit)&lt;--(p2:Person {healthstatus:"Healthy"})
WITH path, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
// 2 hours = 2 * 60 * 60 * 1000 = 7200000 milliseconds
where maxStart-7200000 &lt;= minEnd+7200000
return path;<!--/code--></pre>
</div>
</div>
<h4>Find sick person that has visited places since being infected</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})-[visited]-&gt;(pl:Place)
where p.confirmedtime &lt; visited.starttime
return p, visited, pl
limit 10;<!--/code--></pre>
</div>
</div>
<h4>Find connections between sick people</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p1:Person {healthstatus:"Sick"}),(p2:Person {healthstatus:"Sick"})
where id(p1)&lt;id(p2)
with p1, p2
match path = allshortestpaths ((p1)-[*]-(p2))
return path
limit 10;<!--/code--></pre>
</div>
</div>
<h4>How many sick and healthy people</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person)
return distinct p.healthstatus, count(*);<!--/code--></pre>
</div>
</div>
<h4>Which healthy person has the highest risk - based on the amount of overlap time with sick people</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (hp:Person {healthstatus:"Healthy"})-[v1:VISITS]-&gt;(pl:Place)&lt;-[v2:VISITS]-(sp:Person {healthstatus:"Sick"})
with hp, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart &lt;= minEnd
return hp.name, hp.healthstatus, sum(minEnd-maxStart) as overlaptime
order by overlaptime desc;<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Which healthy person has the highest risk - based on the amount of overlap time with sick people - VISUAL</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (hp:Person {healthstatus:"Healthy"})-[v1:VISITS]-&gt;(pl:Place)&lt;-[v2:VISITS]-(sp:Person {healthstatus:"Sick"})
with hp, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart &lt;= minEnd
with hp, sum(minEnd-maxStart) as overlaptime
order by overlaptime desc
limit 10
match (hp)-[v]-(pl:Place)
return hp,v,pl;<!--/code--></pre>
</div>
</div>
<h4>Places with most sick visits</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})-[v:VISITS]-&gt;(pl:Place)
with distinct pl.name as placename, count(v) as nrofsickvisits, apoc.node.degree.in(pl,'VISITS') as totalnrofvisits
order by nrofsickvisits desc
limit 10
return placename, nrofsickvisits, totalnrofvisits, round(toFloat(nrofsickvisits)/toFloat(totalnrofvisits)*10000)/100 as percentageofsickvisits;<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Places with most sick visits - VISUAL</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person {healthstatus:"Sick"})-[v:VISITS]-&gt;(pl:Place)
with distinct pl.name as placename, count(v) as nrofsickvisits, pl
order by nrofsickvisits desc
limit 10
match (pl)&lt;-[v]-(p:Person)
return pl,p,v;<!--/code--></pre>
</div>
</div>
</div>
</div>
</slide>
<slide class="row-fluid">
<div class="col-sm-12">
<h3>Graph Data Science on the Contact Tracing Graph</h3>
<br/>
<div>
<div class="paragraph">
<p><em>Note that at the time of writing, these queries have been tested on Neo4j 3.5.17. Neo4j 4.0.3 does not yet support the GDS plugin.</em></p>
</div>
<div class="paragraph">
<p>All the scripts are of course also on <a href="https://gist.github.com/rvanbruggen/a916e640336f91767be23ea9288fccc3#file-3-contracttracing-analytics-cql">github</a></p>
</div>
<h4>REQUIREMENT: create the MEETS relationship based on OVERLAPTIME</h4>
<div class="paragraph">
<p>This is a relationship between two PERSON nodes that we will need for our graph data science exercises.</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p1:Person)-[v1:VISITS]-&gt;(pl:Place)&lt;-[v2:VISITS]-(p2:Person)
where id(p1)&lt;id(p2)
with p1, p2, apoc.coll.max([v1.starttime.epochMillis, v2.starttime.epochMillis]) as maxStart,
apoc.coll.min([v1.endtime.epochMillis, v2.endtime.epochMillis]) as minEnd
where maxStart &lt;= minEnd
with p1, p2, sum(minEnd-maxStart) as meetTime
create (p1)-[:MEETS {meettime: duration({seconds: meetTime/1000})}]-&gt;(p2);<!--/code--></pre>
</div>
</div>
<h4>Graph Algo nr 1: calculating pagerank of Persons</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->:param limit =&gt; (10);
:param config =&gt; ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
type: 'MEETS',
orientation: 'NATURAL',
properties: {}
}
},
relationshipWeightProperty: null,
dampingFactor: 0.85,
maxIterations: 20,
writeProperty: 'pagerank'
});
CALL gds.pageRank.write($config);<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Look at the Person pagerank table results:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->MATCH (node)
WHERE not(node[$config.writeProperty] is null)
RETURN node.name as name, node[$config.writeProperty] AS pagerank, node.betweenness as betweenness
ORDER BY pagerank DESC
LIMIT 10;<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Look at the Person pagerank graph results VISUALLY:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->MATCH (node)
WHERE not(node[$config.writeProperty] is null)
with node, node[$config.writeProperty] AS score
ORDER BY score DESC
LIMIT 10
match (node)-[r]-(conn)
return node, r, conn;<!--/code--></pre>
</div>
</div>
<h4>Graph Algo nr 2: calculating BETWEENNESS of Person nodes</h4>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->:param limit =&gt; (20);
:param config =&gt; ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
type: 'MEETS',
orientation: 'NATURAL',
properties: {}
}
},
writeProperty: 'betweenness'
});
CALL gds.alpha.betweenness.write($config);<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>Look at the Person betweenness results table:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->MATCH (node)
WHERE not(node[$config.writeProperty] is null)
RETURN node.name as name, node.pagerank as pagerank, node[$config.writeProperty] AS betweenness
ORDER BY betweenness DESC
LIMIT 10;<!--/code--></pre>
</div>
</div>
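<div class="paragraph">
<p>Once both algorithms have written their scores, the two centrality measures can be compared side by side. The sketch below assumes the property names pagerank and betweenness used as writeProperty above, plus the healthstatus property set during the person import:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->// Show both centrality scores next to the imported health status
MATCH (p:Person)
WHERE p.pagerank IS NOT NULL AND p.betweenness IS NOT NULL
RETURN p.name AS name, p.healthstatus AS healthstatus,
       p.pagerank AS pagerank, p.betweenness AS betweenness
ORDER BY betweenness DESC, pagerank DESC
LIMIT 10;<!--/code--></pre>
</div>
</div>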
<div class="paragraph">
<p>Look at the Person betweenness results VISUALLY:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->MATCH (node)
WHERE not(node[$config.writeProperty] is null)
with node, node[$config.writeProperty] AS score
ORDER BY score DESC
LIMIT 10
match (node)-[r]-(conn)
return node, r, conn;<!--/code--></pre>
</div>
</div>
<h4>Graph Algo nr 3: LOUVAIN Community detection</h4>
<div class="paragraph">
<p>Preparation for the relationship weight property: the algorithm needs a numeric value, but meettime is currently stored as a duration, so we first copy its seconds into a separate property:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->MATCH ()-[r:MEETS]-&gt;()
set r.meettimeinseconds=r.meettime.seconds;<!--/code--></pre>
</div>
</div>
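<div class="paragraph">
<p>A quick sanity check of the conversion, as a sketch: every MEETS relationship should now carry the new property, so the two counts should match. The column aliases are illustrative. Note that the seconds component of a duration does not include any days or months part, so this assumes meettime was stored as a seconds-based duration:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->// Check that every MEETS relationship now has a numeric weight
MATCH ()-[r:MEETS]-&gt;()
RETURN count(r) AS relationships,
       count(r.meettimeinseconds) AS withWeight,
       min(r.meettimeinseconds) AS shortest,
       max(r.meettimeinseconds) AS longest;<!--/code--></pre>
</div>
</div>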
<div class="paragraph">
<p>Now we can calculate communities using Louvain:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->:param limit =&gt; ( 50);
:param config =&gt; ({
nodeProjection: 'Person',
relationshipProjection: {
relType: {
type: 'MEETS',
orientation: 'NATURAL',
properties: {
meettimeinseconds: {
property: 'meettimeinseconds',
defaultValue: 1
}
}
}
},
relationshipWeightProperty: 'meettimeinseconds',
includeIntermediateCommunities: false,
seedProperty: '',
writeProperty: 'louvain'
});
CALL gds.louvain.write($config);<!--/code--></pre>
</div>
</div>
<div class="paragraph">
<p>What are the different communities?</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p:Person)
return p.louvain as community, count(p) as members
order by members desc;<!--/code--></pre>
</div>
</div>
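<div class="paragraph">
<p>To get a feel for who sits in each community, a few member names can be listed next to the community size. This is a sketch; the aliases and the sample size of five are arbitrary choices:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->// Five largest communities with up to five sample member names each
MATCH (p:Person)
WHERE p.louvain IS NOT NULL
RETURN p.louvain AS community, count(p) AS size,
       collect(p.name)[..5] AS sampleMembers
ORDER BY size DESC
LIMIT 5;<!--/code--></pre>
</div>
</div>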
<div class="paragraph">
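<p>Explore community 489 (community ids may differ in your database, so substitute an id returned by the previous query if needed):</p>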
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->match (p1:Person {louvain: 489})-[v:VISITS]-&gt;(pl:Place), (p1)-[m:MEETS]-&gt;(p2:Person)
return p1, p2, pl, v, m;<!--/code--></pre>
</div>
</div>
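<div class="paragraph">
<p>Because the community id is specific to one run, a variant that looks up the largest community first avoids hard-coding it. This is a sketch that assumes the louvain property written above and only returns the MEETS part of the community:</p>
</div>
<div class="listingblock">
<div class="content">
<pre mode="cypher" class="highlight pre-scrollable programlisting cm-s-neo code runnable standalone-example ng-binding" data-lang="cypher" lang="cypher"><!--code class="cypher language-cypher"-->// Find the largest community, then show who meets whom inside it
MATCH (p:Person)
WHERE p.louvain IS NOT NULL
WITH p.louvain AS community, count(*) AS size
ORDER BY size DESC
LIMIT 1
MATCH (p1:Person {louvain: community})-[m:MEETS]-&gt;(p2:Person)
RETURN p1, m, p2;<!--/code--></pre>
</div>
</div>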
</div>
</div>
</slide>
<slide class="row-fluid">
<div class="col-sm-12">
<h3>Some links</h3>
<br/>
<div>
<div class="ulist">
<ul>
<li>
<p><a href="http://blog.bruggen.com">Rik&#8217;s blog</a></p>
</li>
<li>
<p><a href="https://twitter.com/rvanbruggen">Rik on Twitter</a></p>
</li>
<li>
<p><a href="http://graphistania.com">The Graphistania podcast</a></p>
</li>
</ul>
</div>
</div>
</div>
</slide>
</carousel>
</article>