Directors of Listed Companies in Neo4j - Intro
From 2018
First, load all nodes from the Google Sheet and give them the "Node" label:
load csv with headers from "https://docs.google.com/spreadsheets/u/0/d/1T8vt1PvdJqvOTj_5kRLsjbOUDNId97foQkNB6U6o5_8/export?format=csv&id=1T8vt1PvdJqvOTj_5kRLsjbOUDNId97foQkNB6U6o5_8&gid=1067482365" as csv
create (n:Node)
set n = csv;
Then convert the "Node" label into "Person" and "Company" labels, based on the "type" property:
match (n:Node)
where n.type = "person"
set n:Person
remove n.type;
match (n:Node)
where n.type = "company"
set n:Company
remove n.type;
Next, read the relationships from the Google Sheet, look up the start and end node of each one, and create a generic "RELATED_TO" relationship between them:
load csv with headers from "https://docs.google.com/spreadsheets/u/0/d/1T8vt1PvdJqvOTj_5kRLsjbOUDNId97foQkNB6U6o5_8/export?format=csv&id=1T8vt1PvdJqvOTj_5kRLsjbOUDNId97foQkNB6U6o5_8&gid=0" as csv
match (startnode:Node {id: csv.from}), (endnode:Node {id: csv.to})
create (startnode)-[:RELATED_TO {mandate: csv.mandate, weight: toInteger(csv.weight)}]->(endnode);
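The import above looks up two nodes by their id property for every CSV row. Without an index, each lookup scans all Node nodes. A minimal sketch of an index that could be created before running the import, assuming Neo4j 4.x syntax (on 3.5 the equivalent would be create index on :Node(id)); the index name node_id is a hypothetical choice, not something from the source:

```cypher
// Assumption: Neo4j 4.x; node_id is a made-up index name
create index node_id for (n:Node) on (n.id);
```

For a sheet of this size the difference is probably small, but the same pattern matters once the node count grows.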
Then convert the different mandates into relationship types of their own, including the weights:
match (n)-[r:RELATED_TO]->(m)
where r.mandate = "bestuurder"
create (n)-[eo:EXECUTIVE_OF {weight: r.weight}]->(m);
match (n)-[r:RELATED_TO]->(m)
where r.mandate = "ceo"
create (n)-[eo:CEO_OF {weight: r.weight}]->(m);
match (n)-[r:RELATED_TO]->(m)
where r.mandate = "voorzitter"
create (n)-[eo:CHAIRMAN_OF {weight: r.weight}]->(m);
match (n)-[r:RELATED_TO]->(m)
where r.mandate = "ondervoorzitter"
create (n)-[eo:VICE_CHAIRMAN_OF {weight: r.weight}]->(m);
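The source does not show whether the generic RELATED_TO relationships are removed after this conversion. If they stay in the graph, every converted link exists twice, and any projection that uses relationship type '*' would count both copies. A minimal sketch of a check and cleanup, under the assumption that the generic relationships are no longer needed afterwards:

```cypher
// Mandates that none of the four conversions above would have matched (if any)
match ()-[r:RELATED_TO]->()
where r.mandate is null
   or not r.mandate in ["bestuurder", "ceo", "voorzitter", "ondervoorzitter"]
return r.mandate, count(*) as nrofrels;

// Assumption: the generic relationships can be dropped once converted
match ()-[r:RELATED_TO]->()
delete r;

// Remaining relationship types and their counts
match ()-[r]->()
return type(r) as reltype, count(*) as nrofrels
order by nrofrels desc;
```

Running the first statement before the delete shows whether any mandate value would be lost by the cleanup.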
//links between 2 companies
match (c1:Company),(c2:Company),
path = allshortestpaths((c1)-[*]-(c2))
where c1 <> c2
return path
limit 10
The centrality and community analyses use the Neo4j Graph Data Science library and the Neo4j Graph Data Science Playground. See also the GitHub page and the installation page for Neo4j Desktop Graph Apps.
Run PageRank and store the results:
:param limit => ( 50);
:param config => ({
nodeProjection: '*',
relationshipProjection: {
relType: {
type: '*',
orientation: 'UNDIRECTED',
properties: {}
}
},
relationshipWeightProperty: null,
dampingFactor: 0.85,
maxIterations: 20,
writeProperty: 'pagerank'
});
CALL gds.pageRank.write($config);
View the results:
MATCH (node)
WHERE exists(node.`pagerank`)
RETURN node, node.`pagerank` AS score
ORDER BY score DESC
LIMIT toInteger($limit);
Run betweenness centrality and store the results:
:param limit => ( 50);
:param config => ({
nodeProjection: '*',
relationshipProjection: {
relType: {
type: '*',
orientation: 'UNDIRECTED',
properties: {}
}
},
writeProperty: 'betweenness'
});
CALL gds.alpha.betweenness.write($config);
View the results:
MATCH (node)
WHERE exists(node.`betweenness`)
RETURN node, node.`betweenness` AS score
ORDER BY score DESC
LIMIT toInteger($limit);
Run closeness centrality and store the results:
:param limit => ( 50);
:param config => ({
nodeProjection: '*',
relationshipProjection: {
relType: {
type: '*',
orientation: 'UNDIRECTED',
properties: {}
}
},
writeProperty: 'closeness'
});
CALL gds.alpha.closeness.write($config);
View the results:
MATCH (node)
WHERE exists(node.`closeness`)
RETURN node, node.`closeness` AS score
ORDER BY score DESC
LIMIT toInteger($limit);
Run Louvain community detection and store the results:
:param limit => ( 50);
:param config => ({
nodeProjection: '*',
relationshipProjection: {
relType: {
type: '*',
orientation: 'UNDIRECTED',
properties: {}
}
},
relationshipWeightProperty: null,
includeIntermediateCommunities: false,
seedProperty: '',
writeProperty: 'louvain'
});
CALL gds.louvain.write($config);
View the results:
MATCH (node)
WHERE exists(node.`louvain`)
WITH node, node.`louvain` AS community
RETURN node,
CASE WHEN apoc.meta.type(community) = "long[]" THEN community[-1] ELSE community END AS community,
CASE WHEN apoc.meta.type(community) = "long[]" THEN community ELSE null END as communities
LIMIT toInteger($limit);
CALL apoc.export.cypher.all("all-plain.cypher", {
format: "plain",
useOptimizations: {type: "UNWIND_BATCH", unwindBatchSize: 20}
})
YIELD file, batches, source, format, nodes, relationships, properties, time, rows, batchSize
RETURN file, batches, source, format, nodes, relationships, properties, time, rows, batchSize;
See this article on density.
// count all nodes, not only those that have outgoing relationships
match (n)
with count(n) as nrofnodes
match ()-[r]->()
with nrofnodes, count(r) as nrofrels
return nrofnodes, nrofrels,
nrofrels/(nrofnodes * (nrofnodes - 1.0)) as density
The conclusion appears to be that this graph has a very low density.
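To make the density formula concrete: for a directed graph it is the number of relationships divided by the number of possible relationships, nodes * (nodes - 1). The toy calculation below uses made-up values (4 nodes, 3 relationships) that are not taken from the dataset, and runs without touching the graph:

```cypher
// Hypothetical toy values, not from the dataset
with 4 as nrofnodes, 3 as nrofrels
return nrofrels / (nrofnodes * (nrofnodes - 1.0)) as density;
// 3 / (4 * 3.0) = 0.25
```

The 1.0 in the denominator forces floating-point division; with integers only, Cypher would truncate the result to 0.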
How can we tell whether the company nodes are strongly interconnected? We look at the degree distribution of these nodes.
See the histogram in the Google Sheet.
match (c:Company)
return c.label, apoc.node.degree(c,'<') as degree
order by degree desc;
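The histogram buckets can also be computed directly in Cypher instead of in the spreadsheet. A minimal sketch, reusing the apoc.node.degree call from the query above and grouping companies by their incoming degree:

```cypher
// Number of companies per incoming degree
match (c:Company)
with apoc.node.degree(c, '<') as degree
return degree, count(*) as nrofcompanies
order by degree;
```

Note that the degree counts every incoming relationship regardless of type, so if the generic RELATED_TO relationships are still present next to the typed ones, each mandate is counted twice.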
See the histogram in the Google Sheet.
match (p:Person)
return p.label, apoc.node.degree(p,'>') as degree
order by degree desc;
match (p:Person)
return distinct p.louvain, count(p)
order by count(p) desc
3 hops deep:
match path = (p:Person {louvain: 993})-[*..3]-(conn)
return path;
2 hops deep:
match path = (p:Person {louvain: 1015})-[*..2]-(conn)
return path;
match (p:Person)
with distinct p.louvain as louvains, count(p) as count
order by count desc
limit 10
unwind louvains as onelouvain
match (p:Person {louvain: onelouvain})--(c:Company)
return p.louvain, collect(distinct(p.label)), collect(distinct(c.label));
See this article on the PageRank algorithm.
=== Who/what is most important according to PageRank
match (n)
return n.label, head(labels(n)), n.pagerank, n.betweenness, n.closeness, n.louvain
order by n.pagerank desc
limit 10;
match (n:Company)
return n.label, n.pagerank, n.betweenness, n.closeness, n.louvain
order by n.pagerank desc
limit 10
match (n:Person)
return n.label, n.pagerank, n.betweenness, n.closeness, n.louvain
order by n.pagerank desc
limit 10
Compare the following two results:
match (p:Person)-[r]-(c:Company)
return p.label, r.weight, c.label, p.pagerank
order by p.pagerank desc;
vs.
match (p:Person)-[r]-(c:Company)
return p.label, r.weight, c.label, p.pagerank
order by r.weight desc;
Does this point to a difference between the structural importance (PageRank) and the quantitative importance (relationship weight) of persons/entities in the network?
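The two listings above return one row per relationship, which makes the comparison hard to read for people with several mandates. One possible way to put both measures side by side is to aggregate per person. This is a sketch, assuming the weights of interest sit on the typed relationships and that any remaining generic RELATED_TO copies should be skipped:

```cypher
// One row per person: PageRank score next to the summed mandate weights
match (p:Person)-[r]-(c:Company)
where type(r) <> "RELATED_TO" // assumption: skip the generic copies if they still exist
with p, sum(r.weight) as totalweight, count(distinct c) as nrofcompanies
return p.label, p.pagerank, totalweight, nrofcompanies
order by p.pagerank desc
limit 10;
```

Changing the sort key to totalweight gives the quantitative ranking for the same columns.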
match (p:Person)
with p, p.betweenness as betweenness
order by betweenness desc
limit 10
match path = (p)-[*..2]-(conn)
return path;
Person | Betweenness |
---|---|
"Hilde Laga" | 75631.80039682535 |
"Frank Donck" | 68489.61169108667 |
"Luc Bertrand" | 45942.1094322345 |
"Michèle Sioen" | 37586.583730158636 |
"Johan Deschuyffeleer" | 31068.609920634928 |
"Luc Missorten" | 29610.0 |
"Koen Hoffman" | 28737.442857142916 |
"Philippe Vlerick" | 28364.0999999999 |
"Pierre Macharis" | 25421.366971916923 |
"Marion Debruyne" | 25020.31855921862 |
match (p:Person)
with p, p.betweenness as betweenness
order by betweenness desc
limit 10
match path = (p)-[*..3]-(conn)
return path