I'll need to interact with the Neo4j database using the Python driver.
The package I used is the neo4j library.
Install it (pip install neo4j) and import the GraphDatabase class.
from neo4j import GraphDatabase
Using this driver, I connect to the database with the credentials I set up for my Neo4j instance.
NEO4J_URI = "bolt://localhost:7687"
NEO4J_USER = "neo4j"
NEO4J_PASSWORD = "password"
NEO4J_DATABASE = "trains"
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD), database=NEO4J_DATABASE)
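The snippets that follow call session.run(...) without showing where the session comes from. Here is a minimal sketch of one way to get it, assuming a session is opened per query with the driver's context manager. The helper name run_query is my own and not part of the neo4j package. The driver documentation lists database as a session setting, so it is passed to driver.session() here.

```python
def run_query(driver, query, database, parameters=None):
    """Open a short-lived session, run one query, return the rows as dicts.

    `run_query` is a hypothetical helper; `driver` is expected to be the
    object returned by GraphDatabase.driver(...).
    """
    # driver.session(...) is a context manager, so the session is closed on exit.
    with driver.session(database=database) as session:
        result = session.run(query, parameters or {})
        # Read the records inside the block: they are not available
        # once the session has been closed.
        return [record.data() for record in result]
```

With the constants defined above, a call would look like run_query(driver, "RETURN 1 AS n", NEO4J_DATABASE). Calling driver.close() once the script is finished releases the connection pool.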
Next, I use a session from the driver to execute a query.
In this first query, I create a graph projection that serves as the baseline.
query = f"""
MATCH (s1:Stop)-[r:NEXT_STOP]->(s2:Stop)
WITH gds.graph.project('{projection_name}', s1, s2) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS relationships
"""
session.run(query)
I then use this projection to calculate the baseline betweenness centrality (BC) scores.
# Calculate the betweenness centrality on the baseline projection
query = f"""
CALL gds.betweenness.stream('{projection_name}')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS station, score
"""
result = session.run(query)
baseline = pd.Series({record['station']: record['score'] for record in result})
A little bit of forward thinking went into that last line. Since I'll be using the pandas library (imported as pd) to run statistical calculations on these scores, the function needs to build a DataFrame or Series from the query result and return it.
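As a rough sketch of what such a function could look like, the version below wraps the betweenness query and returns a plain dict keyed by station name. Two things here are my assumptions rather than the original script: the function name betweenness_scores, and passing the projection name as a query parameter ($graph_name) instead of formatting it into the string with an f-string.

```python
def betweenness_scores(session, projection_name):
    """Return {station name: betweenness score} for a named GDS projection.

    `betweenness_scores` is a hypothetical name; `session` is an open
    neo4j session (or anything with a compatible run() method).
    """
    query = """
    CALL gds.betweenness.stream($graph_name)
    YIELD nodeId, score
    RETURN gds.util.asNode(nodeId).name AS station, score
    """
    # The projection name travels as a parameter, so no string formatting is needed.
    result = session.run(query, {"graph_name": projection_name})
    # Each record supports lookup by the aliases used in the RETURN clause.
    return {record["station"]: record["score"] for record in result}
```

Wrapping the returned dict in pd.Series(...) gives the same kind of object as the baseline series built above, ready for the pandas calculations.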
Here are some of the gotchas that I ran into whilst developing the script.
It seems the older procedure for creating Cypher projections has been deprecated.
The function to use now is: gds.graph.project
The documentation of it is here:
A query raises an error whenever you try to create a projection with the same name as one that already exists.
An error is also raised when you try to drop a projection that does not exist.
This means that whenever I create or drop a projection, I need to wrap the call in an existence check.
CALL gds.graph.exists('{projection_name}') YIELD exists
WHERE exists
CALL gds.graph.drop('{projection_name}') YIELD graphName
RETURN graphName;
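The same check can also be done from the Python side, which makes it easy to reuse before both creating and dropping a projection. This is a minimal sketch under my own assumptions: the helper names projection_exists and drop_projection_if_exists are hypothetical, and the projection name is passed as a query parameter rather than formatted into the string.

```python
def projection_exists(session, name):
    """Return True if a GDS projection called `name` is in the graph catalog."""
    record = session.run(
        "CALL gds.graph.exists($graph_name) YIELD exists RETURN exists",
        {"graph_name": name},
    ).single()
    return bool(record["exists"])


def drop_projection_if_exists(session, name):
    """Drop the projection only when it exists; report whether a drop happened."""
    if not projection_exists(session, name):
        return False
    # consume() reads the result to the end, so any server error surfaces here.
    session.run(
        "CALL gds.graph.drop($graph_name) YIELD graphName RETURN graphName",
        {"graph_name": name},
    ).consume()
    return True
```

Calling drop_projection_if_exists(session, projection_name) right before the projection query lets the script be rerun without hitting the "already exists" error. gds.graph.drop also takes an optional second argument, failIfMissing, and passing false for it should avoid the error on a missing projection without a separate check.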