@rvanbruggen
rvanbruggen / guide_to_shakespeare.mdx
Last active February 15, 2023 15:44
Shakespeare Network

Network Analysis of Shakespeare's plays

What do you do when a new colleague starts to talk to you about how they would love to experiment with getting a dataset about Romeo & Juliet into a graph? Yes, that's right, you get your graph boots on, and you start looking out for a great dataset that you could play around with. And as usual, one thing leads to another (it's all connected, remember!), and you end up with this incredible experiment that twists, turns and meanders into something fascinating. That's what happened here too.

Finding a Data source

That was so easy. I very quickly located a dataset on Kaggle that I thought would be really interesting. It's a comma-separated file, about 110k lines long and 10MB in size, that holds all the lines that Shakespeare wr

@rvanbruggen
rvanbruggen / 1-newsanalysis.md
Last active April 26, 2021 19:36
News Analysis with Neo4j, APOC and Google Cloud NLP
@rvanbruggen
rvanbruggen / 1-playlist-importer.py
Last active July 13, 2023 11:32
Spotify Playlist importer, queries, and dashboard
import spotipy
from neo4j import GraphDatabase
from spotipy.oauth2 import SpotifyClientCredentials, SpotifyOAuth
# ------------------------------------ Configuration parameters ------------------------------------ #
user_id = "<<YOUR SPOTIFY USER ID>>" # Spotify user ID.
client = "<<YOUR SPOTIFY CLIENT ID>>" # Spotify client ID.
secret = "<<YOUR SPOTIFY CLIENT SECRET>>" # Spotify client secret.
playlist_uri = "spotify:playlist:1eCqsRrwBAFc2lf5ZLGa5m" # public playlist with songs to be sorted.
neo4j_url = "neo4j://localhost:7687" # bolt url of the neo4j database.
@rvanbruggen
rvanbruggen / 1-calendargraph.adoc
Last active October 12, 2023 06:49
Analysing a google calendar in Neo4j
@rvanbruggen
rvanbruggen / 1-playlist-importer-and-analyser.py
Last active October 16, 2020 07:36
Spotify Playlist Joy
import spotipy
from neo4j import GraphDatabase
from spotipy.oauth2 import SpotifyClientCredentials, SpotifyOAuth
# ------------------------------------ Configuration parameters ------------------------------------ #
user_id = "YOUR USER_ID" # Spotify user ID.
client = "YOUR CLIENT" # Spotify client ID.
secret = "YOUR SECRET" # Spotify client secret.
# playlist_uri = "spotify:playlist:1eCqsRrwBAFc2lf5ZLGa5m" # LONG original public playlist with songs to be sorted.
playlist_uri = "spotify:playlist:1BTunw40NV9HgFpLXQ7hpm" # SHORT original public playlist with songs to be sorted.
@rvanbruggen
rvanbruggen / 1 - FinCEN files in Zeppelin Notebook.md
Last active October 12, 2020 21:51
FinCEN files in Neo4j+Zeppelin

Using Zeppelin with Neo4j to analyse the FinCEN Files

Last week, we got another great and widely publicised case of Graph Databases' usefulness thrown our way. The ICIJ published their FinCEN Files research, and on top of allowing you to explore the data on their website, they also published an anonymised subset of the data as a series of CSV/JSON files. My friends and colleagues Michael Hunger, Will Lyon and the rest of the team helped with the process of making this subset available as a Neo4j database (see this GitHub repo), and there's even a super easy FinCEN Files Neo4j Sandbox that you can spin up in no time for some investigation fun.

So of course I had to take this data for a spin myself - it seems real

@rvanbruggen
rvanbruggen / fincen_browser_guide.mdx
Created September 24, 2020 07:26
FinCen Queries
@rvanbruggen
rvanbruggen / exponential_growth.adoc
Last active September 24, 2020 07:28
Exponential growth in Neo4j

Exponential growth in Neo4j

With the current surges of the Covid-19 pandemic globally, there is a huge amount of debate raging in our societies - everywhere. It’s almost as if the duality between left and right that has been dividing many political spectra in the past few years is now also translating itself into a duality that is all about more freedom for the individual (and potentially a higher spread of the SARS-CoV-2 virus) versus more restrictions for the individual. It’s such a difficult debate - with no clear definitive outcome that I know of. There are just too many uncertainties and variations in the pandemic - I personally don’t see how you can make generic statements about it very easily.

One thing I do know, though, is that very smart and loveable people, in my own social and professional circle and beyond, seem to be confused by some of the data. Very often, they make seemingly rational arguments about the numbers that they are seeing - but ig

@rvanbruggen
rvanbruggen / opentrialsqueries.cql
Last active September 18, 2020 14:45
Queries to run against the OpenTrials database, once imported into Neo4j
//Look at schema
call db.schema.visualization
//what is connected and how
match (n)-[r]->(m)
return distinct labels(n), type(r), labels(m)
//how many nodes of each label
match (n)
return labels(n), count(n)
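
The queries above are laid out as a `//` comment line followed by the statement it describes. If you want to run them one at a time from Python rather than pasting them into the Neo4j Browser, a small helper can split the file into (comment, query) pairs first. This is only a sketch: `split_cypher_script` is a hypothetical name, not part of the gist or of any Neo4j tooling, and it assumes every query is introduced by a single `//` comment line as in the listing above.

```python
from typing import List, Tuple


def split_cypher_script(script: str) -> List[Tuple[str, str]]:
    """Split a Cypher script into (comment, query) pairs.

    Hypothetical helper: assumes each query starts right after a `//` comment
    line and runs until the next `//` comment line or the end of the script.
    """
    pairs: List[Tuple[str, str]] = []
    title = None
    body: List[str] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("//"):
            # A new comment closes the previous query, if there was one.
            if title is not None and body:
                pairs.append((title, "\n".join(body)))
            title = line[2:].strip()
            body = []
        else:
            if title is None:
                title = ""  # query without a leading comment
            body.append(line)
    if title is not None and body:
        pairs.append((title, "\n".join(body)))
    return pairs


# The first two queries from the listing, used as sample input.
SAMPLE = """//Look at schema
call db.schema.visualization
//what is connected and how
match (n)-[r]->(m)
return distinct labels(n), type(r), labels(m)
"""

for comment, query in split_cypher_script(SAMPLE):
    print(f"[{comment}]")
    print(query)
```

Each query string could then be handed to a session of the official `neo4j` Python driver (for example via `session.run(query)`), assuming a database with the OpenTrials data is reachable.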