Skip to content

Instantly share code, notes, and snippets.

@tekiegirl
Last active Aug 29, 2015
Embed
What would you like to do?

Baby Names Graph

Initial Data Setup

MERGE (y13:Year {content: 2013})
MERGE (y12:Year {content: 2012})
MERGE (y03:Year {content: 2003})
CREATE INDEX ON :Name(name);
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "https://dl.dropboxusercontent.com/u/2900504/babynames.csv" AS csvLine

WITH csvLine
MERGE (name:Name {content: trim(csvLine.Name)})

WITH name, csvLine
MERGE (gen:Gender {content: trim(csvLine.Gender)})

WITH name, csvLine, gen
MATCH (y:Year {content: 2013})
MERGE (name)-[:RANKED {rank: toInt(csvLine.Rank), count: toInt(csvLine.Count)}]->(y)
MERGE (name)-[:OF_GENDER]->(gen)

WITH name, csvLine
MATCH (y12:Year {content: 2012})
MERGE (name)-[:RANKED {rank: toInt(csvLine.Rank2012)}]->(y12)

WITH name, csvLine
MATCH (y03:Year {content: 2003})
MERGE (name)-[:RANKED {rank: toInt(csvLine.Rank2003)}]->(y03)
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "https://dl.dropboxusercontent.com/u/2900504/historicname_tcm77-254032-girls.csv" AS csvLine

WITH csvLine
MERGE (name:Name {content: trim(csvLine.Name)})

WITH name, csvLine
MERGE (gen:Gender {content: trim(csvLine.Gender)})

WITH name, csvLine, gen
MERGE (year:Year {content: toInt(csvLine.Year)})

WITH name, csvLine, gen, year
MERGE (name)-[:RANKED {rank: toInt(csvLine.Rank)}]->(year)
MERGE (name)-[:OF_GENDER]->(gen)

RETURN count(*) as All_Names;

What are the top 10 boy’s names of 2013?

MATCH (y:Year {content: 2013})<-[r:RANKED]-(n:Name)-[OF_GENDER]->(g:Gender {content: 'Boy'})
WHERE r.rank <= 10
RETURN n.content AS Name, r.rank AS Rank
ORDER BY r.rank

What are the top 10 girl’s names of 2013?

MATCH (y:Year {content: 2013})<-[r:RANKED]-(n:Name)-[OF_GENDER]->(g:Gender {content: 'Girl'})
WHERE r.rank <= 10
RETURN n.content AS Name, r.rank AS Rank
ORDER BY r.rank

Which names have gone up the most since 2012?

This does not take into account names that are not ranked in both years.

MATCH (:Year {content: 2013})<-[r13:RANKED]-(n:Name)-[r12:RANKED]->(:Year {content: 2012})
WHERE r13.rank < r12.rank
RETURN n.content AS Name, r13.rank AS Rank_2013, r12.rank AS Rank_2012, (r13.rank - r12.rank) * -1 AS Gain
ORDER BY Gain DESC
LIMIT 10

Which names have gone up the most since 2003?

This does not take into account names that are not ranked in both years.

MATCH (:Year {content: 2013})<-[r13:RANKED]-(n:Name)-[r03:RANKED]->(:Year {content: 2003})
WHERE r13.rank < r03.rank
RETURN n.content AS Name, r13.rank AS Rank_2013, r03.rank AS Rank_2003, (r13.rank - r03.rank) * -1 AS Gain
ORDER BY Gain DESC
LIMIT 10

Which names from 2013 were also in the top 100 in 1994?

MATCH (:Year {content: 2013})<-[r1:RANKED]-(n:Name)-[r2:RANKED]->(:Year {content: 1994})
RETURN n.content AS Name, r1.rank AS Rank_2013, r2.rank AS Rank_1994, (r1.rank - r2.rank) * -1 AS Difference
ORDER BY Difference DESC
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment