Datasets for Graph Hack 2014.
All stores are Neo4j 2.1.3.
Transportation
Flights
What is related, and how?
Flight ORIGIN Airport
Flight DESTINATION Airport
Flight CARRIER Carrier
Flight AIRCRAFT Aircraft
Flight DELAYED_BY Reason
Flight CANCELLED_BY Reason
Airport IN_CITY City
City IN_STATE State
Flight DIVERTED_TO Airport
Which cities have more than 1 airport?
Query
MATCH (:Airport)-[:IN_CITY]->(c:City)
WITH c, COUNT(*) AS Count
WHERE Count > 1
RETURN c.name AS City, Count
ORDER BY Count
Result
City | Count |
---|---|
WASHINGTON, DC | 2 |
HOUSTON, TX | 2 |
NEW YORK, NY | 2 |
CHICAGO, IL | 2 |
Which cities have more than 1 airport, and what are they?
Query
MATCH (a:Airport)-[:IN_CITY]->(c:City)
WITH c.name AS City, COLLECT(a.name) AS Airports
WHERE LENGTH(Airports) > 1
RETURN City, Airports
ORDER BY LENGTH(Airports)
Result
City | Airports |
---|---|
HOUSTON, TX | GEORGE BUSH INTERCONTINENTAL/HOUSTON, WILLIAM P HOBBY |
WASHINGTON, DC | RONALD REAGAN WASHINGTON NATIONAL, WASHINGTON DULLES INTERNATIONAL |
NEW YORK, NY | JOHN F. KENNEDY INTERNATIONAL, LAGUARDIA |
CHICAGO, IL | CHICAGO O'HARE INTERNATIONAL, CHICAGO MIDWAY INTERNATIONAL |
Of the flights that departed from SFO and were diverted, what are the top airports that they were diverted to?
Query
MATCH (f:Flight)-[:ORIGIN]->(:Airport {abbr:'SFO'}),
(f)-[d:DIVERTED_TO]->(a:Airport)
WHERE d.step = 1
RETURN a.abbr + ' - ' + a.name AS Airport, COUNT(*) AS Count
ORDER BY Count DESC
LIMIT 10
Result
Airport | Count |
---|---|
SFO - SAN FRANCISCO INTERNATIONAL | 13 |
TWF - JOSLIN FIELD - MAGIC VALLEY REGIONAL | 6 |
RDM - ROBERTS FIELD | 5 |
GJT - GRAND JUNCTION REGIONAL | 3 |
RDD - REDDING MUNICIPAL | 3 |
EUG - MAHLON SWEET FIELD | 2 |
DTW - DETROIT METRO WAYNE COUNTY | 2 |
LAX - LOS ANGELES INTERNATIONAL | 2 |
BOS - LOGAN INTERNATIONAL | 2 |
LMT - KLAMATH FALLS AIRPORT | 2 |
Fleets
What is related, and how?
Airline BASED_IN Country
Series MODEL Model
Aircraft SERIES Series
Aircraft OWNED_BY Airline
Which airlines own the most active Boeing 747s? How many do they own? In which countries are these airlines based?
Query
MATCH (a:Airline)-[:BASED_IN]->(c:Country),
(craft:Aircraft)-[:OWNED_BY]->(a),
(craft)-[:SERIES]->(:Series)-[:MODEL]->(m:Model)
WHERE craft.status = 'Active' AND m.name = 'Boeing 747'
WITH a, COUNT(m) AS count, c
ORDER BY count DESC
RETURN a.name AS Airline, c.name AS Country, count AS `Number of Boeing 747 Owned`
LIMIT 10
Result
Airline | Country | Number of Boeing 747 Owned |
---|---|---|
British Airways | United Kingdom | 48 |
Korean Air | South Korea | 36 |
Cathay Pacific | Hong-Kong | 33 |
Lufthansa | Germany | 32 |
China Airlines | Taiwan | 31 |
KLM | Netherlands | 26 |
United Airlines | USA | 24 |
Atlas Air | USA | 22 |
Transaero Airlines | Russia | 20 |
Cargolux | Luxembourg | 19 |
Which Embraer models have the most series of planes? What are they?
Query
MATCH (s:Series)-[:MODEL]->(m:Model)
WHERE m.name =~ 'Embraer.*'
WITH m.name AS Model, COLLECT(s.name) AS Series
RETURN Model, Series
ORDER BY LENGTH(Series) DESC
Result
Model | Series |
---|---|
Embraer 135/145 | 145SM, 145RS, 145EW, 145MP, 145BJ, 135ER, 135EJ, 135LR, 135, 145MR, 145LI, 145XR, 145ER, 145EP, 145LR, 145LU, 145SA, 135BK, 145EU, 145EJ, 135BJ, 140LR |
Embraer 190/195 | 195STD, 190SLR, 190ECJ, 190BJ, 195LR, 190IGW, 195AR, 195SR, 190ER, 190, 195IGW, 190AR, 190LR, 190STD, 190SR |
Embraer 170/175 | 175SR, 175SU, 170SL, 170SE, 170STD, 175SD, 175LR, 170LR, 170SU, 175STD, 170ST, 175IGW, 170AR, 170SR |
Embraer 120 Brasilia | 120ER, 120QC, 120, 120RT |
Dallas/Forth Worth International Airport
What is related, and how?
Gate IN_TERMINAL Terminal
Place IN_CATEGORY Category
Place AT_GATE Gate
At which gates in terminal A can I find a Starbucks?
Query
MATCH (p:Place)-[:AT_GATE]->(g:Gate),
(g)-[:IN_TERMINAL]->(t:Terminal)
WHERE p.name = 'Starbucks Coffee' AND t.name = 'A'
RETURN g.gate AS Gate, t.name AS Terminal
ORDER BY Gate
Result
Gate | Terminal |
---|---|
29 | A |
33 | A |
37 | A |
Where are places in terminal D in categories 'Bar' or 'Barbecue'?
Query
MATCH (p:Place)-[:IN_CATEGORY]->(c:Category)
WHERE c.name IN ['Bar', 'Barbecue']
WITH p
MATCH (p)-[:AT_GATE]->(g:Gate),
(g)-[:IN_TERMINAL]->(t:Terminal)
WHERE t.name = 'D'
RETURN p.name AS Place, g.gate AS Gate
ORDER BY Gate
Result
Place | Gate |
---|---|
Railhead BBQ | 18 |
Tigin Irish Pub & Restaurant | 20 |
Cousin's Bar-B-Q | 28 |
Gas Monkey Bar & Grill | 31 |
Caltrain
What is related, and how?
Stop IN Zone
Stop NEXT Stop
Train STOPS_AT Stop
Which consecutive stops are furthest apart?
Query
MATCH (s1:Stop)-[r:NEXT]->(s2:Stop)
WITH s1, s2, r.distance AS Distance
ORDER BY Distance DESC
LIMIT 1
RETURN s1.name AS Stop1, s2.name AS Stop2, Distance
Result
Stop1 | Stop2 | Distance |
---|---|---|
BLOSSOM HILL | MORGAN HILL | 11.8 |
Which trains can I take home to San Mateo from San Francisco after the hackathon tonight?
Query
MATCH (t:Train)
WHERE t.direction = 'Southbound' AND t.type = 'Weekday'
WITH t
MATCH (t)-[leave:STOPS_AT]->(begin:Stop),
(t)-[:STOPS_AT]->(end:Stop)
WHERE begin.name = 'SAN FRANCISCO' AND
end.name = 'SAN MATEO'
WITH t, leave.hour*100 + leave.minute AS minutes, leave.hour_s AS Hour, leave.minute_s AS Minute
WHERE minutes > 2100 OR minutes < 100
RETURN t.id AS Train, Hour + ':' + Minute AS `Departure Time`
Result
Train | Departure Time |
---|---|
194 | 21:40 |
196 | 22:40 |
198 | 00:01 |
Music
Last.fm
What is related, and how?
Scrobble PLAYED Track
Scrobble NEXT Scrobble
User SCROBBLED Scrobble
Artist MEMBER_OF Genre
Track SUNG_BY Artist
Which artists do users 'nmwhite0131' and 'smooligans' mutually listen to most often?
Query
MATCH (u1:User {username:'nmwhite0131'}), (u2:User {username:'smooligans'})
WITH u1, u2
MATCH (u1)-[:SCROBBLED]->(:Scrobble)-[:PLAYED]->(:Track)-[:SUNG_BY]->(a:Artist),
(u2)-[:SCROBBLED]->(:Scrobble)-[:PLAYED]->(:Track)-[:SUNG_BY]->(a)
RETURN a.name AS Artist, COUNT(*) AS Count
ORDER BY Count DESC
LIMIT 10
Result
Artist | Count |
---|---|
Drake | 15019 |
Beyoncé | 6869 |
Lana Del Rey | 3723 |
Jay-Z | 2967 |
Nicki Minaj | 2624 |
Lil' Wayne | 2242 |
Rihanna | 1911 |
Daft Punk | 1560 |
Explosions in the Sky | 1165 |
Tyga | 663 |
User nmwhite0131's last five scrobbles.
Query
MATCH (:User {username:'nmwhite0131'})-[:SCROBBLED]->(last:Scrobble),
recent = (:Scrobble)-[:NEXT*4]->(last)
WHERE NOT ((last)-[:NEXT]->(:Scrobble))
WITH NODES(recent) AS scrobbles
UNWIND scrobbles AS s
MATCH (s)-[:PLAYED]->(t:Track)-[:SUNG_BY]->(a:Artist)
RETURN s.date AS Date, t.name AS Track, a.name AS Artist
Result
Date | Track | Artist |
---|---|---|
2014-09-24 | Easy | Saycet |
2014-09-24 | Bruyère | Saycet |
2014-09-24 | Opal | Saycet |
2014-09-24 | Her Movie | Saycet |
2014-09-24 | We Walk Fast | Saycet |
Social
What is related, and how?
Tweet USING Source
Tweet CONTAINS Link
Tweet MENTIONS User
Tweet RETWEETS Tweet
User POSTS Tweet
Tweet REPLY_TO Tweet
Hashtag TAGS Tweet
How are hashtags 'java' and 'graph' related?
Query
MATCH (java:Hashtag {name:'java'}), (graph:Hashtag {name:'graph'})
WITH java, graph
MATCH p = shortestPath((java)-[:TAGS*]-(graph))
RETURN p
Result
What is the most retweeted tweet, who posted it, and what did it say?
Query
MATCH (:Tweet)-[:RETWEETS]->(t:Tweet)
WITH t, COUNT(*) AS count
ORDER BY count DESC
LIMIT 1
MATCH (u:User)-[:POSTS]->(t)
RETURN u.screen_name AS User, t.text AS Tweet, count AS Retweets
Result
User | Tweet | Retweets |
---|---|---|
println_ts | A twist on how to do login & registration form on #Neo4J's website. #ux #ui http://t.co/CmDIfkSDT3 http://t.co/fjtSik59Qa | 5 |
I wanted to replicate this on my local machine. Where is the dataset? Where are the import queries?