@totetmatt
Last active August 29, 2015 13:56
:neo4j-version: 2.0.0
:author: Matthieu Totet
:twitter: @totetmatt
:tags: Oreilly, Books, Media
= O'Reilly Book Graph
== Dat Model
Data scraped from the O'Reilly website: http://www.oreilly.com/
----
(:Book)-[:AUTHOR]->(:Author)
(:Book)-[:CATEGORY]->(:Category)
(:Category)-[:HAS_SUBJECT]->(:Subject)
(:Book)-[:MEDIA {price}]->(:MediaType)
----
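As a rough illustration of the model above, this sketch creates a single book with one author, one category, one subject and one media type. Every name and the price are made up, and the labels are an assumption: the queries below only rely on the `name` property and on `price` on the `MEDIA` relationship. Run it in an empty database rather than in the downloaded one.
[source,cypher]
----
// Hypothetical sample data, not taken from the real dataset
CREATE (b:Book {name: "Some Book"}),
       (a:Author {name: "Some Author"}),
       (c:Category {name: "Some Category"}),
       (s:Subject {name: "Some Subject"}),
       (t:MediaType {name: "Ebook"}),
       (b)-[:AUTHOR]->(a),
       (b)-[:CATEGORY]->(c),
       (c)-[:HAS_SUBJECT]->(s),
       // The price lives on the relationship, not on the book
       (b)-[:MEDIA {price: 19.99}]->(t)
----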
== Dat Data
++++
<img src="http://cdn.memegenerator.net/instances/500x/45891161.jpg"/>
++++
I didn't manage to make it work on GraphGist...
So you have to download this file: http://matthieu-totet.fr/Neoreilly.zip. It's a Neo4j database with all the data already loaded.
Unzip it and run the bin/neo4j script to start the DB.
Go to http://localhost:7474/webadmin and copy and paste the queries below to test them. (http://localhost:7474/browser/ won't work because of its 1000-result limit.)
The zip also contains the original oreilly.geoff file that I used to import the data into Neo4j.
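Once the DB is running, a quick sanity check can confirm that the data is there. The sketch below counts relationships by type. If the database matches the model above, it should list `AUTHOR`, `CATEGORY`, `HAS_SUBJECT` and `MEDIA`, but that is an assumption and the counts themselves are not known here.
[source,cypher]
----
// Count relationships per type to check that the data is loaded
MATCH ()-[r]->()
RETURN type(r), count(*)
ORDER BY count(*) DESC
----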
== Query Time!
=== Get the average price & number of media, grouped by Subject
[source,cypher]
----
MATCH (s)<-[:HAS_SUBJECT]-(c)<-[l:CATEGORY]-(n)-[r:MEDIA]->(m)
return AVG(r.price),count(*),s.name
ORDER BY AVG(r.price) DESC
----
=== Get the average price & number of media, grouped by Category and MediaType
[source,cypher]
----
MATCH (c)<-[l:CATEGORY]-(n)-[r:MEDIA]->(m)
return AVG(r.price),count(*),m.name,c.name
ORDER BY c.name,AVG(r.price) DESC
----
=== Get the number of authors per media
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-(m)
return count(*),m.name
ORDER BY count(*) DESC
----
=== Get number of media per author
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-(m)
return count(*),a1.name
ORDER BY count(*) DESC
----
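A possible variant of the query above keeps only the ten most prolific authors. Capping the output with `LIMIT` is one way to keep the result small. Whether that is enough to make the query usable in http://localhost:7474/browser/ is an assumption and has not been checked against this database. The `nbMedia` alias is introduced here for readability.
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-(m)
RETURN a1.name, count(*) AS nbMedia
ORDER BY nbMedia DESC
// Keep only the first ten rows
LIMIT 10
----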
=== Get all authors and all the collaborations
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-()-[:AUTHOR]->(a2)
WHERE a1 <> a2
return a1.name,collect(a2.name)
----
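Two authors who wrote several books together would appear several times in the collected list above. A minimal sketch that removes those duplicates and ranks authors by number of distinct collaborators could look like this (the `collaborators` and `nbCollaborators` aliases are illustrative names):
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-()-[:AUTHOR]->(a2)
WHERE a1 <> a2
// DISTINCT keeps each co-author once, however many books they share
RETURN a1.name,
       collect(DISTINCT a2.name) AS collaborators,
       count(DISTINCT a2) AS nbCollaborators
ORDER BY nbCollaborators DESC
----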
=== Get people who worked with "David Pogue"
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-()-[:AUTHOR]->(a2)
WHERE a1 <> a2 and a1.name ="David Pogue"
return a2.name
----
=== ... and see in which categories they worked together
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-(m)-[:AUTHOR]->(a2),(m)-[:CATEGORY]->(c)
WHERE a1 <> a2 and a1.name ="David Pogue"
return a2.name,c.name
----
=== Get the average price per media type for all Authors
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-(m)-[t:MEDIA]->(ty)
return a1.name,avg(t.price),ty.name
ORDER BY avg(t.price) DESC
----
=== Get books about Certification with at least 2 authors that are either a Video or a non-Ebook under 50€
[source,cypher]
----
MATCH (a1)<-[:AUTHOR]-(m)-[t:MEDIA]->(ty),(s)<-[:HAS_SUBJECT]-(c)<-[l:CATEGORY]-(m)
// Count the distinct authors of each book before filtering
WITH count(DISTINCT a1) AS nbAuthor, m, t, ty, c, s
WHERE (ty.name = "Video" OR (t.price < 50 AND ty.name <> "Ebook" AND ty.name <> "PrintandEbook"))
AND s.name = "Certification"
AND nbAuthor >= 2
RETURN DISTINCT m.name, t.price, ty.name, nbAuthor
----
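The key step in the query above is aggregating in `WITH` and then filtering on the aggregate in `WHERE`. Stripped of the media and subject conditions, the same pattern can be sketched on its own to list every book with at least 2 authors:
[source,cypher]
----
MATCH (a)<-[:AUTHOR]-(m)
// Group by book and count its distinct authors
WITH m, count(DISTINCT a) AS nbAuthor
// Filter on the aggregated value, which a plain WHERE after MATCH cannot do
WHERE nbAuthor >= 2
RETURN m.name, nbAuthor
ORDER BY nbAuthor DESC
----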
=== Do your own query
[source]
----
(you)-[:RELEASE]->(creativity)
----