@rvanbruggen
Last active August 29, 2015 13:57
= (Product) Hierarchy GraphGist =
This gist complements the http://blog.bruggen.com/2014/03/using-neo4j-to-manage-and-calculate.html[blog post that I wrote] about managing hierarchical data structures in http://www.neo4j.org[neo4j].
In this example, we are using a "product hierarchy", essentially holding information about the composition of a product (what it is made of, how many of each component are used, and, at the lowest level, what the price of these components is). The model looks like this:
image::http://1.bp.blogspot.com/-XIjEXWHpNmc/Uzbhuoo-9xI/AAAAAAABNWE/7zYyn3Vl3i0/s3200/Screen+Shot+2014-03-29+at+16.04.35.png[]
Note that in this graphgist I have cut the tree depth to 5 levels (product to costs) instead of the 6 in the blog post, and that I have also reduced the width of the tree to keep it manageable in a gist.
== Loading some data: a 5-level tree ==
First we have to load the data into the graph. This was a bit of work - but not difficult at all:
.Creating the top of the tree, the Product (just one in this case):
[source,cypher]
----
create (n1:PRODUCT {id:1});
----
.Then create the Cost Groups:
[source,cypher]
----
match (n1:PRODUCT) foreach (r in range(1,3) | create (n2:COST_GROUP {id:r})-[:PART_OF {quantity:round(rand()*100)}]->(n1) );
----
.Then add the Cost Types to the Cost Groups:
[source,cypher]
----
match (n2:COST_GROUP) foreach (r in range(1,5) | create (n3:COST_TYPE {id:r})-[:PART_OF {quantity:round(rand()*100)}]->(n2) );
----
.Then add the Cost Subtypes to the Cost Types:
[source,cypher]
----
match (n3:COST_TYPE) foreach (r in range(1,3) | create (n4:COST_SUBTYPE {id:r})-[:PART_OF {quantity:round(rand()*100)}]->(n3) );
----
.Then finally add the Costs to the Cost Subtypes:
[source,cypher]
----
match (n4:COST_SUBTYPE) foreach (r in range(1,5) | create (n5:COST {id:r,price:round(rand()*1000)})-[:PART_OF {quantity:round(rand()*100)}]->(n4) );
----
The actual graph then looks like this:
//graph
== Querying the hierarchy structure ==
Then we can do some easy queries. Let's check the structure of the hierarchy and the number of nodes:
[source,cypher]
----
match (n) return labels(n) as KindsOfNodes, count(n) as NrOfNodes;
----
This is what it looks like:
//table
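As an additional sanity check, a sketch like the one below counts the `PART_OF` relationships between each pair of adjacent levels. Assuming the load statements above ran exactly once against an empty database, we would expect 3, 15, 45 and 225 relationships going up from the Cost Groups, Cost Types, Cost Subtypes and Costs respectively. The column names are only illustrative.
[source,cypher]
----
//count the relationships between each pair of adjacent levels
match (child)-[r:PART_OF]->(parent)
return labels(child) as Child, labels(parent) as Parent, count(r) as NrOfRelationships
order by NrOfRelationships;
----
//table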
Now let's start manipulating the graph and do some interesting stuff. Let's calculate the price of the product at the top of this hierarchy by sweeping through the graph and multiplying the price of each cost by the quantities on each of the relationships along the way.
[source,cypher]
----
//calculating price based on full sweep of the tree
match (n1:PRODUCT {id:1})<-[r1]-(:COST_GROUP)<-[r2]-(:COST_TYPE)<-[r3]-(:COST_SUBTYPE)<-[r4]-(n5:COST)
return sum(r1.quantity*r2.quantity*r3.quantity*r4.quantity*n5.price) as PriceOfProduct;
----
//table
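The same sweep can also be written without spelling out every level. The sketch below is an alternative formulation, not the query from the blog post: it assumes that every relationship in the tree is a `PART_OF` relationship (as in the load statements above) and that the costs sit exactly 4 hops below the product. It multiplies the quantities along each path with `reduce`, and should return the same PriceOfProduct as the query above.
[source,cypher]
----
//same calculation, using a fixed-length path of 4 PART_OF hops
match p=(n1:PRODUCT {id:1})<-[:PART_OF*4]-(n5:COST)
return sum(reduce(q=1, r in relationships(p) | q*r.quantity)*n5.price) as PriceOfProduct;
----
//table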
== Optimising the calculation with intermediate price values at every level ==
But maybe we can do that more efficiently, by calculating intermediate prices for each of the levels in the hierarchy:
[source, cypher]
----
//calculate intermediate pricing
match (n4:COST_SUBTYPE)<-[r4]-(n5:COST)
with n4,sum(r4.quantity*n5.price) as Sum
set n4.price=Sum;
----
[source, cypher]
----
match (n3:COST_TYPE)<-[r3]-(n4:COST_SUBTYPE)
with n3,sum(r3.quantity*n4.price) as Sum
set n3.price=Sum;
----
[source, cypher]
----
match (n2:COST_GROUP)<-[r2]-(n3:COST_TYPE)
with n2,sum(r2.quantity*n3.price) as Sum
set n2.price=Sum;
----
[source, cypher]
----
match (n1:PRODUCT)<-[r1]-(n2:COST_GROUP)
with n1, sum(r1.quantity*n2.price) as Sum
set n1.price=Sum
return Sum;
----
//table
Then we can easily calculate the price of the product by just using the intermediate pricing, and scanning a MUCH smaller part of the graph:
[source, cypher]
----
match (n1:PRODUCT {id:1})<-[r1]-(n2:COST_GROUP)
return sum(r1.quantity*n2.price) as PriceOfProduct;
----
//table
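Because each Cost Group now carries its own price, it is also easy to see how much each one contributes to the product. A small sketch, assuming the intermediate prices above have been set (the column names are illustrative):
[source,cypher]
----
//break the product price down per cost group, using the stored intermediate prices
match (n1:PRODUCT {id:1})<-[r1:PART_OF]-(n2:COST_GROUP)
return n2.id as CostGroup, r1.quantity as Quantity, n2.price as PriceOfCostGroup, r1.quantity*n2.price as Contribution
order by Contribution desc;
----
//table
The Contribution values should add up to the PriceOfProduct returned above.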
We can check the accuracy by looking at a different level and verifying if we get the same result:
[source, cypher]
----
match (n1:PRODUCT {id:1})<-[r1]-(n2:COST_GROUP)<-[r2]-(n3:COST_TYPE)
return sum(r1.quantity*r2.quantity*n3.price) as PriceOfProduct;
----
//table
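Yay! That seems to confirm the theory!
== What if something changes in the hierarchy? ==
Now let's see what happens if we change the price of one of the costs at the bottom of the tree. The query below multiplies that price by 10 and then propagates the difference up the tree, scaled by the quantities along the path, instead of recalculating everything: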
[source,cypher]
----
//pick one cost and remember its old price
match (n5:COST)
with n5, n5.price as OLDPRICE limit 1
//multiply its price by 10 and work out the difference
set n5.price = n5.price*10
with n5.price-OLDPRICE as PRICEDIFF,n5
//walk up to the product and add the difference, scaled by the quantities along the path
match (n5)-[r4:PART_OF]->(n4:COST_SUBTYPE)-[r3:PART_OF]->(n3:COST_TYPE)-[r2:PART_OF]->(n2:COST_GROUP)-[r1:PART_OF]->(n1:PRODUCT)
set n4.price=n4.price+(PRICEDIFF*r4.quantity),
n3.price=n3.price+(PRICEDIFF*r4.quantity*r3.quantity),
n2.price=n2.price+(PRICEDIFF*r4.quantity*r3.quantity*r2.quantity),
n1.price=n1.price+(PRICEDIFF*r4.quantity*r3.quantity*r2.quantity*r1.quantity)
return PRICEDIFF as PriceDifference, n1.price as NewPriceOfProduct;
----
//table
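To verify that the incremental update left the tree consistent, one possible check is to recompute every intermediate price from its direct children and list the nodes where the stored price differs. This is only a sketch, assuming the intermediate prices were set as described above; if the change was propagated correctly, it should return no rows.
[source,cypher]
----
//recompute each parent's price from its children and report any mismatch
match (parent)<-[r:PART_OF]-(child)
with parent, sum(r.quantity*child.price) as Recomputed
where parent.price <> Recomputed
return labels(parent) as KindOfNode, parent.id as Id, parent.price as StoredPrice, Recomputed;
----
//table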
Then we can also go back and replay the queries above and see what has happened in the console below:
//console
== Conclusion ==
I hope this gist complements the blogpost and gives you some ideas around how to work with any kind of hierarchy using neo4j.
This gist was created by link:mailto:rik@neotechnology.com[Rik Van Bruggen]
* link:http://blog.bruggen.com[My Blog]
* link:http://twitter.com/rvanbruggen[On Twitter]
* link:http://be.linkedin.com/in/rikvanbruggen/[On LinkedIn]