## cg-meta-graph.adoc

      
A Simple Meta-Data Model for a Graph Database (Ver. 1.1)


Original by: Curt Gardner
v1.0, 30-Sep-2013


Forked by: Jim Salmons
v1.1, 06-Dec-2013 - Same basic document but leveraging Neo4j 2.x labels and tweaking queries accordingly. The most dramatic result of this is the elimination of the need for the explicit 'Admin' elements and all things related to the 'owns' relationship.


Note


Curt’s explanatory comments and diagrams from the 1.0 version of this GG do NOT sync with the refactoring I’ve implemented here. You can mentally replace Curt’s explicit 'ownership-related sets of nodes' idiom with subsets based on 2.0’s new label feature. That’s what you’ll see in the tweaked queries. The most visible result of using this important new feature in Neo4j is that the queries from Curt’s original GG produce results that are ONLY about the structure of the data to be contained in the "regular" database. There are no more 'Admin' nodes nor 'owns' relationships that are really elements specific to the embedded metamodel subgraph of a 'self-descriptive' Neo4j database.


Setting up a Meta-Data Framework


This GraphGist is a quick exploration of a simple meta-data administration scheme that can
be used to store the structure of the nodes and relationships in a graph database like Neo4j.


The data that is set up here could be used in an application layer to provide tailored
UI and validation, or could be used simply with Cypher queries as a kind of meta-data
dictionary.  One may argue that this is pushing too much structure into a graph database,
but I think the concept is worth exploring.  How well can you query a graph database if
you’re not sure of the structure of the data?


This is a fairly quick stab at this, and it could certainly be taken much further to
include more about properties, security aspects and much more.  Mostly it was a learning exercise for me,
and hopefully will generate some thoughtful criticism.  Hopefully I haven’t made too many egregious
mistakes!


The basic concept is that each node in the graph will have a NodeType, and that NodeType
will be represented itself as an 'admin' node.  Likewise each relationship will have a RelType and
that RelType will also be represented as an 'admin' Node.  Then we can further identify for
each RelType what types of nodes it can be used with for both the Start and End.  Conceptually
the 'admin' model looks like this:


To actually realize the concept, I came up with the following model of nodes and relationships.  (Note that in
the diagrams, the node name is shown first, with the node’s NodeType below in square brackets).


Setting up new Meta-Data for an application


Once the admin infrastructure is in place, setup for any new desired NodeTypes and
RelTypes can be done.  In this example, assume we will have Person and Date nodes,
and will need to be able to create relationships to support capturing a Date hierarchy,
marriage ties, and birthdates.


The necessary setup will involve the creation of two new AdminNodeTypes (Person and
Date), three new AdminRelTypes (DateIn, Spouse, and Birthdate), and the relationships
necessary to link them together:


The Node Type Owner Owns Person and Date


The Rel Type Owner Owns DateIn, Spouse, and Birthdate


For DateIn, the StartNodeType is Date, and the EndNodeType is Date


For Spouse, the StartNodeType is Person, and the EndNodeType is Person


For Birthdate, the StartNodeType is Person, and the EndNodeType is Date


The result looks like this:


//All data Admin setup

// CREATE (ntOwner:META:MODEL:DEPRECATED {name:'Node Type Owner', descr:'Owns all Node Types'})
// CREATE (rtOwner:META:MODEL:DEPRECATED {name:'Rel Type Owner', descr:'Owns all Rel Types'})
// CREATE (admin:META:MODEL:DEPRECATED   {name:'Admin'})

// I think these will be unnecessary
// CREATE (adminNT:META:MODEL:NODE {name:'NodeType'})
// CREATE (adminRT:META:MODEL:NODE {name:'RelType'})

// TO BE DEPRECATED BY USING LABELED SETS
// CREATE (owns:META:MODEL:RELATIONSHIP    {name:'Owns', descr:'Owns'})
// CREATE (startNT:META:MODEL:RELATIONSHIP {name:'StartNodeType'})
// CREATE (endNT:META:MODEL:RELATIONSHIP   {name:'EndNodeType'})

// TO BE DEPRECATED BY USING LABELED SETS
// CREATE rtOwner-[:Owns]->owns
// CREATE rtOwner-[:Owns]->startNT
// CREATE rtOwner-[:Owns]->endNT
// CREATE ntOwner-[:Owns]->admin
// CREATE ntOwner-[:Owns]->adminNT
// CREATE ntOwner-[:Owns]->adminRT

// CREATE owns-[:StartNodeType]->admin
// CREATE owns-[:EndNodeType]->admin
// CREATE owns-[:EndNodeType]->adminNT
// CREATE owns-[:EndNodeType]->adminRT

// CREATE startNT-[:StartNodeType]->adminRT
// CREATE startNT-[:EndNodeType]->adminNT
// CREATE endNT-[:StartNodeType]->adminRT
// CREATE endNT-[:EndNodeType]->adminNT

// This is what we're really after...
//
CREATE (person:META:MODEL:NODE {name:'Person'})
CREATE (date:META:MODEL:NODE   {name:'Date'})

// TO BE DEPRECATED BY USING LABELED SETS
// CREATE ntOwner-[:Owns]->person
// CREATE ntOwner-[:Owns]->date

CREATE (spouse:META:MODEL:RELATIONSHIP    {name:'Spouse'})
CREATE (dateIn:META:MODEL:RELATIONSHIP    {name:'DateIn'})
CREATE (birthdate:META:MODEL:RELATIONSHIP {name:'Birthdate'})

// TO BE DEPRECATED BY USING LABELED SETS
// CREATE rtOwner-[:Owns]->spouse
// CREATE rtOwner-[:Owns]->dateIn
// CREATE rtOwner-[:Owns]->birthdate

CREATE (spouse)-[:StartNodeType]->(person)
CREATE (spouse)-[:EndNodeType]->(person)
CREATE (dateIn)-[:StartNodeType]->(date)
CREATE (dateIn)-[:EndNodeType]->(date)
CREATE (birthdate)-[:StartNodeType]->(person)
CREATE (birthdate)-[:EndNodeType]->(date)
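
The setup above follows a regular pattern: one node per NodeType, one node per RelType, and one StartNodeType and one EndNodeType link per RelType. As a rough sketch of how an application layer might generate those statements from a compact description, here is a small Python helper. The names `REL_SPEC`, `var` and `setup_statements` are hypothetical and not part of the original GraphGist. The helper writes node identifiers in parentheses, the form newer Cypher versions require, and it interpolates names straight into the query text, so it is only suitable for trusted input.

```python
# Hypothetical compact spec: RelType name -> (StartNodeType, EndNodeType).
REL_SPEC = {
    "Spouse": ("Person", "Person"),
    "DateIn": ("Date", "Date"),
    "Birthdate": ("Person", "Date"),
}


def var(name):
    """Cypher identifier for a metamodel node: the name with a lower-case first letter."""
    return name[0].lower() + name[1:]


def setup_statements(rel_spec):
    """Return the CREATE statements for the metamodel described by rel_spec."""
    # Every node type mentioned as a start or end point, in alphabetical order.
    node_types = sorted({nt for pair in rel_spec.values() for nt in pair})
    stmts = [
        f"CREATE ({var(nt)}:META:MODEL:NODE {{name:'{nt}'}})" for nt in node_types
    ]
    for rel in rel_spec:
        stmts.append(f"CREATE ({var(rel)}:META:MODEL:RELATIONSHIP {{name:'{rel}'}})")
    for rel, (start, end) in rel_spec.items():
        stmts.append(f"CREATE ({var(rel)})-[:StartNodeType]->({var(start)})")
        stmts.append(f"CREATE ({var(rel)})-[:EndNodeType]->({var(end)})")
    return stmts


if __name__ == "__main__":
    print("\n".join(setup_statements(REL_SPEC)))
```

Adding a new RelType then means adding one entry to the spec rather than hand-writing three CREATE lines.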


Now some sample queries using this data



Get all valid NodeTypes


MATCH (n:META:MODEL:NODE)
RETURN n.name AS NodeType
ORDER BY n.name


Get valid RelTypes for each NodeType


MATCH (r:META:MODEL:RELATIONSHIP)-[:StartNodeType]->(n)
RETURN n.name AS NodeType, collect(r.name) AS RelTypes
ORDER BY n.name


Get valid Start NodeTypes for each RelType


MATCH (r:META:MODEL:RELATIONSHIP)-[:StartNodeType]->(n)
RETURN r.name AS RelType, collect(n.name) AS StartNodeTypes
ORDER BY r.name


Get valid End NodeTypes for each RelType


MATCH (r:META:MODEL:RELATIONSHIP)-[:EndNodeType]->(n)
RETURN r.name AS RelType, collect(n.name) AS EndNodeTypes
ORDER BY r.name
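
To make the grouping behind these queries concrete, here is a minimal Python sketch that mirrors the Person/Date metamodel as a list of triples and reproduces the three collect-style results. `META_EDGES` and both function names are assumptions made for illustration. The lists are sorted so the output is deterministic, whereas Cypher's collect() makes no ordering promise.

```python
from collections import defaultdict

# Assumed in-memory mirror of the metamodel: (RelType, role, NodeType) triples.
META_EDGES = [
    ("Spouse", "StartNodeType", "Person"),
    ("Spouse", "EndNodeType", "Person"),
    ("DateIn", "StartNodeType", "Date"),
    ("DateIn", "EndNodeType", "Date"),
    ("Birthdate", "StartNodeType", "Person"),
    ("Birthdate", "EndNodeType", "Date"),
]


def rel_types_by_node_type(edges):
    """RelTypes that may start at each NodeType (second query above)."""
    grouped = defaultdict(list)
    for rel, role, node in edges:
        if role == "StartNodeType":
            grouped[node].append(rel)
    return {node: sorted(rels) for node, rels in sorted(grouped.items())}


def node_types_by_rel_type(edges, role):
    """NodeTypes allowed in the given role for each RelType (third and fourth queries)."""
    grouped = defaultdict(list)
    for rel, edge_role, node in edges:
        if edge_role == role:
            grouped[rel].append(node)
    return {rel: sorted(nodes) for rel, nodes in sorted(grouped.items())}


if __name__ == "__main__":
    print(rel_types_by_node_type(META_EDGES))
    print(node_types_by_rel_type(META_EDGES, "StartNodeType"))
    print(node_types_by_rel_type(META_EDGES, "EndNodeType"))
```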


I did not explicitly connect each node to its NodeType via a relationship; rather, it's just an implicit tie using the 'type' property on the node.  I'm not sure whether there would be a benefit to using a relationship…


Variations of these queries can be used to validate Nodes, and particularly Relationships, to ensure that they are playing by the rules!  I’ve built a simple version of a generic UI (HTML/JavaScript) for nodes and relationships using PHP for all database access and validation.
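
The kind of check described here could look something like the following Python sketch. It is not the PHP validation mentioned above, which is not shown in this GraphGist. `RULES` and `validate_relationship` are hypothetical names, and the rule table is assumed to have been loaded from the StartNodeType and EndNodeType queries.

```python
# Assumed mirror of the metamodel: RelType -> allowed start and end NodeTypes.
RULES = {
    "Spouse": {"start": {"Person"}, "end": {"Person"}},
    "DateIn": {"start": {"Date"}, "end": {"Date"}},
    "Birthdate": {"start": {"Person"}, "end": {"Date"}},
}


def validate_relationship(rel_type, start_type, end_type, rules=RULES):
    """Return a list of rule violations; an empty list means the relationship is allowed."""
    rule = rules.get(rel_type)
    if rule is None:
        return [f"unknown RelType: {rel_type}"]
    problems = []
    if start_type not in rule["start"]:
        problems.append(f"{rel_type} cannot start at {start_type}")
    if end_type not in rule["end"]:
        problems.append(f"{rel_type} cannot end at {end_type}")
    return problems


if __name__ == "__main__":
    print(validate_relationship("Birthdate", "Person", "Date"))
    print(validate_relationship("Birthdate", "Date", "Person"))
```

Start and end types are checked independently, matching the metamodel, which records allowed start types and allowed end types separately rather than allowed pairs.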


End Curt’s Original GG


ADDED: List Relationship Constraints in the Metamodel


Note


Let’s add an altDate type node so the DateIn relationship can demonstrate more than one node type on its start and end points…


MATCH (d:META:MODEL:RELATIONSHIP  {name:'DateIn'})
CREATE (altDate:META:MODEL:NODE   {name:'AltDate'})
CREATE (d)-[:StartNodeType]->(altDate)
CREATE (d)-[:EndNodeType]->(altDate)


And now let’s look at a list of what Relationships are defined in our Metamodel and to which Nodes these Relationships can connect…


MATCH (nStart)<-[:StartNodeType]-(r:META:MODEL:RELATIONSHIP)-[:EndNodeType]->(nEnd)
RETURN collect(DISTINCT nStart.name) AS `From Node`, r.name AS Relationship, collect(DISTINCT nEnd.name) AS `To Node`
ORDER BY r.name
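
The DISTINCT in this query matters because the pattern matches every combination of one StartNodeType link and one EndNodeType link for a RelType. With AltDate added, DateIn has two start types and two end types, so it contributes four rows before aggregation. The Python sketch below imitates that behaviour on an assumed in-memory copy of the metamodel. `STARTS`, `ENDS` and the two functions are illustrative names only.

```python
from itertools import product

# Assumed mirror of the metamodel after AltDate has been added.
STARTS = {"DateIn": ["Date", "AltDate"], "Spouse": ["Person"], "Birthdate": ["Person"]}
ENDS = {"DateIn": ["Date", "AltDate"], "Spouse": ["Person"], "Birthdate": ["Date"]}


def matched_rows(starts, ends):
    """One (start, rel, end) row per start/end combination, as the MATCH pattern yields."""
    return [
        (s, rel, e)
        for rel in sorted(starts)
        for s, e in product(starts[rel], ends[rel])
    ]


def constraint_table(rows):
    """Collapse rows per RelType, keeping each start and end type once (the DISTINCT step)."""
    table = {}
    for s, rel, e in rows:
        from_nodes, to_nodes = table.setdefault(rel, ([], []))
        if s not in from_nodes:
            from_nodes.append(s)
        if e not in to_nodes:
            to_nodes.append(e)
    return table


if __name__ == "__main__":
    rows = matched_rows(STARTS, ENDS)
    print(len(rows), "rows before aggregation")
    for rel, (from_nodes, to_nodes) in constraint_table(rows).items():
        print(from_nodes, rel, to_nodes)
```

Dropping the membership checks in `constraint_table` would list Date and AltDate twice on each side of DateIn, which is what collect() without DISTINCT would return.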


Note


The StartNodeType and EndNodeType relationships do not show up here even though they are contained in the overall database. This is because these relationships exist to express relationships between nodes WITHIN the metamodel, not within the "regular" data of the self-describing database.


Note


We’ll explore the ideas Curt started exploring here in a follow-up GraphGist to be submitted as part of the Dec-Jan Domain Model GraphGist Challenge. In this follow-up GraphGist we’ll be exploring a use case related to the FactMiners social-game ecosystem which is part of The Softalk Apple Project (www.SoftalkApple.com).