@jexp
Forked from rvanbruggen/importknot.cql
Created February 6, 2014 19:41
= Untying the Graph Database Import Knot =
This GraphGist accompanies http://blog.bruggen.com/2013/12/untying-graph-database-import-knot.html[my blog post of December 6th, 2013], which tries to explain the questions you should ask yourself when importing data into http://www.neo4j.org[Neo4j], and the tools that can help you find the best import strategy for your specific use case.
//setup
//hide
[source,cypher]
----
create (n1:Tool {id:'1',name:'Spreadsheets'}),
(n2:Tool {id:'2',name:'Cypher Statements'}),
(n3:Tool {id:'3',name:'neo4j-shell-tools'}),
(n4:Tool {id:'4',name:'batch importer'}),
(n5:Tool {id:'5',name:'Talend'}),
(n6:Tool {id:'6',name:'Mulesoft'}),
(n7:Tool {id:'7',name:'Java API'}),
(n8:Tool {id:'8',name:'REST API'}),
(n9:Tool {id:'9',name:'Spring Data Neo4j'}),
(n10:ToolCategory {id:'10',name:'Spreadsheets'}),
(n11:ToolCategory {id:'11',name:'Neo4j-shell'}),
(n12:ToolCategory {id:'12',name:'Command-line'}),
(n13:ToolCategory {id:'13',name:'ETL-tools'}),
(n14:ToolCategory {id:'14',name:'Custom Software'}),
(n100:QuestionCategory {id:'100',name:'Project'}),
(n110:QuestionCategory {id:'110',name:'Size'}),
(n120:QuestionCategory {id:'120',name:'Format'}),
(n130:QuestionCategory {id:'130',name:'Type'}),
(n101:ProjectQuestion {id:'101',name:'What exists already?'}),
(n102:ProjectQuestion {id:'102',name:'How much time do you have?'}),
(n103:ProjectQuestion {id:'103',name:'What skills do you have inhouse?'}),
(n1011:ProjectAnswer {id:'1011',name:'Greenfield'}),
(n1012:ProjectAnswer {id:'1012',name:'Brownfield'}),
(n1021:ProjectAnswer {id:'1021',name:'Time is not as important'}),
(n1022:ProjectAnswer {id:'1022',name:'Time is of the essence'}),
(n1031:ProjectAnswer {id:'1031',name:'Java Development'}),
(n1032:ProjectAnswer {id:'1032',name:'Other Development'}),
(n1033:ProjectAnswer {id:'1033',name:'DBA/Analyst'}),
(n1101:SizeAnswer {id:'1101',name:'1000s'}),
(n1102:SizeAnswer {id:'1102',name:'100000s'}),
(n1103:SizeAnswer {id:'1103',name:'1000000s'}),
(n1201:FormatAnswer {id:'1201',name:'Database'}),
(n1202:FormatAnswer {id:'1202',name:'File-CSV'}),
(n1203:FormatAnswer {id:'1203',name:'File-Spreadsheet'}),
(n1204:FormatAnswer {id:'1204',name:'File-Geoff'}),
(n1205:FormatAnswer {id:'1205',name:'File-GraphML'}),
(n1206:FormatAnswer {id:'1206',name:'Service'}),
(n1301:TypeAnswer {id:'1301',name:'Bulk Load'}),
(n1302:TypeAnswer {id:'1302',name:'Incremental Load'}),
(n1303:TypeAnswer {id:'1303',name:'Bulk Load + Incremental Load'}),
(n1)-[:BELONGS_TO]->(n10),
(n2)-[:BELONGS_TO]->(n11),
(n3)-[:BELONGS_TO]->(n11),
(n4)-[:BELONGS_TO]->(n12),
(n5)-[:BELONGS_TO]->(n13),
(n6)-[:BELONGS_TO]->(n13),
(n7)-[:BELONGS_TO]->(n14),
(n8)-[:BELONGS_TO]->(n14),
(n9)-[:BELONGS_TO]->(n14),
(n101)-[:BELONGS_TO]->(n100),
(n102)-[:BELONGS_TO]->(n100),
(n103)-[:BELONGS_TO]->(n100),
(n1011)-[:BELONGS_TO]->(n101),
(n1012)-[:BELONGS_TO]->(n101),
(n1021)-[:BELONGS_TO]->(n102),
(n1022)-[:BELONGS_TO]->(n102),
(n1031)-[:BELONGS_TO]->(n103),
(n1032)-[:BELONGS_TO]->(n103),
(n1033)-[:BELONGS_TO]->(n103),
(n1101)-[:BELONGS_TO]->(n110),
(n1102)-[:BELONGS_TO]->(n110),
(n1103)-[:BELONGS_TO]->(n110),
(n1201)-[:BELONGS_TO]->(n120),
(n1202)-[:BELONGS_TO]->(n120),
(n1203)-[:BELONGS_TO]->(n120),
(n1204)-[:BELONGS_TO]->(n120),
(n1205)-[:BELONGS_TO]->(n120),
(n1206)-[:BELONGS_TO]->(n120),
(n1301)-[:BELONGS_TO]->(n130),
(n1302)-[:BELONGS_TO]->(n130),
(n1303)-[:BELONGS_TO]->(n130),
(n14)-[:SUITED_FOR]->(n1011),
(n4)-[:SUITED_FOR]->(n1022),
(n1)-[:SUITED_FOR]->(n1101),
(n1)-[:SUITED_FOR]->(n1102),
(n3)-[:SUITED_FOR]->(n1103),
(n12)-[:SUITED_FOR]->(n1103),
(n13)-[:SUITED_FOR]->(n1201),
(n14)-[:SUITED_FOR]->(n1201),
(n4)-[:SUITED_FOR]->(n1202),
(n3)-[:SUITED_FOR]->(n1202),
(n3)-[:SUITED_FOR]->(n1204),
(n3)-[:SUITED_FOR]->(n1205),
(n14)-[:SUITED_FOR]->(n1206),
(n3)-[:SUITED_FOR]->(n1301),
(n4)-[:SUITED_FOR]->(n1301),
(n14)-[:SUITED_FOR]->(n1302),
(n14)-[:SUITED_FOR]->(n1303);
----
//graph
So let's look at some of the `questions` that you should ask yourself when preparing for an import:
[source,cypher]
----
MATCH (n:QuestionCategory)<-[r:BELONGS_TO]-(m) return m.name as Question,type(r),n.name as QuestionCategory;
----
//table
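In the setup data, only the `Project` category has intermediate question nodes; the `Size`, `Format` and `Type` categories link straight to their answers. As a rough sketch (the column aliases are my own choice, not part of the original gist), the following query walks one or two `BELONGS_TO` hops and keeps only the leaf nodes, so that every category is listed with its possible answers:
[source,cypher]
----
// Follow one or two BELONGS_TO hops up to each question category
MATCH (c:QuestionCategory)<-[:BELONGS_TO*1..2]-(a)
// Keep only leaves: nodes that nothing else belongs to (i.e. the answers)
WHERE NOT (a)<-[:BELONGS_TO]-()
RETURN c.name AS QuestionCategory, collect(a.name) AS Answers;
----
//table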
And then look at some of the `tools` that are at your disposal:
[source, cypher]
----
MATCH (k:ToolCategory)<-[s:BELONGS_TO]-(l) return l.name as Tool,type(s),k.name as ToolCategory;
----
//table
Finally, let's take a look at `which tools are suited for which import use cases`:
[source, cypher]
----
MATCH (f:Tool)-[h:SUITED_FOR]->(g) return f.name as Tool,type(h),g.name as ImportUseCase limit 50;
----
//table
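Note that the query above only matches `SUITED_FOR` relationships that start at a `Tool` node, while the setup data also attaches some of them to a `ToolCategory`. As a minimal sketch of how you could narrow things down to your own situation, the query below looks for tools that suit two answers at once, either directly or through their category. The two answer names are just an example picked from the setup data, and the `matched` alias is my own:
[source,cypher]
----
// *0..1 starts either at the tool itself (0 hops) or at its category (1 hop)
MATCH (t:Tool)-[:BELONGS_TO*0..1]->()-[:SUITED_FOR]->(a)
WHERE a.name IN ['File-CSV', 'Bulk Load']
// Keep only the tools that cover both requested answers
WITH t, count(DISTINCT a) AS matched
WHERE matched = 2
RETURN t.name AS Tool;
----
//table
With the data in this gist, that should leave you with `neo4j-shell-tools` and the `batch importer`.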
I hope this was a useful addition to the blog post, and that you enjoyed it.
Rik
//console