= Untying the Graph Database Import Knot =

This GraphGist accompanies http://blog.bruggen.com/2013/12/untying-graph-database-import-knot.html[my blog post of December 6th, 2013], which explains the questions you should ask yourself when planning to import data into http://www.neo4j.org[Neo4j], and the tools that can help you find the best import strategy for your specific use case.
//setup
//hide
[source,cypher]
----
// Import tools
CREATE (n1:Tool {id:'1',name:'Spreadsheets'}),
(n2:Tool {id:'2',name:'Cypher Statements'}),
(n3:Tool {id:'3',name:'neo4j-shell-tools'}),
(n4:Tool {id:'4',name:'batch importer'}),
(n5:Tool {id:'5',name:'Talend'}),
(n6:Tool {id:'6',name:'Mulesoft'}),
(n7:Tool {id:'7',name:'Java API'}),
(n8:Tool {id:'8',name:'REST API'}),
(n9:Tool {id:'9',name:'Spring Data Neo4j'}),
// Tool categories
(n10:ToolCategory {id:'10',name:'Spreadsheets'}),
(n11:ToolCategory {id:'11',name:'Neo4j-shell'}),
(n12:ToolCategory {id:'12',name:'Command-line'}),
(n13:ToolCategory {id:'13',name:'ETL-tools'}),
(n14:ToolCategory {id:'14',name:'Custom Software'}),
// Question categories
(n100:QuestionCategory {id:'100',name:'Project'}),
(n110:QuestionCategory {id:'110',name:'Size'}),
(n120:QuestionCategory {id:'120',name:'Format'}),
(n130:QuestionCategory {id:'130',name:'Type'}),
// Project questions and their possible answers
(n101:ProjectQuestion {id:'101',name:'What exists already?'}),
(n102:ProjectQuestion {id:'102',name:'How much time do you have?'}),
(n103:ProjectQuestion {id:'103',name:'What skills do you have inhouse?'}),
(n1011:ProjectAnswer {id:'1011',name:'Greenfield'}),
(n1012:ProjectAnswer {id:'1012',name:'Brownfield'}),
(n1021:ProjectAnswer {id:'1021',name:'Time is not as important'}),
(n1022:ProjectAnswer {id:'1022',name:'Time is of the essence'}),
(n1031:ProjectAnswer {id:'1031',name:'Java Development'}),
(n1032:ProjectAnswer {id:'1032',name:'Other Development'}),
(n1033:ProjectAnswer {id:'1033',name:'DBA/Analyst'}),
// Answers for the Size, Format and Type categories
(n1101:SizeAnswer {id:'1101',name:'1000s'}),
(n1102:SizeAnswer {id:'1102',name:'100000s'}),
(n1103:SizeAnswer {id:'1103',name:'1000000s'}),
(n1201:FormatAnswer {id:'1201',name:'Database'}),
(n1202:FormatAnswer {id:'1202',name:'File-CSV'}),
(n1203:FormatAnswer {id:'1203',name:'File-Spreadsheet'}),
(n1204:FormatAnswer {id:'1204',name:'File-Geoff'}),
(n1205:FormatAnswer {id:'1205',name:'File-GraphML'}),
(n1206:FormatAnswer {id:'1206',name:'Service'}),
(n1301:TypeAnswer {id:'1301',name:'Bulk Load'}),
(n1302:TypeAnswer {id:'1302',name:'Incremental Load'}),
(n1303:TypeAnswer {id:'1303',name:'Bulk Load + Incremental Load'}),
// Each tool belongs to a tool category
(n1)-[:BELONGS_TO]->(n10),
(n2)-[:BELONGS_TO]->(n11),
(n3)-[:BELONGS_TO]->(n11),
(n4)-[:BELONGS_TO]->(n12),
(n5)-[:BELONGS_TO]->(n13),
(n6)-[:BELONGS_TO]->(n13),
(n7)-[:BELONGS_TO]->(n14),
(n8)-[:BELONGS_TO]->(n14),
(n9)-[:BELONGS_TO]->(n14),
// Questions belong to a category, answers belong to a question or category
(n101)-[:BELONGS_TO]->(n100),
(n102)-[:BELONGS_TO]->(n100),
(n103)-[:BELONGS_TO]->(n100),
(n1011)-[:BELONGS_TO]->(n101),
(n1012)-[:BELONGS_TO]->(n101),
(n1021)-[:BELONGS_TO]->(n102),
(n1022)-[:BELONGS_TO]->(n102),
(n1031)-[:BELONGS_TO]->(n103),
(n1032)-[:BELONGS_TO]->(n103),
(n1033)-[:BELONGS_TO]->(n103),
(n1101)-[:BELONGS_TO]->(n110),
(n1102)-[:BELONGS_TO]->(n110),
(n1103)-[:BELONGS_TO]->(n110),
(n1201)-[:BELONGS_TO]->(n120),
(n1202)-[:BELONGS_TO]->(n120),
(n1203)-[:BELONGS_TO]->(n120),
(n1204)-[:BELONGS_TO]->(n120),
(n1205)-[:BELONGS_TO]->(n120),
(n1206)-[:BELONGS_TO]->(n120),
(n1301)-[:BELONGS_TO]->(n130),
(n1302)-[:BELONGS_TO]->(n130),
(n1303)-[:BELONGS_TO]->(n130),
// Which tools and tool categories suit which answers
(n14)-[:SUITED_FOR]->(n1011),
(n4)-[:SUITED_FOR]->(n1022),
(n1)-[:SUITED_FOR]->(n1101),
(n1)-[:SUITED_FOR]->(n1102),
(n3)-[:SUITED_FOR]->(n1103),
(n12)-[:SUITED_FOR]->(n1103),
(n13)-[:SUITED_FOR]->(n1201),
(n14)-[:SUITED_FOR]->(n1201),
(n4)-[:SUITED_FOR]->(n1202),
(n3)-[:SUITED_FOR]->(n1202),
(n3)-[:SUITED_FOR]->(n1204),
(n3)-[:SUITED_FOR]->(n1205),
(n14)-[:SUITED_FOR]->(n1206),
(n3)-[:SUITED_FOR]->(n1301),
(n4)-[:SUITED_FOR]->(n1301),
(n14)-[:SUITED_FOR]->(n1302),
(n14)-[:SUITED_FOR]->(n1303);
----
//graph

So let's look at some of the `questions` that you should ask yourself when preparing for an import:

[source,cypher]
----
MATCH (n:QuestionCategory)<-[r:BELONGS_TO]-(m) RETURN m.name AS Question,type(r),n.name AS QuestionCategory;
----
//table
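
In this dataset only the Project category has explicit question nodes; the Size, Format and Type categories link directly to their answers. As a small sketch that is not part of the original blog post, the following query drills one level deeper and lists the possible answers for each project question. It assumes only the labels and relationship types created in the setup block:

[source,cypher]
----
// For every project question, collect the answers that belong to it
MATCH (c:QuestionCategory {name:'Project'})<-[:BELONGS_TO]-(q:ProjectQuestion)<-[:BELONGS_TO]-(a:ProjectAnswer)
RETURN q.name AS Question, collect(a.name) AS Answers;
----
//table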

And then look at some of the `tools` that are at your disposal:

[source,cypher]
----
MATCH (k:ToolCategory)<-[s:BELONGS_TO]-(l) RETURN l.name AS Tool,type(s),k.name AS ToolCategory;
----
//table
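
To see how the tools are spread over the categories, the same pattern can also be aggregated. This is only a sketch; the column names `NumberOfTools` and `Tools` are chosen here for illustration:

[source,cypher]
----
// Count and list the tools in each tool category
MATCH (k:ToolCategory)<-[:BELONGS_TO]-(l:Tool)
RETURN k.name AS ToolCategory, count(l) AS NumberOfTools, collect(l.name) AS Tools
ORDER BY NumberOfTools DESC;
----
//table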

Finally, let's take a look at `which tools are suited for which import use cases`:

[source,cypher]
----
MATCH (f:Tool)-[h:SUITED_FOR]->(g) RETURN f.name AS Tool,type(h),g.name AS ImportUseCase LIMIT 50;
----
//table
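
Note that some `SUITED_FOR` relationships in the setup data start at a `ToolCategory` node (Command-line, ETL-tools, Custom Software) rather than at an individual tool, so a pattern that starts at `(f:Tool)` does not list them. As a hedged sketch of how the graph could be used to shortlist tools, the query below takes three example answers (a multi-million-record bulk load from CSV files, chosen purely for illustration) and counts how many of them each tool matches, either directly or through its category. The optional `BELONGS_TO*0..1` hop is an assumption about how category-level suitability should be inherited; it is not something the blog post prescribes:

[source,cypher]
----
// Follow zero or one BELONGS_TO hop, so that both the tool itself
// and its category are checked for SUITED_FOR relationships
MATCH (t:Tool)-[:BELONGS_TO*0..1]->()-[:SUITED_FOR]->(a)
WHERE a.name IN ['1000000s','File-CSV','Bulk Load']
RETURN t.name AS Tool, count(DISTINCT a) AS Matches, collect(DISTINCT a.name) AS MatchedAnswers
ORDER BY Matches DESC;
----
//table

With the setup data above, neo4j-shell-tools and the batch importer should each match all three answers: the former directly, the latter partly through its Command-line category.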

I hope this was a useful addition to the blog post, and that you enjoyed it.

Rik

//console