## rdbmsImport
= SQL --> Neo4j

The main goal is to import data from a SQL database (or a similar RDBMS) into Neo4j using Cypher, while maintaining foreign keys as relationships. I use Cypher rather than the batch importer because I want a system that allows multiple imports and updates without batch processing.

//console

== Table Structure Requirements

1. Primary key column names should follow the pattern [TableName]+Id, so for a Contact table the primary key field should be named ContactId. This is to prevent overlap between node labels.


[source,cypher]
----
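// Sample data: each Contact carries two foreign keys, one per child table.
// round() returns a float, so toInteger() keeps the keys integers like the
// primary keys they point to (older releases name this function toInt()).
FOREACH (i IN range(0, 25) |
  CREATE (:Contact {ContactId: i,
                    ContactChild1Id: toInteger(round(rand() * (10 - 1))),
                    ContactChild2Id: toInteger(round(rand() * (30 - 1)))}))
FOREACH (i IN range(0, 10) |
  CREATE (:ContactChild1 {ContactChild1Id: i}))
FOREACH (i IN range(0, 30) |
  CREATE (:ContactChild2 {ContactChild2Id: i}))
// Each ContactParent row references one Contact through ContactId.
FOREACH (i IN range(0, 5) |
  CREATE (:ContactParent {ContactParentId: i,
                          ContactId: toInteger(round(rand() * (25 - 1)))}))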
----
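
The sample data above is generated with CREATE, which would duplicate every node if it were run a second time. For the repeatable imports and updates this project is aiming for, rows can be upserted with MERGE on the primary key instead. The following is only a sketch under assumptions: the two rows are hypothetical stand-ins for a result set read from the Contact table, and UNWIND requires a Neo4j release that supports it.

[source,cypher]
----
// Hypothetical rows standing in for a result set read from the Contact table.
UNWIND [
  {ContactId: 0,  ContactChild1Id: 3, ContactChild2Id: 12},
  {ContactId: 26, ContactChild1Id: 7, ContactChild2Id: 4}
] AS row
// MERGE on the primary key only, so a re-import finds the existing node.
MERGE (c:Contact {ContactId: row.ContactId})
// Overwrite the foreign key columns with the latest values from the source row.
SET c.ContactChild1Id = row.ContactChild1Id,
    c.ContactChild2Id = row.ContactChild2Id
RETURN c.ContactId AS ContactId, c.ContactChild1Id AS ContactChild1Id
----

Running this twice leaves one node per ContactId. Note that when a foreign key value changes, a relationship built from the old value stays in place until it is deleted explicitly.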


== Import Requirements

Definition: children are tables related to a parent table by a foreign key.

1. Primary Key properties should be set as unique constraints on the nodes
2. Children always relate to the parent
3. The Relationship Type should match the Parent Node Label
4. Store the node Primary Key properties as properties in the Relationship
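
Requirement 1 might be set up as shown here. This is a sketch, not the project's own setup: the constraint names are illustrative, and the `FOR ... REQUIRE` form assumes a recent Neo4j release (4.4 or 5). Older releases use the form `CREATE CONSTRAINT ON (n:Contact) ASSERT n.ContactId IS UNIQUE`.

[source,cypher]
----
// Constraint names (contact_pk, ...) are illustrative assumptions.
// Each statement is run on its own, separately from data-writing queries.
CREATE CONSTRAINT contact_pk IF NOT EXISTS
FOR (n:Contact) REQUIRE n.ContactId IS UNIQUE;

CREATE CONSTRAINT contact_child1_pk IF NOT EXISTS
FOR (n:ContactChild1) REQUIRE n.ContactChild1Id IS UNIQUE;

CREATE CONSTRAINT contact_child2_pk IF NOT EXISTS
FOR (n:ContactChild2) REQUIRE n.ContactChild2Id IS UNIQUE;

CREATE CONSTRAINT contact_parent_pk IF NOT EXISTS
FOR (n:ContactParent) REQUIRE n.ContactParentId IS UNIQUE;
----

A uniqueness constraint is backed by an index, so the primary key lookups in the MATCH and MERGE statements of the import no longer need to scan every node with that label.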


Match Contact to Child 1

[source,cypher]
----
MATCH (p:Contact), (c:ContactChild1 {ContactChild1Id:p.ContactChild1Id})
MERGE (c)-[:Contact {ContactId:p.ContactId,ContactChild1Id:c.ContactChild1Id}]->(p)
----
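
Because the relationship stores both keys (requirement 4), the foreign key pairs can be read back from the relationships alone. A small sketch that assumes the statement above has already run:

[source,cypher]
----
// Read the stored key pair from each Contact relationship coming from a ContactChild1 node.
MATCH (:ContactChild1)-[r:Contact]->(:Contact)
RETURN r.ContactId AS ContactId, r.ContactChild1Id AS ContactChild1Id
ORDER BY ContactId
LIMIT 5
----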

Match Contact to Child 2

[source, cypher]
----
MATCH (p2:Contact), (c2:ContactChild2 {ContactChild2Id:p2.ContactChild2Id})
MERGE (c2)-[:Contact {ContactId:p2.ContactId,ContactChild2Id:c2.ContactChild2Id}]->(p2)
----

Match ContactParent to Contact

[source, cypher]
----
MATCH (cp:ContactParent), (cc:Contact {ContactId:cp.ContactId})
MERGE (cc)-[:ContactParent {ContactParentId:cp.ContactParentId,ContactId:cc.ContactId}]->(cp)
----
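
After the MERGE statements have run, it can be useful to check that every foreign key actually resolved to a relationship. One possible check, shown here for ContactChild1 only and offered as a sketch rather than part of the import itself:

[source,cypher]
----
// List Contact nodes that did not get a relationship from any ContactChild1 node.
MATCH (p:Contact)
WHERE NOT (p)<-[:Contact]-(:ContactChild1)
RETURN p.ContactId AS ContactId, p.ContactChild1Id AS ContactChild1Id
----

An empty result means every Contact was linked. Any rows returned show a ContactChild1Id with no matching ContactChild1 node.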

Return all contacts and their children for the ContactParent with Id 1

[source, cypher]
----
MATCH path =(cc)-->(c:Contact)-->(p:ContactParent { ContactParentId:1 })
RETURN p, collect(DISTINCT c), collect(cc)
LIMIT 5
----

Any thoughts on how to expand this, or on how to improve performance when batching, are appreciated.