The following example imports distinct rows from a CSV file and creates relationships from that data.
This method is more efficient than using MERGE alone: duplicate rows in the CSV file are filtered out by WITH DISTINCT before any MERGE runs, so no matching work is spent on them. MERGE is still used to ensure that duplicate nodes are not created, for example when the same person appears in more than one distinct row or when the CSV file is loaded more than once.
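To gauge how much work the DISTINCT filter saves, the rows can be counted before importing. The query below is a minimal sketch, assuming the same CSV URL as the import script; the aliases totalRows, distinctRows, and duplicateRows are illustrative names and not part of the original example.

```cypher
// Sketch: compare the total row count with the distinct row count.
// The aliases are hypothetical names chosen for this illustration.
LOAD CSV WITH HEADERS FROM "https://dl.dropboxusercontent.com/u/2900504/people2.csv" AS line
WITH count(line) AS totalRows, count(DISTINCT line) AS distinctRows
RETURN totalRows, distinctRows, totalRows - distinctRows AS duplicateRows;
```

If duplicateRows is 0, WITH DISTINCT removes nothing and the import does the same matching work as a MERGE-only load. The import script itself follows.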
// Index Person.id so that the MERGE lookups below are fast.
CREATE INDEX ON :Person(id);

// Import the distinct rows, committing after every 1000 rows.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "https://dl.dropboxusercontent.com/u/2900504/people2.csv" AS line
WITH DISTINCT line
MERGE (followed:Person {id: toInt(line.followed_id)})
  ON CREATE SET followed.status = line.status, followed.created_at = line.created_at
  ON MATCH SET followed.status = line.status, followed.created_at = line.created_at
MERGE (follower:Person {id: toInt(line.follower_id)})
CREATE UNIQUE (follower)-[:Following]->(followed);

// List every person together with any outgoing relationship to another person.
MATCH (p1:Person)
OPTIONAL MATCH (p1)-[r]->(p2:Person)
RETURN p1.id AS Person1, type(r) AS Relationship, p2.id AS Person2;
CREATE INDEX ON :Person(id); creates an index on the id property of Person nodes, which makes matching on id faster.
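One way to check that the index is being used is to prefix a lookup with EXPLAIN and inspect the resulting plan. The following is only a sketch; the id value 1 is a hypothetical example and may not exist in the data set.

```cypher
// Sketch: show the execution plan for a lookup by id without running it.
// With the index in place, the plan would be expected to use an index seek
// rather than a scan over all Person nodes. The id value 1 is an assumption.
EXPLAIN
MATCH (p:Person {id: 1})
RETURN p.id, p.status, p.created_at;
```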
USING PERIODIC COMMIT 1000 commits the transaction after every 1000 rows, so that memory does not fill up before the results of the load are committed to the database.
The ON CREATE and ON MATCH clauses set the same properties. This way, if a Person node was first created because it appeared as a follower_id, it still receives its status and created_at values when a row in which that person is the followed_id is loaded from the CSV.
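CREATE INDEX ON, toInt(), and CREATE UNIQUE are older Cypher forms that were deprecated or removed in later Neo4j releases. As a hedged sketch, assuming a Neo4j 4.x server, the same import might be written as shown below. The index name person_id is a hypothetical choice, and because ON CREATE and ON MATCH set the same properties, a single SET is used in their place.

```cypher
// Sketch for Neo4j 4.x (an assumption; the original targets an older release).
// person_id is a hypothetical index name.
CREATE INDEX person_id FOR (p:Person) ON (p.id);

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "https://dl.dropboxusercontent.com/u/2900504/people2.csv" AS line
WITH DISTINCT line
MERGE (followed:Person {id: toInteger(line.followed_id)})
// A plain SET runs whether the node was created or matched.
SET followed.status = line.status, followed.created_at = line.created_at
MERGE (follower:Person {id: toInteger(line.follower_id)})
// MERGE on the pattern replaces CREATE UNIQUE for the relationship.
MERGE (follower)-[:Following]->(followed);
```

USING PERIODIC COMMIT is kept here because 4.x still accepts it; Neo4j 5 removed it in favor of a different batching construct, so this sketch would need further changes there.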