@jennifersmith
Forked from mneedham/neo_loading.rb
Created June 10, 2012 13:52
Loading stuff into neo via the batch API
[ {
  "method" : "POST",
  "to" : "/node",
  "body" : {
    "name" : "Jen"
  },
  "id" : 1111
}, {
  "method" : "POST",
  "to" : "/index/node/people",
  "body" : {
    "value" : "Jen",
    "uri" : "{1111}",
    "key" : "name"
  }
} ]
curl -d @body.txt -X POST -H "Content-Type: application/json" http://localhost:7474/db/data/batch
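A minimal sketch of generating that request body from Ruby rather than writing body.txt by hand. The helper name `batch_operations_for` is hypothetical; the paths and field names are copied from the sample body above, and the job ID is assumed to be a unique integer within the batch.

```ruby
require 'json'

# Hypothetical helper: builds the two batch operations for one person.
# job_id is assumed to be a unique integer within the batch.
def batch_operations_for(name, job_id)
  [
    # Create the node and tag the operation with its job ID.
    { "method" => "POST", "to" => "/node",
      "body" => { "name" => name }, "id" => job_id },
    # Index the node, referring back to it through the {job_id} placeholder.
    { "method" => "POST", "to" => "/index/node/people",
      "body" => { "value" => name, "uri" => "{#{job_id}}", "key" => "name" } }
  ]
end

payload = batch_operations_for("Jen", 1111)
puts JSON.pretty_generate(payload)
```

Redirecting the output to body.txt gives a file that can be posted with the curl command above.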
# The problem is inserting data into neo using the batch API. We have a bunch of people and we want to put them into the graph and also
# add them to the index so that we can search for them.
# The way the batch API works is that you can refer to previous commands by referencing their index in the list of commands (zero indexed),
# e.g. if I want to reference the person added in the first command I would reference that node as {0}.
# You should be able to see how that works in the code below.
require 'set'

neo_people_to_load = Set.new
neo_people_to_load << { :name => "Mark Needham", :id => 1 }
neo_people_to_load << { :name => "Jenn Smith", :id => 2 }
neo_people_to_load << { :name => "Chris Ford", :id => 3 }

# Each person produces two commands, so the create_node command for the
# next person sits two positions further along the list.
command_index = 0
people_commands = neo_people_to_load.inject([]) do |acc, person|
  acc << [:create_node, {:id => person[:id], :name => person[:name]}]
  acc << [:add_node_to_index, "people", "name", person[:name], "{#{command_index}}"]
  command_index += 2
  acc
end
# So what we want to get is the following:
# [
# [:create_node, {:id=>1, :name=>"Mark Needham"}], [:add_node_to_index, "people", "name", "Mark Needham", "{0}"],
# [:create_node, {:id=>2, :name=>"Jenn Smith"}], [:add_node_to_index, "people", "name", "Jenn Smith", "{2}"],
# [:create_node, {:id=>3, :name=>"Chris Ford"}], [:add_node_to_index, "people", "name", "Chris Ford", "{4}"]
# ]
# OK so I cannot test this but http://docs.neo4j.org/chunked/snapshot/rest-api-batch-ops.html
# indicates we can just use arbitrary IDs, not indexes, provided they are unique and numeric.
# Does that make it a bit simpler?
# Assuming here you have given an auto id to each person. If not, I would use something like map_indexed (each_with_index in Ruby) and flatten.
people_commands = neo_people_to_load.inject([]) do |acc, person|
  acc << [:create_node, {:id => person[:id], :name => person[:name]}]
  acc << [:add_node_to_index, "people", "name", person[:name], "{#{person[:id]}}"]
  acc
end
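If the people have no usable integer ID, the fallback mentioned in the comments above could look roughly like this. It is a sketch that follows the gist's own assumption that the `:id` on a `:create_node` command becomes the batch job ID; the `people` array is sample data.

```ruby
# Sample data without IDs (an assumption for this sketch).
people = [
  { :name => "Mark Needham" },
  { :name => "Jenn Smith" },
  { :name => "Chris Ford" }
]

# each_with_index supplies a unique integer per person, and flat_map
# flattens each pair of commands into a single list.
people_commands = people.each_with_index.flat_map do |person, job_id|
  [
    [:create_node, { :id => job_id, :name => person[:name] }],
    [:add_node_to_index, "people", "name", person[:name], "{#{job_id}}"]
  ]
end
```

Here the job ID is the person's position in the input rather than the command's position in the batch, so the placeholders stay correct even if the commands are later reordered.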
@ctford

ctford commented Jun 10, 2012

How does it know that the value in the :add_node_to_index refers to id? Is id a special field?

@jennifersmith
Author

Well, what Mark has there is pseudo code. I am sort of assuming that it transforms to the sample body.txt that I included. But yes, you can give each batch operation an integer ID, and the {} syntax will insert the relevant content URI when it is time to execute that call. So if id:1 is given to a create-node call that creates node /db/data/node/22, then {1} is substituted with /db/data/node/22.

Obviously I am assuming that person[:id] is an integer and unique. If it were not, you could get the same effect with map_indexed, much as your solution did with its command index.

I guess I thought that the idea of using indexes was not that great and searched for an alternative.
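To make the substitution concrete, here is a toy model of it in Ruby. It is only an illustration of the behaviour described above, not how Neo4j implements it; the helper name `resolve_placeholders` and the starting node number 22 are made up for the example.

```ruby
# Toy model of {id} substitution. This is NOT Neo4j's implementation;
# the helper name and the starting node number are illustrative only.
def resolve_placeholders(operations, first_node_number = 22)
  locations = {}                 # job id => URI of the node that job created
  next_node = first_node_number

  operations.map do |op|
    # Replace every {id} in string body values with the recorded URI.
    body = op[:body].transform_values do |value|
      next value unless value.is_a?(String)
      value.gsub(/\{(\d+)\}/) { locations.fetch(Regexp.last_match(1).to_i) }
    end

    # Pretend that a POST to /node creates the next node in sequence.
    if op[:to] == "/node"
      locations[op[:id]] = "/db/data/node/#{next_node}"
      next_node += 1
    end

    op.merge(:body => body)
  end
end

operations = [
  { :method => "POST", :to => "/node", :body => { :name => "Jen" }, :id => 1 },
  { :method => "POST", :to => "/index/node/people",
    :body => { :value => "Jen", :uri => "{1}", :key => "name" } }
]

resolved = resolve_placeholders(operations)
puts resolved[1][:body][:uri]   # prints /db/data/node/22
```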

@ctford

ctford commented Jun 10, 2012

Agreed. It's definitely nicer without indices, because you can understand the :add_node_to_index without mentally juggling the anaphoric reference to another.

@jennifersmith
Author

Yeah, plus if you had a map representing the node which included the ID, there is no reason that these could not be two separate transformations/mappings of a person map. In fact, in my clojure version I think I will probably do just that.
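A rough Ruby sketch of that idea, with the node creation and the indexing as two independent mappings over the same person maps. It reuses the command shapes from the gist; the variable names are my own.

```ruby
people = [
  { :name => "Mark Needham", :id => 1 },
  { :name => "Jenn Smith", :id => 2 },
  { :name => "Chris Ford", :id => 3 }
]

# First transformation: one create_node command per person.
create_commands = people.map do |person|
  [:create_node, { :id => person[:id], :name => person[:name] }]
end

# Second transformation: one index command per person, referring to the
# node through its job ID rather than its position in the list.
index_commands = people.map do |person|
  [:add_node_to_index, "people", "name", person[:name], "{#{person[:id]}}"]
end

# All creates come first, so every placeholder refers to an earlier job.
people_commands = create_commands + index_commands
```

Because the placeholders are job IDs rather than list positions, the two lists can be built and tested separately and then simply concatenated.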

@mneedham

It's not actually pseudo code! That's the way that the neography gem builds up the commands that it sends to the REST API. I didn't realise that you could specify the ID like you've done - the way I read the docs I thought 'job id' meant its order in the list which obviously creates some evil dependencies. So Jenn's way is cooler.

@jennifersmith
Author

I would just structure it like https://gist.github.com/2906469/d55492cdee790e84b994468cdb21ab1daf464bca (the non-interleaved solution) for maximum goodness.

We should always write code like this.
