@rjurney
Created May 10, 2011 20:33
Creating a new summary graph from an old one...
#
# The purpose of this script is to experiment with Pacer/Tinkerpop stack for
# graph transformation. In it we will summarize a much larger graph to produce
# a new, smaller graph that can fit into RAM via TinkerGraph for more rapid,
# real-time analysis.
#
require 'rubygems'
require 'pacer'
require 'pacer-neo4j'
graph = Pacer.neo4j "/tmp/neo4j"
# Summarize relationships by computing raw, non-normalized weights between them
# - the number of emails sent and successfully received between each pair of egos.
senders = graph.v.filter {|v| v[:type] == 'Email Address'}
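# Group by sender address; for each sender, walk SENT -> Message -> RECEIVED_BY
# to collect the addresses of everyone who received that sender's messages.
groupings = senders.group.
  key_route { |sender| sender[:address] }.
  values_route(:sender) { |sender| sender.out_e('SENT').in_v(:type => 'Message').
    out_e('RECEIVED_BY').in_v(:type => 'Email Address')[:address] }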
result = groupings.reduce(proc { Hash.new(0) }, :sender) { |h, e| h[e] += 1; h }
puts "Summary computed..."
# To verify our results:
top_recipient, top_count = result['tim.belden@enron.com'].max_by { |_, count| count }
puts "tim.belden@enron.com => [" + top_recipient + ","
puts " " + top_count.to_s + "]"
# Now create a new in-RAM TinkerGraph containing these summaries.
summary_graph = Pacer.tg
# Create vertices for senders
vertices = Hash.new
result.each_key do |sender|
  # Keep a handle on each vertex, keyed by address, so edges can refer to it later.
  vertices[sender] = summary_graph.create_vertex :type => 'ego', :address => sender
end
# Now what? No matter what I do, I cannot make any edges on this new graph. I've looked at the source a lot, and I still can't pull it off. I don't know what is expected in the create_edge call.
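
One way to approach the edge step is to first flatten the summary into plain (from, to, weight) triples, independent of any graph library, and only then hand each triple to the graph. The sketch below uses plain Ruby and a small made-up stand-in for `result` (the example.com addresses and counts are hypothetical, not Enron data). It also collects every address that appears on either end of an edge, since a recipient who never shows up as a sender would otherwise have no vertex to attach to. The `create_edge` call in the final comment is an assumption about Pacer's signature and the `:emailed` label is invented for illustration; both should be checked against the installed Pacer version.

```ruby
# Hypothetical stand-in for `result`: sender address => { recipient address => email count }.
result = {
  'a@example.com' => { 'b@example.com' => 3, 'c@example.com' => 1 },
  'b@example.com' => { 'a@example.com' => 2 }
}

# Every address that appears as a sender or a recipient needs its own vertex.
addresses = (result.keys + result.values.flat_map(&:keys)).uniq

# Flatten the nested hash into [from, to, weight] triples, one per future edge.
edges = result.flat_map do |sender, recipients|
  recipients.map { |recipient, count| [sender, recipient, count] }
end

edges.each { |from, to, weight| puts "#{from} -[#{weight}]-> #{to}" }

# ASSUMPTION (not verified here): with a `vertices` hash mapping address => vertex,
# each triple might become one weighted edge along the lines of
#   summary_graph.create_edge nil, vertices[from], vertices[to], :emailed, :weight => weight
```

Keeping the triples as plain arrays makes the summary easy to inspect before anything is written to the TinkerGraph, and separates "what edges should exist" from "how this library creates an edge".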