@pangloss
Last active December 12, 2015 03:58
GraphTO presentation notes and code snippets
Using Pacer
Darrick Wiebe
dw@xnlogic.com
The most fundamental concept in Pacer
out / in
[ out ]
v --(e)-> v
[ in ]
v.out_e.in_v
v
-> e
-> v
v.in_e.out_v
v
<- e
<- v
v.out
v
-(e)-> v (e is skipped over)
v.out_e.out_v
v
-> e
v -> (reverse)
v.both
v
-(e)-> v
<-(e)- v
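To make the arrows above concrete, here is a minimal plain-Ruby sketch of the same idea. It does not use Pacer at all: the `Vertex` class, the `Edge` struct and the `connect` helper are assumptions made up for illustration, and only the step names (`out_e`, `in_v`, `out`, `in`, `both`) are borrowed from the notes.

```ruby
# Toy model of the out/in steps. This is NOT Pacer; Vertex, Edge and connect
# are hypothetical and exist only to show which direction each step moves.
Edge = Struct.new(:label, :out_v, :in_v)   # out_v --(label)--> in_v

class Vertex
  attr_reader :name, :out_e, :in_e

  def initialize(name)
    @name  = name
    @out_e = []   # edges leaving this vertex
    @in_e  = []   # edges arriving at this vertex
  end

  # Follow outgoing edges to the vertices they point at (the edge is skipped).
  def out
    out_e.map(&:in_v)
  end

  # Follow incoming edges back to the vertices they come from.
  # `in` is a Ruby keyword, so callers need an explicit receiver (self.in).
  def in
    in_e.map(&:out_v)
  end

  def both
    out + self.in
  end

  # Create an edge from this vertex to another and register it on both ends.
  def connect(label, other)
    edge = Edge.new(label, self, other)
    out_e << edge
    other.in_e << edge
    edge
  end
end

a = Vertex.new('a')
b = Vertex.new('b')
a.connect(:knows, b)

forward = a.out_e.map(&:in_v)    # same vertices as a.out
reverse = a.out_e.map(&:out_v)   # the "reverse" case: back at a
```

In this toy model `out_e` followed by `in_v` lands on the same vertices as `out`, while `out_e` followed by `out_v` returns to the starting vertex.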
Pacer Routes
A Route lets you define a step-by-step path through the graph in an
intuitive way. The power this gives you is incredible.
http://www.youtube.com/watch?v=7kI1d7DMbco
Routes are lazy and chainable
g.v.out_e.in_v.limit(1000).properties.keys.flatten.frequencies
There are a few methods that execute the route immediately
.first
.each
.to_a
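Ruby's own `Enumerator::Lazy` behaves in a comparable way and can serve as a mental model for lazy, chainable routes. The sketch below uses only the standard library, not Pacer; the `evaluated` counter is an illustrative addition that shows when work actually happens.

```ruby
# Standard-library analogy for a lazy, chainable pipeline (not Pacer itself).
evaluated = 0

pipeline = (1..Float::INFINITY).lazy
  .map    { |n| evaluated += 1; n * 2 }   # nothing runs yet
  .select { |n| (n % 3).zero? }           # still nothing

before = evaluated             # building the chain did no work
result = pipeline.first(3)     # forces evaluation, much like .first or .to_a
```

Building the chain costs nothing; only the final call pulls elements through the steps, and it stops as soon as it has what it asked for.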
Steps in a Route
g [ .v ] [ .out_e ] [ .in_v ] [ .properties ] [ .keys ] [ .flatten ] .frequencies
Paths
[ v, e, v, { .. }, [ .. ], ".." ]
g.v.out_e.in_v.limit(1000).properties.keys.flatten.paths.first
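One way to picture a path is as an array that grows by one entry for every step taken. The following plain-Ruby sketch illustrates that idea only; it is not how Pacer tracks paths, and the adjacency-list `graph` and the `step` lambda are invented for the example.

```ruby
# Toy illustration of path tracking (not Pacer's implementation).
# graph is a hypothetical adjacency list: vertex name => names it points to.
graph = { 'a' => %w[b c], 'b' => %w[c], 'c' => [] }

# Every traversal starts as a one-element path.
start_paths = graph.keys.map { |v| [v] }

# A step extends each path with every neighbour of its last element.
step = lambda do |paths|
  paths.flat_map { |path| graph[path.last].map { |nxt| path + [nxt] } }
end

one_hop  = step.call(start_paths)
two_hops = step.call(one_hop)
```

Each result keeps the full history of how it was reached, which is what makes a path useful for explaining a traversal after the fact.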
Exploring further: enter PacerXml
g = Pacer.neo4j '/tmp/graphto'
g.v.delete!
PacerXml::Sample.load_100_software g
# >> g.v.count
# 22678
# >> g.e.count
# 24181
Extending Pacer
g.v(type: 'examiner')
module Examiner
  module Vertex
    def display_name
      "examiner #{ self['last-name'] } #{ self.in_edges.count }"
    end
  end
end
g.v(Examiner, type: 'examiner')
module Examiner
  def self.route_conditions
    { type: 'examiner' }
  end
end
g.v(Examiner)
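The extension idea can be imitated in plain Ruby: a module supplies the filter conditions, and every matching element is extended with the module's `Vertex` methods. This is a sketch of the concept only, not Pacer's code. The `select_with` helper, the hash-based elements and the `:last_name` key are all assumptions made for illustration.

```ruby
# Hypothetical plain-Ruby imitation of an extension module (not Pacer's code).
module Examiner
  # Conditions an element must satisfy to count as an examiner.
  def self.route_conditions
    { type: 'examiner' }
  end

  # Methods mixed into each matching element.
  module Vertex
    def display_name
      "examiner #{self[:last_name]}"
    end
  end
end

# select_with is an invented helper standing in for g.v(Examiner):
# filter by the extension's conditions, then extend each match.
def select_with(elements, extension)
  conditions = extension.route_conditions
  elements
    .select { |el| conditions.all? { |key, value| el[key] == value } }
    .map    { |el| el.dup.extend(extension::Vertex) }
end

elements = [
  { type: 'examiner', last_name: 'Fujihara' },
  { type: 'patent',   title: 'Widget' }
]

examiners = select_with(elements, Examiner)
names     = examiners.map(&:display_name)
```

Keeping the conditions inside the module means callers pass only the module name, which mirrors the move from `g.v(Examiner, type: 'examiner')` to `g.v(Examiner)`.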
module Patent; end
module Examiner
  module Route
    def patents
      self.in(:examiners, Patent)
    end
    def departments
      out(:department)
    end
  end
end
module Patent; end
module Examiner
  def self.route_conditions
    { type: 'examiner' }
  end
  module Vertex
    def display_name
      "examiner #{ self['last-name'] } #{ self.in_edges.count }"
    end
  end
  module Route
    def patents
      self.in(:examiners, Patent)
    end
    def departments
      out(:department)
    end
  end
end
module Patent
  def self.route_conditions
    { type: 'patent' }
  end
  module Route
    def examiners(n = nil)
      if n
        lookahead(min: n) { |p| p.out_e(:examiners) }
      else
        out(:examiners, Examiner)
      end
    end
  end
end
Pacer has no 'global graph' but you can easily set your own
module MyApp
  Graph = Pacer.neo4j "my/app/graph"
end
def Patent.all(*args, &block)
  MyApp::Graph.v(Patent, *args, &block)
end
Without an assumed global graph, working with multiple graphs is easy.
Examining the schema
PacerXml::Sample.structure! g #=> file 'patent-structure.graphml'
Manipulating Data
We can see in the visualization that the examiner relationship is not quite right
g.v(type: 'examiner').in_e.labels.frequencies
We can fix it by creating a new relationship and deleting the old one
g.v(Patent).bulk_job do |p|
  p.add_edges_to :examiners, p.out(type: 'examiners').out(Examiner)
end
g.v(Patent).out(type: 'examiners').delete!
The Patent could also be cleaned up a little.
g.v(Patent).properties.first
What are the possible values of number-of-claims?
g.v(Patent).frequencies 'number-of-claims'
Set and remove a property
g.v(Patent).bulk_job do |p|
  p[:claim_count] = p['number-of-claims'].to_i
  p['number-of-claims'] = nil
end
Did it work?
g.v(Patent).frequencies 'number-of-claims'
g.v(Patent).frequencies :claim_count
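The same rename-and-convert step can be tried out on plain hashes before touching a graph. The sketch below is an analogy in standard Ruby, not Pacer: the sample records are invented, and the `frequencies` helper is a hypothetical stand-in for the route method of the same name.

```ruby
# Plain-Ruby analogy using hashes in place of vertices (sample data invented).
patents = [
  { 'number-of-claims' => '12' },
  { 'number-of-claims' => '7' },
  { 'number-of-claims' => '12' }
]

# Hypothetical stand-in for route.frequencies(key): count each distinct value.
def frequencies(records, key)
  records.each_with_object(Hash.new(0)) { |record, counts| counts[record[key]] += 1 }
end

before = frequencies(patents, 'number-of-claims')

# Move the string property to an integer property under a new key.
patents.each do |patent|
  patent[:claim_count] = patent.delete('number-of-claims').to_i
end

after_old = frequencies(patents, 'number-of-claims')
after_new = frequencies(patents, :claim_count)
```

In this toy version the old key shows up only as `nil` after the change, and the new key holds integers rather than strings, which is the outcome the two `frequencies` calls in the notes are checking for.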
Some Examiners have a department number property that would make more
sense as a relationship.
g.v(Examiner).properties.limit 10
g.v(Examiner).property?(:department).uniq
g.v(Examiner).property?(:department).uniq.bulk_job do |n|
  g.create_vertex type: 'department', department: n
end
Look up examiners from each department and associate them
g.create_key_index :department, :vertex
g.v(type: 'department').bulk_job do |d|
  g.v(Examiner, department: d[:department].to_s).add_edges_to :department, d
end
g.v(Patent).examiners.departments
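The two-step pattern used here (create one record per distinct property value, then link each owner to its record) can be sketched without a graph. This is a plain-Ruby illustration under invented assumptions: the sample examiners, the department numbers and the `:department_ref` key are all made up, and a Ruby hash plays the role of the key index.

```ruby
# Toy data; the records and the :department_ref key are invented.
examiners = [
  { name: 'A', department: '2100' },
  { name: 'B', department: '2800' },
  { name: 'C', department: '2100' },
  { name: 'D' }                        # no department property
]

# Step 1: one department record per distinct property value.
departments = examiners
  .map { |e| e[:department] }
  .compact
  .uniq
  .map { |number| { type: 'department', department: number } }

# Step 2: index departments by number, then link each examiner to its record.
by_number = departments.each_with_object({}) { |d, idx| idx[d[:department]] = d }

examiners.each do |e|
  dept = by_number[e[:department]]
  e[:department_ref] = dept if dept
end
```

Examiners that share a department number end up pointing at the same record, and examiners without the property are left unlinked.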
Neo4j Integration
Cypher!
START v=node:node_auto_index(type = 'patent')
MATCH v-[:examiners]->examiner
RETURN v, examiner
query = <<CYPHER
START v=node:node_auto_index(type = 'patent')
MATCH v-[:examiners]->examiner
RETURN v, examiner
CYPHER
Simple cypher query returning a pair of vertices
g.cypher(query).limit(10)
This time just grab the Examiner vertex and wrap it
r = g.cypher(query).limit(10).tails.v(Examiner)
Pacer is about streaming though...
Streaming Cypher queries??
patent_examiners = <<CYPHER
MATCH v-[:examiners]->examiner
RETURN v, examiner
CYPHER
r = g.v(Patent).cypher(patent_examiners).tails.v(Examiner).limit(10)
How does that work?
r.back...
r.paths.first
Chaining Cypher queries!
other_examiners = <<CYPHER
MATCH v<-[:examiners]-()-[:examiners]->other
RETURN v, other
CYPHER
g.v(Patent).cypher(patent_examiners).tails.cypher(other_examiners).limit(100)
Neo4j Path Finding Algorithms
all_patents = g.v(Patent)
all_patents.first.paths_to(all_patents)
Cypher returns paths, we can expand them
g.cypher(query).limit(10).expand
Neo4j Lucene Indices
Boolean logic
g.lucene('type:patent OR type:examiner')
Fuzzy matching
g.create_key_index 'last-name', :vertex
g.lucene('last-name:Fujihara').properties
g.lucene('last-name:Fujihara~').properties
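The trailing `~` is Lucene's fuzzy-search operator, which matches terms that are within a small edit distance of the one given. As a rough illustration of what edit distance means (this is not Lucene's implementation or its scoring), here is a plain-Ruby Levenshtein distance. The `levenshtein` helper, the sample names other than Fujihara, and the threshold of 2 are assumptions chosen for the example.

```ruby
# Classic Levenshtein edit distance, computed one row at a time.
# Illustrative only; Lucene's fuzzy matching is implemented differently.
def levenshtein(a, b)
  prev = (0..b.length).to_a              # distances from "" to each prefix of b
  a.each_char.with_index(1) do |ca, i|
    curr = [i]                           # distance from a[0, i] to ""
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,              # deletion
               curr[j - 1] + 1,          # insertion
               prev[j - 1] + cost].min   # substitution or match
    end
    prev = curr
  end
  prev.last
end

names = %w[Fujihara Fujiwara Fujita Smith]   # invented sample terms
close = names.select { |name| levenshtein('Fujihara', name) <= 2 }
```

With this threshold the exact term and the one-letter variant are kept, while names that need three or more edits are dropped.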
Questions?