Skip to content

Instantly share code, notes, and snippets.

@worldsayshi
Created December 14, 2012 16:43
Show Gist options
  • Save worldsayshi/4286799 to your computer and use it in GitHub Desktop.
Save worldsayshi/4286799 to your computer and use it in GitHub Desktop.
Algorithm for ranking and grouping properties of a set of ranked articles.
Grouping Algorithm ('ranked articles') =
'input groups' : [{'Property Names'}] <- trivially preprocess 'ranked articles'
'set of all graphs' <- 'Identify separate graphs' using 'input groups'
groups : {Group} <- empty
for each graph in 'set of all graphs'
'spanning tree' <- 'Find minimal spanning tree' of graph
groups += 'Split spanning tree' 'spanning tree' into groups
groups <- merge groups smaller than (g_min : Int)
'ranked groups' <- rank groups by average scoring
return 'ranked groups'
Identify separate graphs ('input groups') =
for each group in 'input groups'
'graphs to merge' <- in 'set of graphs' 'take out graphs containing any field'* of g
new graph <- merge* 'graphs to merge' using group
add new graph to 'set of graphs'
return 'set of graphs'
Split spanning tree ('spanning tree') =
node <- random node of 'spanning tree'
groups : {Group}
while 'spanning tree' is not empty
group : Group = empty
while 'spanning tree' is not empty and group < g_max
group += 'take neighbour with least concecutive neighbours'* from 'spanning tree'
if group.size not 0
groups += group
return groups
'Find minimal spanning tree' = Kruskal*
------
Requirements induced on data structures:
spanning tree:
for each node:
ranking by number of neighbours (treemap : (node:NrNeighbours))
property name:
name : String
relevancy score : Num
Meaning:
'ident 1' -> ident_1
ident* = Definition missing
------
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment