@kpumuk
Created December 1, 2008 00:48
# Given a set of documents this method returns a list of tags associated with
# ordered by the ones occurring on the most documents. Tags that only appear o
# If the user supplies a specific tag to exclude it will not be included in t
def self.related_tags(docs, exclude = nil)
  related = {}
  docs.each_with_index do |doc, i|
    break if i >= 20 # only consider the first 20 docs
    doc.word_tags.each do |tag| # count num times each tag occurs
      next if exclude && tag.name == exclude # if caller specified a tag to e
      related[tag] ||= 0
      related[tag] += 1
    end
  end
  # sort by count, highest first, and keep the top 25
  related = related.sort { |a, b| a[1] <=> b[1] }.reverse[0..24]
  # now related is an array of arrays [tag, count] ordered by count
  # drop tags that occur only once and return the tag objects
  related.collect { |tag| tag[0] if tag[1] > 1 }.compact
end
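
A minimal usage sketch follows, since the method above depends on model objects that are not shown. `Tag`, `Doc`, and the `TagCounter` module are hypothetical stand-ins invented for illustration; the only assumptions taken from the original are that a document responds to `word_tags` and a tag responds to `name`. The method body is restated in a more compact form (`Hash.new(0)`, `sort_by`) and is a variant, not the author's code.

```ruby
# Hypothetical stand-ins for the model objects the method expects.
Tag = Struct.new(:name)
Doc = Struct.new(:word_tags)

# Hypothetical container; the class that owns the original method is not shown.
module TagCounter
  def self.related_tags(docs, exclude = nil)
    counts = Hash.new(0) # missing keys start at 0
    docs.first(20).each do |doc| # only consider the first 20 docs
      doc.word_tags.each do |tag|
        next if exclude && tag.name == exclude
        counts[tag] += 1
      end
    end
    counts.select { |_tag, count| count > 1 } # drop tags seen only once
          .sort_by { |_tag, count| -count }   # most frequent first
          .first(25)                          # keep at most 25 tags
          .map { |tag, _count| tag }
  end
end

ruby  = Tag.new("ruby")
rails = Tag.new("rails")
gems  = Tag.new("gems")

docs = [
  Doc.new([ruby, rails]),
  Doc.new([ruby, rails, gems]),
  Doc.new([ruby]),
]

names    = TagCounter.related_tags(docs).map(&:name)
excluded = TagCounter.related_tags(docs, "ruby").map(&:name)
puts names.inspect
puts excluded.inspect
```

Here `gems` appears on a single document, so it is filtered out, and passing `"ruby"` as the second argument removes that tag from the result. Ties between tags with equal counts come back in no guaranteed order, because `sort_by` is not a stable sort.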