@kpumuk
Created December 1, 2008 00:50
# Given a set of documents, this method returns a list of tags associated with
# those documents, ordered by the ones occurring on the most documents. Tags that
# only appear on one doc are excluded from the list. Only the first 20 documents
# are examined, and at most 26 tags are returned.
# If the caller supplies a specific tag name to exclude, it will not be included
# in the list.
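def self.related_tags(docs, exclude = nil)
  # to_i guarantees that only integers are interpolated into the SQL below.
  doc_ids = docs[0, 20].map { |doc| doc.id.to_i }
  return [] if doc_ids.empty? # "IN ()" would be a SQL syntax error

  tags = connection.select_all(
    "SELECT word_tag_id, COUNT(word_tag_id) FROM word_documents_word_tags
     WHERE word_document_id IN (#{doc_ids.join(',')}) GROUP BY word_tag_id
     HAVING COUNT(*) > 1 ORDER BY COUNT(*) DESC LIMIT 26")
  tag_ids = tags.map { |tag| tag['word_tag_id'].to_i }

  # find(:all) does not preserve the order of tag_ids, so restore the
  # most-shared-first ordering produced by the query.
  related = WordTag.find(:all, :conditions => { :id => tag_ids })
  related = related.sort_by { |tag| tag_ids.index(tag.id) }
  related.reject! { |tag| tag.name == exclude } if exclude
  related
end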
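
# A minimal in-memory sketch of the counting rule the SQL above applies, for
# illustration only. The helper name related_tag_ids and the doc_tags hash
# (document id => array of tag ids) are hypothetical and not part of the
# original code. It assumes a tag is attached to a document at most once, so
# duplicates within one document are collapsed before counting.

```ruby
# Hypothetical helper: returns the ids of tags shared by more than one
# document, most widely shared first, capped at `limit` entries.
def related_tag_ids(doc_tags, limit = 26)
  counts = Hash.new(0)
  doc_tags.each_value do |tag_ids|
    # Count each tag once per document (assumption stated above).
    tag_ids.uniq.each { |tag_id| counts[tag_id] += 1 }
  end

  # Equivalent of HAVING COUNT(*) > 1: drop tags seen on a single document.
  shared = counts.select { |_tag_id, count| count > 1 }

  # Equivalent of ORDER BY COUNT(*) DESC LIMIT 26; ties are broken by tag id
  # so the result is deterministic (the SQL leaves tie order unspecified).
  shared.sort_by { |tag_id, count| [-count, tag_id] }.first(limit).map(&:first)
end

doc_tags = { 1 => [10, 11], 2 => [10, 12], 3 => [10, 11, 13] }
related_tag_ids(doc_tags) # => [10, 11]
```

# Tag 10 appears on three documents and tag 11 on two, so both are kept in
# that order; tags 12 and 13 appear on one document each and are dropped.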