@bububa
Created April 4, 2012 16:18
mysql tf/idf
SELECT id, keyword, item_id, item_type, created_by, EXTRACT, extract_start, extract_end,
       SUM(
           occurrences
           / (SELECT keyword_count
              FROM keyword_counts
              WHERE item_id = t.item_id
                AND created_by = [creator user id goes here]
                AND item_type = t.item_type)
           * LOG(
               (SELECT COUNT(1) FROM search_results)
               / (1 + (SELECT COUNT(1)
                       FROM search_results
                       WHERE '.[keyword 'OR' statements go here].'))
             )
       ) AS keyword_density_product
FROM (SELECT *
      FROM search_results
      WHERE ('.[keyword 'OR' statements go here].')
        AND created_by = [creator user id goes here]
        AND item_type = [item type, e.g. 'document', goes here]) t
GROUP BY item_id
ORDER BY keyword_density_product DESC
LIMIT [pagination LIMIT goes here] OFFSET [pagination OFFSET goes here];
http://snipplr.com/view/51513/
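To make the arithmetic in the query easier to follow, here is a minimal Python sketch of the same scoring shape: term frequency (occurrences divided by the item's keyword count) times one shared IDF factor, summed per item. The sample rows, the `keyword_counts` dictionary, and the function name `tf_idf_scores` are all hypothetical stand-ins, not part of the original snippet. The `created_by` and `item_type` filters are left out for brevity. As in the query, the IDF counts run over all rows of `search_results`, and MySQL's single-argument `LOG()` is the natural logarithm, hence `math.log`.

```python
import math
from collections import defaultdict

# Hypothetical sample rows standing in for the search_results table:
# (item_id, keyword, occurrences)
search_results = [
    (1, "mysql", 3),
    (1, "index", 1),
    (2, "mysql", 1),
    (2, "python", 4),
    (3, "python", 2),
]

# Hypothetical stand-in for keyword_counts: total keywords per item.
keyword_counts = {1: 10, 2: 20, 3: 5}


def tf_idf_scores(rows, counts, keywords):
    """Rank items by SUM(tf * idf), mirroring the shape of the SQL above."""
    # Rows matching any searched keyword (the keyword 'OR' statements).
    matching = [row for row in rows if row[1] in keywords]
    # One IDF value shared by every matching row, as in the query:
    # LOG(total rows / (1 + matching rows)).
    idf = math.log(len(rows) / (1 + len(matching)))
    scores = defaultdict(float)
    for item_id, _keyword, occurrences in matching:
        # Term frequency: occurrences relative to the item's keyword count.
        scores[item_id] += occurrences / counts[item_id] * idf
    # ORDER BY keyword_density_product DESC
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for item_id, score in tf_idf_scores(search_results, keyword_counts, {"mysql"}):
        print(item_id, round(score, 4))
```

One thing the sketch makes visible: when the matching rows plus one outnumber the total rows, the ratio inside the logarithm falls below 1, the IDF factor turns negative, and the descending sort then ranks the densest items last.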