@murindwaz
Forked from bububa/gist:2303440
Created February 6, 2018 00:16
MySQL TF-IDF
SELECT id, keyword, item_id, item_type, created_by, EXTRACT, extract_start, extract_end,
       SUM(
           -- term frequency: occurrences of the keyword relative to the item's total keyword count
           occurrences / (SELECT keyword_count
                          FROM keyword_counts
                          WHERE item_id = t.item_id
                            AND created_by = [creator user id goes here]
                            AND item_type = t.item_type)
           -- inverse document frequency: natural log of total rows over (1 + rows matching the keywords)
           * LOG((SELECT COUNT(1) FROM search_results)
                 / (1 + (SELECT COUNT(1)
                         FROM search_results
                         WHERE '.[keyword 'OR' statements go here].')))
       ) AS keyword_density_product
FROM (SELECT *
      FROM search_results
      WHERE ('.[keyword 'OR' statements go here].')
        AND created_by = [creator user id goes here]
        AND item_type = [item type, e.g. 'document', goes here]) t
GROUP BY item_id
ORDER BY keyword_density_product DESC
LIMIT [pagination LIMIT goes here] OFFSET [pagination OFFSET goes here];
http://snipplr.com/view/51513/
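
To sanity-check the ranking outside the database, the scoring in the query can be sketched in Python. This is only an illustration, not part of the original snippet: the function name `keyword_density_product`, its parameters, and the dictionary shapes are assumptions. It also assumes `created_by` and `item_type` have already been filtered, so `keyword_counts` is keyed by `item_id` alone.

```python
import math


def keyword_density_product(rows, keyword_counts, total_rows, matching_rows):
    """Rank items the way the query does (hypothetical helper).

    rows           -- matching search_results rows, each a dict with
                      "item_id" and "occurrences"
    keyword_counts -- dict mapping item_id to its total keyword_count
    total_rows     -- COUNT(1) over all of search_results
    matching_rows  -- COUNT(1) over rows matching the keyword conditions
    """
    # MySQL's single-argument LOG() is the natural logarithm, like math.log.
    idf = math.log(total_rows / (1 + matching_rows))

    scores = {}
    for row in rows:
        item_id = row["item_id"]
        tf = row["occurrences"] / keyword_counts[item_id]
        # SUM(...) ... GROUP BY item_id
        scores[item_id] = scores.get(item_id, 0.0) + tf * idf

    # ORDER BY keyword_density_product DESC
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample_rows = [
        {"item_id": 1, "occurrences": 3},
        {"item_id": 1, "occurrences": 1},
        {"item_id": 2, "occurrences": 2},
    ]
    ranked = keyword_density_product(
        sample_rows, keyword_counts={1: 10, 2: 4}, total_rows=10, matching_rows=3
    )
    for item_id, score in ranked:
        print(item_id, round(score, 4))
```

Note that the IDF factor is the same for every row in a given search, so within one query it only scales the scores; the ordering comes from the summed term frequencies.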