Created
July 3, 2009 03:24
-
-
Save avescodes/139884 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def self.ranking_sort(results,query) | |
results.sort { |a,b| ranking_score(a,query) <=> ranking_score(b,query) } | |
end | |
# Compute the two-ordered score for sorting | |
# Score is of the form [ 1 / # of matches, average distance ] | |
# This arises because we may have 1 or more query parts that aren't matched at all | |
# Since those elements return nil we can't exactly set them to 0 or MAX and expect reliable results | |
# e.g. "Carrot Soup" === ["carrot","soup"] = (0 + 8) / 2 = 4 # Clearly the best result | |
# "Carrot's, raw" === ["carr...soup"] = (0 + 0?) / 2 = 0 # but setting to 0 breaks this | |
# # Additionally setting the value arbitrarily high may cull results that are otherwise close but missing a word | |
def self.ranking_score(item,query) | |
queries = query.split | |
text = item.name | |
matches = 0 | |
total = queries.inject(0) do |sum, query| | |
# "Carrot SoUp" -> "carrot soup" # we need to match properly | |
distance = text.downcase.index(query.downcase) | |
if distance | |
matches += 1 | |
sum + distance | |
else | |
sum | |
end | |
end | |
avg_distance = total / query.size | |
# Set the reciprocal appropriately so no matches sorts to the end | |
reciprocal = matches && matches > 0 ? matches : queries.size + 1 | |
#TODO: Try mixing the last two pieces together with coefficients - currently not a composite :( | |
# e.g. d * avg_distance + l * Absolute Length Difference | |
# where d and l are coefficients representing the importance of either factor | |
[ reciprocal, avg_distance, (text.length - query.length).abs] | |
end |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment