Skip to content

Instantly share code, notes, and snippets.

@avescodes
Created July 3, 2009 03:24
Show Gist options
  • Save avescodes/139884 to your computer and use it in GitHub Desktop.
Save avescodes/139884 to your computer and use it in GitHub Desktop.
def self.ranking_sort(results,query)
results.sort { |a,b| ranking_score(a,query) <=> ranking_score(b,query) }
end
# Compute the two-ordered score for sorting
# Score is of the form [ 1 / # of matches, average distance ]
# This arises because we may have 1 or more query parts that aren't matched at all
# Since those elements return nil we can't exactly set them to 0 or MAX and expect reliable results
# e.g. "Carrot Soup" === ["carrot","soup"] = (0 + 8) / 2 = 4 # Clearly the best result
# "Carrot's, raw" === ["carr...soup"] = (0 + 0?) / 2 = 0 # but setting to 0 breaks this
# # Additionally setting the value arbitrarily high may cull results that are otherwise close but missing a word
def self.ranking_score(item,query)
queries = query.split
text = item.name
matches = 0
total = queries.inject(0) do |sum, query|
# "Carrot SoUp" -> "carrot soup" # we need to match properly
distance = text.downcase.index(query.downcase)
if distance
matches += 1
sum + distance
else
sum
end
end
avg_distance = total / query.size
# Set the reciprocal appropriately so no matches sorts to the end
reciprocal = matches && matches > 0 ? matches : queries.size + 1
#TODO: Try mixing the last two pieces together with coefficients - currently not a composite :(
# e.g. d * avg_distance + l * Absolute Length Difference
# where d and l are coefficients representing the importance of either factor
[ reciprocal, avg_distance, (text.length - query.length).abs]
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment