@romiras
Created December 30, 2016 12:54
Simple function for fuzzy string match
require 'active_support/all' # provides String#mb_chars

# Returns true when the normalized strings are at most one edit apart.
def simple_fuzzy_match(s1, s2)
  levenshtein_distance(normalize_str(s1), normalize_str(s2)) < 2
end

def normalize_str(s)
  s.
    mb_chars.     # wrap in ActiveSupport::Multibyte::Chars; required for Unicode-aware downcase on Ruby below 2.4
    downcase.     # lowercase all characters
    strip.        # remove leading and trailing whitespace
    split(/\s+/). # split on runs of whitespace into an array of words
    sort.         # sort the words alphabetically
    join(' ')     # rejoin with single spaces so word order does not affect the Levenshtein distance
end

### Helper function
# http://stackoverflow.com/questions/16323571/measure-the-distance-between-two-strings-with-ruby
def levenshtein_distance(s, t)
  m = s.length
  n = t.length
  return m if n == 0
  return n if m == 0

  d = Array.new(m + 1) { Array.new(n + 1) }
  (0..m).each { |i| d[i][0] = i }
  (0..n).each { |j| d[0][j] = j }
  (1..n).each do |j|
    (1..m).each do |i|
      d[i][j] = if s[i - 1] == t[j - 1] # adjust index into string
                  d[i - 1][j - 1] # no operation required
                else
                  [
                    d[i - 1][j] + 1,    # deletion
                    d[i][j - 1] + 1,    # insertion
                    d[i - 1][j - 1] + 1 # substitution
                  ].min
                end
    end
  end
  d[m][n]
end
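
The mb_chars comment above notes that ActiveSupport is only required on Ruby below 2.4. As a rough sketch of what that implies, the variant below assumes Ruby 2.4 or newer, where String#downcase is Unicode-aware, and drops the ActiveSupport dependency. It also keeps only two rows of the distance table instead of the full matrix. The names normalize_plain, levenshtein_two_rows, and fuzzy_match_plain are hypothetical and not part of the gist.

```ruby
# Sketch only: assumes Ruby 2.4+, where String#downcase handles Unicode.
# All method names here are hypothetical, chosen for illustration.

# Lowercase, trim, and sort the words so word order does not matter.
def normalize_plain(s)
  s.downcase.strip.split(/\s+/).sort.join(' ')
end

# Levenshtein distance that keeps only the previous and current rows.
def levenshtein_two_rows(s, t)
  return s.length if t.empty?
  return t.length if s.empty?

  prev = (0..t.length).to_a # distances from the empty prefix of s
  s.each_char.with_index(1) do |sc, i|
    curr = [i] # distance from the first i chars of s to the empty string
    t.each_char.with_index(1) do |tc, j|
      cost = sc == tc ? 0 : 1
      curr << [
        prev[j] + 1,       # deletion
        curr[j - 1] + 1,   # insertion
        prev[j - 1] + cost # substitution, or match when cost is 0
      ].min
    end
    prev = curr
  end
  prev.last
end

# Same threshold as simple_fuzzy_match: at most one edit apart.
def fuzzy_match_plain(s1, s2)
  levenshtein_two_rows(normalize_plain(s1), normalize_plain(s2)) < 2
end

puts fuzzy_match_plain('Hello  World', 'world hello') # => true (same words, different order and case)
puts fuzzy_match_plain('John Smith', 'Jon Smith')     # => true (one deletion)
puts fuzzy_match_plain('apple', 'orange')             # => false
```

The threshold of fewer than 2 edits applies to the whole normalized string, so it tolerates a single typo regardless of string length. Longer inputs may call for a threshold that scales with length.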