Skip to content

Instantly share code, notes, and snippets.

@HoyaBoya
Created June 24, 2013 11:48
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save HoyaBoya/5849516 to your computer and use it in GitHub Desktop.
Save HoyaBoya/5849516 to your computer and use it in GitHub Desktop.
Very poor implementation of KMP string search in my attempt to understand the mechanics.
class StringSearch
def search(pattern, text)
pattern_array = pattern.split(//)
text_array = text.split(//)
for i in 0 .. text_array.size - 1
matched_so_far = ""
for j in 0 .. pattern_array.size - 1
# keep adding to the matched_so_far word for every matching character
if text_array[i + j] == pattern_array[j]
matched_so_far += pattern_array[j].to_s
else
# can we can move i and j down potentially?
# if we have a partial match so far, we can!
if matched_so_far.size > 0
# max overlap in the matched pattern is everything we matched minus the next matching pattern
s = max_overlap(matched_so_far)
# accelerate the i counter increment
i = i + j + 1
# accelerate the j counter increment
j = s.size
end
break
end
end
# we found a match since count equals pattern size
if matched_so_far == pattern
return i
end
end
return nil
end
protected
# get the largest proper suffix possible that is also a prefix of s
# this is the foundation of the KMP failure function table
def max_overlap(s)
s_array = s.split(//)
prefixes = []
suffixes = []
for i in 0 .. s_array.size - 1
prefixes << s_array[0 .. i].join('')
end
for i in 0 .. s_array.size - 1
suffixes << s_array[s_array.size - 1 - i .. s_array.size - 1].join('')
end
prefixes.pop
suffixes.pop
biggest_at = -1
for i in 0 .. prefixes.size - 1
if prefixes[i] == suffixes[i]
biggest_at = i
end
end
if biggest_at > -1
prefixes[biggest_at]
else
""
end
end
end
require "test/unit"
class StringSearchTest < Test::Unit::TestCase
def setup
@string_search = StringSearch.new
end
def test_maxoverlap
assert_equal "A", @string_search.send(:max_overlap, "AABA")
assert_equal "AAA", @string_search.send(:max_overlap, "AAAA")
assert_equal "ABC", @string_search.send(:max_overlap, "ABCABC")
assert_equal "", @string_search.send(:max_overlap, "AAB")
end
def test_search
assert_string_search_equal("abcdefg", "abcdefH")
assert_string_search_equal("abadabacb", "abadababaccabacabaabb")
assert_string_search_equal("lo", "hello there!")
assert_string_search_equal("miss!", "hello there!")
assert_string_search_equal("abcabc", "1234abcabDabcabcdXYZ")
end
protected
def assert_string_search_equal(pattern, target)
assert_equal target.index(pattern), @string_search.search(pattern, target)
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment