@mikejs
Created July 12, 2010 21:16
# Check that nltk and strfry produce same levenshtein distance on
# a bunch of randomly generated strings
import strfry
import nltk.metrics
import random
for i in xrange(0, 10000):
    a = ''.join([chr(random.randint(1, 255)) for x in xrange(0, random.randint(0, 20))])
    b = ''.join([chr(random.randint(1, 255)) for x in xrange(0, random.randint(0, 20))])
    nt = nltk.metrics.edit_distance(a, b)
    st = strfry.levenshtein_distance(a, b)
    if nt != st:
        print "Got different results for '%s' and '%s'" % (a, b)
        print "NLTK: %d" % nt
        print "strfry: %d" % st
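
When the two libraries disagree, a third, independent implementation can help decide which one is wrong. Below is a minimal sketch of the textbook dynamic-programming Levenshtein distance. The name `reference_levenshtein` is hypothetical and is not part of nltk or strfry. It assumes both libraries use unit costs for insertion, deletion, and substitution, with no transpositions.

```python
def reference_levenshtein(a, b):
    """Two-row dynamic-programming Levenshtein distance with unit costs.

    Hypothetical helper for cross-checking; not part of nltk or strfry.
    """
    # previous[j] is the distance between the prefix of `a` processed so far
    # and b[:j]; for the empty prefix that is simply j insertions.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        # Turning a[:i] into the empty string takes i deletions.
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free match
            ))
        previous = current
    return previous[-1]
```

Inside the mismatch branch of the loop above, the value of `reference_levenshtein(a, b)` could be printed next to the NLTK and strfry results to see which of the two it agrees with.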