@drussellmrichie
Last active October 4, 2016 02:27
A quick and dirty implementation of the cohort model of word recognition (devised by Marslen-Wilson, I believe?)
def cohortModel(word, EnglishWords):
    soFar = ''  # when we start, we haven't heard anything yet, so we'll represent that as an empty string
    candidates = set(EnglishWords)  # before we've heard anything, all the words we know are possible (copied, so the caller's set is untouched)
    for letter in word:  # then start listening to the word letter by letter
        soFar += letter  # add the newly heard letter to the portion of the word heard so far
        for candidate in set(candidates):  # look through a copy of the candidates so we can safely remove from the original
            if not candidate.startswith(soFar):  # if this candidate is NOT consistent with what we've heard so far
                candidates.remove(candidate)  # remove it from the candidates
        print("These are the possible words when we've heard {} so far:\n{}".format(soFar, candidates))
    return candidates

# run the model on a tiny fragment of English
EnglishWords = {'cathedral', 'cat', 'dog', 'catheter'}
word = 'cathedral'
print("After hearing everything, the only possible word(s) are {}".format(cohortModel(word, EnglishWords)))
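
The model above prints the shrinking cohort as it goes. As a rough sketch of the same idea without printing, the variant below records the cohort after each letter and reports the shortest prefix at which only one candidate remains (often called the uniqueness point in the cohort literature). The names `cohort_history`, `uniqueness_point`, and `lexicon` are hypothetical and not part of the original gist.

```python
def cohort_history(word, lexicon):
    """Return a list of (prefix, cohort) pairs, one per letter heard."""
    candidates = set(lexicon)  # copy so the caller's lexicon is not modified
    history = []
    for end in range(1, len(word) + 1):
        prefix = word[:end]
        # keep only the candidates still consistent with what has been heard
        candidates = {w for w in candidates if w.startswith(prefix)}
        history.append((prefix, set(candidates)))
    return history


def uniqueness_point(word, lexicon):
    """Return the shortest prefix of `word` that leaves exactly one candidate, or None."""
    for prefix, cohort in cohort_history(word, lexicon):
        if len(cohort) == 1:
            return prefix
    return None


# hypothetical toy lexicon, same words as in the snippet above
lexicon = {'cathedral', 'cat', 'dog', 'catheter'}
for prefix, cohort in cohort_history('cathedral', lexicon):
    print(prefix, sorted(cohort))
print(uniqueness_point('cathedral', lexicon))  # cathed
print(uniqueness_point('cat', lexicon))        # None
```

With this toy lexicon, 'cathedral' becomes unique only at 'cathed', once 'catheter' drops out. 'cat' never becomes unique, because it is a prefix of longer words; pure prefix filtering cannot settle that case on its own.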