@msszczep
Created July 28, 2016 15:43
Python script to filter "memorable" words from the MRC Psycholinguistic Database
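from itertools import product

final_answer = set()
with open('mrc2.dct') as dct:
    for line in dct:
        # Each record is a fixed-width block of digits, a space, then the text fields.
        cols = line.split(' ', 1)
        b = cols[0]
        c = int(b[28:31])   # concreteness rating (parsed, but not used in the filter)
        i = int(b[31:34])   # imagery rating
        f = int(b[25:28])   # familiarity rating
        m1 = int(b[34:37])  # meaningfulness rating #1
        m2 = int(b[37:40])  # meaningfulness rating #2
        try:
            # The word is the fifth space-separated text field, up to the first '|'.
            word = cols[1][:-1].split(' ')[4].split('|')[0]
        except IndexError:
            word = ''
        # Keep words that score above 400 on at least one of these ratings.
        if (f > 400 or i > 400 or m1 > 400 or m2 > 400) and word != '':
            final_answer.add(word)

z = sorted(final_answer)
# Pair each five-dice roll (11111 through 66666) with one word.
# zip stops at the shorter sequence, so a short word list no longer raises IndexError.
for n, word in zip(product(range(1, 7), repeat=5), z):
    print("".join(map(str, n)), word)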
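
The last loop of the script labels each word with a five-digit key drawn from the digits 1 to 6, which looks like a table indexed by five dice rolls. The sketch below isolates that step so it can be tried without `mrc2.dct`. The helper names `dice_keys` and `build_table` and the sample words are hypothetical, chosen for illustration and not part of the original gist.

```python
from itertools import product


def dice_keys(sides=6, rolls=5):
    """Yield every roll sequence as a digit string, in ascending order."""
    for roll in product(range(1, sides + 1), repeat=rolls):
        yield "".join(str(d) for d in roll)


def build_table(words, sides=6, rolls=5):
    """Pair sorted words with dice keys; surplus words or keys are dropped."""
    return dict(zip(dice_keys(sides, rolls), sorted(words)))


# Hypothetical sample words, not taken from the MRC database.
table = build_table(["river", "apple", "stone"], rolls=2)
print(table)  # {'11': 'apple', '12': 'river', '13': 'stone'}
```

With the defaults, `dice_keys()` produces 6**5 = 7776 keys, so the filtered word set needs at least that many entries for every roll to map to a word.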