Evaluation Datasets for Word Embeddings

semantic similarity of verbs

[sample]

hurt    offend          V       6.81    SYNONYMS
clarify worry           V       0.33    NONE
fasten  attach          V       8.47    HYPER/HYPONYMS
meet    introduce       V       2.82    NONE
throw   kick            V       1.66    COHYPONYMS
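
A minimal sketch of reading rows in this layout, assuming whitespace-separated fields in the order word1, word2, POS, score, relation as in the sample above. The `VerbPair` type and the `parse_pairs` helper are illustrative names, not part of any tooling shipped with the dataset.

```python
from typing import List, NamedTuple


class VerbPair(NamedTuple):
    word1: str
    word2: str
    pos: str
    score: float
    relation: str


def parse_pairs(text: str) -> List[VerbPair]:
    """Parse whitespace-separated rows: word1 word2 POS score relation."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        w1, w2, pos, score, relation = fields
        pairs.append(VerbPair(w1, w2, pos, float(score), relation))
    return pairs


# The five sample rows shown above.
SAMPLE = """\
hurt    offend          V       6.81    SYNONYMS
clarify worry           V       0.33    NONE
fasten  attach          V       8.47    HYPER/HYPONYMS
meet    introduce       V       2.82    NONE
throw   kick            V       1.66    COHYPONYMS
"""

pairs = parse_pairs(SAMPLE)
most_similar = max(pairs, key=lambda p: p.score)
print(most_similar.word1, most_similar.word2, most_similar.score)
```

Keeping the relation label alongside the score makes it easy to report results per relation type later.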

semantic similarity and relatedness; human judgements were obtained from Mechanical Turk workers, who were shown two word pairs at a time and chose the pair with the closer relationship.

[sample]

automobile car 50.000000
river water 49.000000
stairs staircase 49.000000
...
movie shopping 22.000000
shade twigs 22.000000
frost sunny 22.000000
...
muscle tulip 1.000000
bikini pizza 1.000000
bakery zebra 0.000000
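
Similarity datasets like this one are commonly used by ranking word pairs by the model's cosine similarity and comparing that ranking with the human scores via Spearman's rank correlation. The sketch below shows that procedure with the standard library only. The three-dimensional vectors are made-up toy values, not output of any real model, and the helper names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Human scores taken from the sample above.
gold = [
    ("automobile", "car", 50.0),
    ("river", "water", 49.0),
    ("movie", "shopping", 22.0),
    ("bakery", "zebra", 0.0),
]

# Hypothetical toy embeddings, for illustration only.
vectors = {
    "automobile": [1.0, 0.0, 0.0], "car": [0.9, 0.1, 0.0],
    "river": [0.0, 1.0, 0.0], "water": [0.1, 0.8, 0.2],
    "movie": [0.0, 0.2, 1.0], "shopping": [0.6, 0.0, 0.5],
    "bakery": [0.0, 1.0, 0.1], "zebra": [1.0, 0.0, 0.1],
}

human = [score for _, _, score in gold]
model = [cosine(vectors[w1], vectors[w2]) for w1, w2, _ in gold]
rho = spearman(human, model)
print(round(rho, 3))
```

With real embeddings, pairs containing out-of-vocabulary words have to be skipped or given a default score; whichever choice is made should be reported next to the correlation.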

evaluation of rare-word representations. This dataset claims higher inter-annotator agreement (IAA) than other datasets (such as Stanford RW and SimVerb-3500).

sleepwalking somnambulists 3.88
2mro tomorrow 4.00
currency concurrency 0.13
must-see interesting 3.06
carbinolamine hemiaminal 3.88
biting_point clutch 2.19
random_seed BiLSTM 1.56
black_hole blackmail 0.06

captures similarity, rather than relatedness or association

word1 word2 POS SimLex999 conc(w1) conc(w2) concQ Assoc(USF) SimAssoc333 SD(SimLex)
old new A 1.58 2.72 2.81 2 7.25 1 0.41
smart intelligent A 9.2 1.75 2.46 1 7.11 1 0.67
hard difficult A 8.77 3.76 2.21 2 5.94 1 1.19
happy cheerful A 9.55 2.56 2.34 1 5.85 1 2.18
hard easy A 0.95 3.76 2.07 2 5.82 1 0.93
fast rapid A 8.75 3.32 3.07 2 5.66 1 1.68
happy glad A 9.17 2.56 2.36 1 5.49 1 1.59
short long A 1.23 3.61 3.18 2 5.36 1 1.58
stupid dumb A 9.58 1.75 2.36 1 5.26 1 1.48
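
The sample rows already show the difference between similarity and association: antonym pairs receive low similarity scores while still being strongly associated. The sketch below picks out such pairs, assuming the whitespace-separated layout shown above and reading `Assoc(USF)` as the association-strength column. The `read_rows` helper and the thresholds (2.0 and 5.0) are arbitrary choices for illustration.

```python
# The header and sample rows shown above.
SAMPLE = """\
word1 word2 POS SimLex999 conc(w1) conc(w2) concQ Assoc(USF) SimAssoc333 SD(SimLex)
old new A 1.58 2.72 2.81 2 7.25 1 0.41
smart intelligent A 9.2 1.75 2.46 1 7.11 1 0.67
hard difficult A 8.77 3.76 2.21 2 5.94 1 1.19
happy cheerful A 9.55 2.56 2.34 1 5.85 1 2.18
hard easy A 0.95 3.76 2.07 2 5.82 1 0.93
fast rapid A 8.75 3.32 3.07 2 5.66 1 1.68
happy glad A 9.17 2.56 2.36 1 5.49 1 1.59
short long A 1.23 3.61 3.18 2 5.36 1 1.58
stupid dumb A 9.58 1.75 2.36 1 5.26 1 1.48
"""


def read_rows(text):
    """Turn a header line plus data lines into a list of dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, body = lines[0], lines[1:]
    return [dict(zip(header, fields)) for fields in body]


rows = read_rows(SAMPLE)

# Pairs that are strongly associated but rated as dissimilar.
associated_not_similar = [
    (row["word1"], row["word2"])
    for row in rows
    if float(row["SimLex999"]) < 2.0 and float(row["Assoc(USF)"]) > 5.0
]
print(associated_not_similar)
# [('old', 'new'), ('hard', 'easy'), ('short', 'long')]
```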

unbalanced: 8,869 semantic and 10,675 syntactic questions, with 20-70 pairs per category; the country:capital relation accounts for over 50% of all semantic questions. The relations in the syntactic part are largely the same as in MSR.

Athens Greece Baghdad Iraq
Athens Greece Bangkok Thailand
Ashgabat Turkmenistan Conakry Guinea
Ashgabat Turkmenistan Copenhagen Denmark
Kabul Afghanistan Rabat Morocco
Kabul Afghanistan Riga Latvia
Croatia kuna Bulgaria lev
Croatia kuna Cambodia riel
sudden suddenly cheerful cheerfully
sudden suddenly complete completely
simple simpler sharp sharper
simple simpler short shorter
falling fell thinking thought
falling fell vanishing vanished
pear pears elephant elephants
pear pears eye eyes
write writes shuffle shuffles
write writes sing sings
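
Each line above is one question of the form `a b c d`: given a, b and c, the model should predict d. A common way to answer such questions is the vector-offset method, which returns the vocabulary word whose vector has the highest cosine similarity to `b - a + c`, excluding the three question words. The sketch below illustrates it on two of the sample questions. The vectors are hand-made toy values chosen so that the offsets line up, and `solve_analogy` and `accuracy` are illustrative names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Return the word closest (by cosine) to b - a + c, excluding a, b, c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def accuracy(vectors, questions):
    """Fraction of questions whose predicted word equals the expected one."""
    correct = sum(solve_analogy(vectors, a, b, c) == d
                  for a, b, c, d in questions)
    return correct / len(questions)


# Hypothetical toy embeddings: the third dimension encodes "is a country".
vectors = {
    "Athens": [1.0, 0.0, 0.0], "Greece": [1.0, 0.0, 1.0],
    "Baghdad": [0.0, 1.0, 0.0], "Iraq": [0.0, 1.0, 1.0],
    "Bangkok": [1.0, 1.0, 0.0], "Thailand": [1.0, 1.0, 1.0],
}

# Two questions taken from the sample above.
questions = [
    ("Athens", "Greece", "Baghdad", "Iraq"),
    ("Athens", "Greece", "Bangkok", "Thailand"),
]

print(solve_analogy(vectors, "Athens", "Greece", "Baghdad"))
print(accuracy(vectors, questions))
```

Because the dataset is unbalanced, reporting accuracy per category as well as overall keeps the large country:capital category from dominating the semantic score.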

BATS [download]

(the link to the original website is no longer valid)

  • dataset balanced across 4 types of relations (inflectional morphology, derivational morphology, lexicographic semantics, encyclopedic semantics)
  • 10 relations of each type, 50 unique pairs per category
  • 99,200 questions in total