Evaluation Datasets for Word Embeddings
[sample]
hurt offend V 6.81 SYNONYMS
clarify worry V 0.33 NONE
fasten attach V 8.47 HYPER/HYPONYMS
meet introduce V 2.82 NONE
throw kick V 1.66 COHYPONYMS
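A minimal sketch of how sample rows like these could be read and grouped by their relation label. The column interpretation (word 1, word 2, part of speech, score, relation) is inferred from the sample rather than taken from the dataset's documentation, and the variable names are illustrative only.

```python
from collections import defaultdict

# Rows copied from the sample above. Assumed columns:
# word1, word2, part of speech, human score, relation label.
sample = """\
hurt offend V 6.81 SYNONYMS
clarify worry V 0.33 NONE
fasten attach V 8.47 HYPER/HYPONYMS
meet introduce V 2.82 NONE
throw kick V 1.66 COHYPONYMS
"""

by_relation = defaultdict(list)
for line in sample.splitlines():
    word1, word2, pos, score, relation = line.split()
    by_relation[relation].append((word1, word2, float(score)))

# Report how many pairs each relation has and their mean score.
for relation, pairs in by_relation.items():
    mean = sum(s for _, _, s in pairs) / len(pairs)
    print(f"{relation}: {len(pairs)} pair(s), mean score {mean:.2f}")
```

Grouping by relation makes it easy to check whether a model's scores separate, for example, synonyms from unrelated pairs.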
semantic similarity and relatedness; human judgements were obtained from Mechanical Turk workers, who chose which of two word pairs was more closely related.
[sample]
automobile car 50.000000
river water 49.000000
stairs staircase 49.000000
...
movie shopping 22.000000
shade twigs 22.000000
frost sunny 22.000000
...
muscle tulip 1.000000
bikini pizza 1.000000
bakery zebra 0.000000
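The sample above can be parsed with a few lines of Python. This is a sketch only: the `parse_pairs` helper is hypothetical, and the maximum score of 50 is an assumption inferred from the highest value shown in the sample, not a documented property of the dataset.

```python
def parse_pairs(lines, max_score=50.0):
    """Parse 'word1 word2 score' lines into (word1, word2, scaled_score).

    max_score=50.0 is an assumption based on the sample; scores are
    rescaled to the range [0, 1] under that assumption.
    """
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:  # skip blank lines and '...' elisions
            continue
        word1, word2, raw = parts
        pairs.append((word1, word2, float(raw) / max_score))
    return pairs


sample = [
    "automobile car 50.000000",
    "river water 49.000000",
    "...",
    "movie shopping 22.000000",
    "bakery zebra 0.000000",
]
pairs = parse_pairs(sample)
for word1, word2, score in pairs:
    print(f"{word1:12s} {word2:10s} {score:.2f}")
```

Rescaling to a common range is convenient when comparing against datasets that use a different scale, although rank-based comparisons do not need it.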
evaluation of rare-word representations. It claims higher inter-annotator agreement (IAA) than other datasets such as Stanford RW and SimVerb-3500.
word1 | word2 | score |
---|---|---|
sleepwalking | somnambulists | 3.88 |
2mro | tomorrow | 4.00 |
currency | concurrency | 0.13 |
must-see | interesting | 3.06 |
carbinolamine | hemiaminal | 3.88 |
biting_point | clutch | 2.19 |
random_seed | BiLSTM | 1.56 |
black_hole | blackmail | 0.06 |
capture similarity, rather than relatedness or association
word1 | word2 | POS | SimLex999 | conc(w1) | conc(w2) | concQ | Assoc(USF) | SimAssoc333 | SD(SimLex) |
---|---|---|---|---|---|---|---|---|---|
old | new | A | 1.58 | 2.72 | 2.81 | 2 | 7.25 | 1 | 0.41 |
smart | intelligent | A | 9.2 | 1.75 | 2.46 | 1 | 7.11 | 1 | 0.67 |
hard | difficult | A | 8.77 | 3.76 | 2.21 | 2 | 5.94 | 1 | 1.19 |
happy | cheerful | A | 9.55 | 2.56 | 2.34 | 1 | 5.85 | 1 | 2.18 |
hard | easy | A | 0.95 | 3.76 | 2.07 | 2 | 5.82 | 1 | 0.93 |
fast | rapid | A | 8.75 | 3.32 | 3.07 | 2 | 5.66 | 1 | 1.68 |
happy | glad | A | 9.17 | 2.56 | 2.36 | 1 | 5.49 | 1 | 1.59 |
short | long | A | 1.23 | 3.61 | 3.18 | 2 | 5.36 | 1 | 1.58 |
stupid | dumb | A | 9.58 | 1.75 | 2.36 | 1 | 5.26 | 1 | 1.48 |
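The gap between similarity and association is visible in the rows above: antonym pairs such as old/new have a high `Assoc(USF)` value but a low `SimLex999` score. A rough sketch of how such pairs could be picked out from the pipe-delimited rows follows. The `split_row` helper and the two thresholds are illustrative assumptions, not part of the dataset.

```python
HEADER = ("word1 | word2 | POS | SimLex999 | conc(w1) | conc(w2) | concQ | "
          "Assoc(USF) | SimAssoc333 | SD(SimLex) |")
ROWS = [
    "old | new | A | 1.58 | 2.72 | 2.81 | 2 | 7.25 | 1 | 0.41 |",
    "smart | intelligent | A | 9.2 | 1.75 | 2.46 | 1 | 7.11 | 1 | 0.67 |",
    "hard | easy | A | 0.95 | 3.76 | 2.07 | 2 | 5.82 | 1 | 0.93 |",
    "short | long | A | 1.23 | 3.61 | 3.18 | 2 | 5.36 | 1 | 1.58 |",
]


def split_row(row):
    """Split a pipe-delimited row into stripped cells, dropping the trailing pipe."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


columns = split_row(HEADER)
records = [dict(zip(columns, split_row(row))) for row in ROWS]

# Thresholds are arbitrary, chosen only to illustrate the contrast.
associated_not_similar = [
    (r["word1"], r["word2"])
    for r in records
    if float(r["SimLex999"]) < 2.0 and float(r["Assoc(USF)"]) > 5.0
]
print(associated_not_similar)
```

With the four rows used here, the filter keeps the three antonym pairs and drops smart/intelligent, which is both associated and similar.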
unbalanced: 8,869 semantic and 10,675 syntactic questions, with 20-70 pairs per category; the country:capital relation accounts for over 50% of all semantic questions. The relations in the syntactic part are largely the same as in MSR.
Athens Greece Baghdad Iraq
Athens Greece Bangkok Thailand
Ashgabat Turkmenistan Conakry Guinea
Ashgabat Turkmenistan Copenhagen Denmark
Kabul Afghanistan Rabat Morocco
Kabul Afghanistan Riga Latvia
Croatia kuna Bulgaria lev
Croatia kuna Cambodia riel
sudden suddenly cheerful cheerfully
sudden suddenly complete completely
simple simpler sharp sharper
simple simpler short shorter
falling fell thinking thought
falling fell vanishing vanished
pear pears elephant elephants
pear pears eye eyes
write writes shuffle shuffles
write writes sing sings
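Each line above holds four words, read as "a is to b as c is to d", where the fourth word is the expected answer. A minimal sketch of loading such lines into structured questions is shown below. The `Analogy` type and `parse_analogies` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Analogy(NamedTuple):
    a: str
    b: str
    c: str
    expected: str  # the word a model should predict from a, b and c


def parse_analogies(lines):
    """Turn 'a b c d' lines into Analogy tuples, skipping malformed lines."""
    questions = []
    for line in lines:
        tokens = line.split()
        if len(tokens) == 4:
            questions.append(Analogy(*tokens))
    return questions


sample = [
    "Athens Greece Baghdad Iraq",
    "sudden suddenly cheerful cheerfully",
    "pear pears eye eyes",
]
questions = parse_analogies(sample)
for q in questions:
    print(f"{q.a} : {q.b} :: {q.c} : ?  (expected: {q.expected})")
```

Keeping the expected word separate from the three cue words makes it straightforward to score a model's prediction for each question.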
BATS [download]
(the link to the original website is no longer valid)
- dataset balanced across 4 types of relations (inflectional morphology, derivational morphology, lexicographic semantics, encyclopedic semantics)
- 10 relations of each type, 50 unique pairs per category
- 99,200 questions in total