@ymoslem
Created April 16, 2020 15:30
Compute METEOR score
# Sentence METEOR
# METEOR mainly works on sentence evaluation rather than corpus evaluation
# Run this file from CMD/Terminal
# Example Command: python3 sentence-meteor.py test_file_name.txt mt_file_name.txt
import sys
from nltk.translate.meteor_score import meteor_score
target_test = sys.argv[1] # Test file argument
target_pred = sys.argv[2] # MTed file argument
# Open the test dataset human translation file
with open(target_test) as test:
    refs = test.readlines()
    #print("Reference 1st sentence:", refs[0])

# Open the translation file by the NMT model
with open(target_pred) as pred:
    preds = pred.readlines()

meteor_file = "meteor-" + target_pred + ".txt"

# Calculate METEOR for each sentence and save the result to a file
with open(meteor_file, "w+") as output:
    for line in zip(refs, preds):
        test = line[0]
        pred = line[1]
        #print(test, pred)

        meteor = round(meteor_score([test], pred), 2)  # list of references
        #print(meteor, "\n")
        output.write(str(meteor) + "\n")
print("Done! Please check the METEOR file '" + meteor_file + "' in the same folder!")
@nabam-del

Hi, I'm new to machine translation. Could anyone please help me find a command to run METEOR, F-measure, and sacreBLEU in OpenNMT-py?

@ymoslem
Author

ymoslem commented Nov 16, 2022

You can only use BLEU during validation in OpenNMT-tf. Other evaluation metrics should be run independently on the test dataset after training a model.

If you have further questions about OpenNMT, please send them to its GitHub repository or its forum.

@nabam-del

How do I run this in OpenNMT-py?

@ymoslem
Author

ymoslem commented Nov 16, 2022

The script above is not tied to any particular framework. First, you train an NMT model, or you might already have a pretrained NMT model. Then, you translate your test dataset source with that model. Finally, you use this script or any evaluation tool like sacreBLEU to compare the MT output with the reference target translation from your test dataset.
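
As an illustration of that last step (a minimal sketch, not from the original comment, assuming plain-text files test_target.txt and mt_output.txt with one sentence per line), the comparison with sacreBLEU's Python API could look like this:

# Sketch: corpus-level BLEU with sacreBLEU; the file names are assumptions.
import sacrebleu

with open("test_target.txt") as ref_file, open("mt_output.txt") as hyp_file:
    references = [line.strip() for line in ref_file]
    hypotheses = [line.strip() for line in hyp_file]

# sacreBLEU takes a list of hypothesis strings and a list of reference streams
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(bleu.score)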

Here are two notebooks that explain how to use OpenNMT-py:
https://github.com/ymoslem/OpenNMT-Tutorial

I hope this helps.

@KashyapKishore

KashyapKishore commented Jan 25, 2023

Dear @ymoslem, the following error shows up when line 34 (meteor = round(meteor_score([test], pred), 2)  # list of references) is used exactly as is:
"hypothesis" expects pre-tokenized hypothesis (Iterable[str]): xxxx xxxxxx xxxxxx .
But if lines 30 and 31 are cast to list type, i.e. test = list(line[0]) and pred = list(line[1]), then everything is fine. Should the gist be modified/pulled?
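
Not part of the original thread, but for context: newer NLTK versions expect pre-tokenized inputs for meteor_score, and casting a string with list() splits it into characters rather than words. A word-level fix might look like the sketch below (whitespace tokenization is an assumption; nltk.word_tokenize would also work).

# Sketch: pre-tokenize reference and hypothesis into words for newer NLTK versions.
# Requires the WordNet data: nltk.download("wordnet")
from nltk.translate.meteor_score import meteor_score

reference = "the cat sat on the mat"
hypothesis = "the cat is on the mat"

meteor = round(meteor_score([reference.split()], hypothesis.split()), 2)
print(meteor)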
