@ymoslem
Last active March 4, 2020 16:35
Compute WER score for each sentence
# Sentence WER
# Compute WER segment by segment; file names are passed as arguments
# Run this file from CMD/Terminal
# Example command: python3 sentence-wer.py test_file_name.txt mt_file_name.txt

import sys

from jiwer import wer

target_test = sys.argv[1]  # Test (reference) file argument
target_pred = sys.argv[2]  # MTed file argument

# Open the test dataset human translation file
with open(target_test) as test:
    refs = test.readlines()

#print("Reference 1st sentence:", refs[0])

# Open the translation file produced by the NMT model
with open(target_pred) as pred:
    preds = pred.readlines()

wer_file = "wer-" + target_pred + ".txt"

# Calculate WER sentence by sentence and save the results to a file
# Note: zip() stops at the shorter list, so both files should have the same number of lines
with open(wer_file, "w+") as output:
    for ref_line, pred_line in zip(refs, preds):
        #print(ref_line, pred_line)
        # "standardize" expands abbreviations; this keyword may not be available in newer jiwer releases
        wer_score = wer(ref_line, pred_line, standardize=True)
        #print(wer_score, "\n")
        output.write(str(wer_score) + "\n")

print("Done! Please check the WER file '" + wer_file + "' in the same folder!")
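
For readers who want to see what the per-sentence score measures, here is a minimal sketch of word error rate written without jiwer. It assumes plain whitespace tokenization and no text normalization, so its numbers can differ from what `wer(..., standardize=True)` returns. The names `word_edit_distance` and `sentence_wer` are hypothetical helpers introduced for illustration only; they are not part of the script above or of jiwer.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists.

    Counts the minimum number of substitutions, deletions and insertions
    needed to turn the reference words into the hypothesis words.
    """
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1]


def sentence_wer(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    # Assumption: whitespace tokenization, case-sensitive comparison.
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference sentence is empty")
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)
```

For example, `sentence_wer("the cat sat on the mat", "the cat sat on mat")` counts one deleted word against six reference words, so the score is 1/6. Because insertions are counted too, a hypothesis that is much longer than its reference can score above 1.0.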