@michelkana
Created June 19, 2021 16:43
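import json

# truecasing() and get_bleu() are not defined in this snippet; define them before running it.
# Load the file, which holds one JSON-encoded review per line.
with open("path/to/yelp_academic_dataset_review_small.json") as f:
    reviews = f.readlines()
# Keep only the review text, with newlines removed.
orig_reviews = [json.loads(r)['text'].replace('\n', '') for r in reviews]
# Lowercase each review, then restore its casing with the truecaser.
lowercase_reviews = [r.lower() for r in orig_reviews]
truecase_reviews = [truecasing(r) for r in lowercase_reviews]
# Score each truecased review against its original and average the scores.
bleu_scores = [get_bleu([ro], [rt]) for ro, rt in zip(orig_reviews, truecase_reviews)]
round(sum(bleu_scores)/len(bleu_scores), 2)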
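
The snippet above relies on a `get_bleu` helper that is not shown here. For readers who want to see what a BLEU-style comparison between an original and a truecased review roughly involves, here is a minimal pure-Python sketch. `simple_bleu` and `ngram_counts` are hypothetical names, the signature (two plain strings) differs from the `get_bleu([ro], [rt])` call above, and whitespace tokenization with no smoothing is an assumption, so scores will not necessarily match the author's helper.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU (hypothetical stand-in, not the gist's get_bleu).

    Tokens are case-sensitive, so a wrongly cased word counts as a mismatch.
    Returns 0.0 when any n-gram order has no match (e.g. texts under max_n tokens).
    """
    ref_tokens = reference.split()
    hyp_tokens = hypothesis.split()
    if not hyp_tokens:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp_tokens, n)
        ref_counts = ngram_counts(ref_tokens, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp_tokens) > len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(hyp_tokens))

    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# A perfect truecasing scores 1.0; one wrongly cased token lowers the score.
perfect = simple_bleu("The food was great today", "The food was great today")
one_miss = simple_bleu("The food was great today", "the food was great today")
```

Because the comparison is case-sensitive, a single lowercase "the" at the start of a five-word review already breaks one n-gram at every order, which is why BLEU is a fairly strict measure of truecasing quality on short texts.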