@seanbenhur
Created December 12, 2021 10:07
import pandas as pd
from easynmt import EasyNMT
# Read the CSV that holds the English captions for split 0
split_0 = pd.read_csv("/content/drive/MyDrive/datasets/Image-Captioning-ACL/splits_0")
# Convert the caption column to a plain Python list
split_0_captions = split_0['caption'].tolist()
# Load the mBART50 English-to-many translation model
model = EasyNMT('mbart50_en2m')
split_0_tamil_captions = []
# translate_stream yields one translated caption at a time, in input order
for translation in model.translate_stream(split_0_captions, show_progress_bar=True, chunk_size=16, target_lang='ta'):
    split_0_tamil_captions.append(translation)
split_0['tamil_captions'] = split_0_tamil_captions
split_0.to_csv("/content/drive/MyDrive/datasets/Image-Captioning-ACL/split_0_tamil_captions.csv")
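The streaming loop above relies on getting exactly one translation back per caption, in the same order, so that the new column lines up with the DataFrame rows. A minimal sketch of that chunk-and-yield pattern follows, using only the standard library. `translate_in_chunks` and `fake_translate_batch` are hypothetical names introduced for illustration; the stand-in translator just prefixes each sentence and is not part of EasyNMT.

```python
from typing import Callable, Iterable, Iterator, List


def translate_in_chunks(
    sentences: Iterable[str],
    translate_batch: Callable[[List[str]], List[str]],
    chunk_size: int = 16,
) -> Iterator[str]:
    """Yield one translation per input sentence, translating chunk by chunk."""
    chunk: List[str] = []
    for sentence in sentences:
        chunk.append(sentence)
        if len(chunk) == chunk_size:
            # Translate a full chunk and emit its results in input order
            yield from translate_batch(chunk)
            chunk = []
    if chunk:
        # Translate whatever is left over after the last full chunk
        yield from translate_batch(chunk)


def fake_translate_batch(batch: List[str]) -> List[str]:
    """Hypothetical stand-in for a real model call; it only tags each sentence."""
    return [f"<ta> {sentence}" for sentence in batch]


captions = ["a dog runs", "two kids play", "a red car"]
tamil = list(translate_in_chunks(captions, fake_translate_batch, chunk_size=2))

# Guard against a misaligned column before attaching results to a DataFrame
assert len(tamil) == len(captions)
```

Checking the lengths before assigning the new column is cheap and gives a clear failure if a chunk was dropped, rather than leaving the mismatch for pandas to report.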