@danyaljj
Created April 26, 2021 23:01
from transformers import MT5ForConditionalGeneration, MT5Tokenizer
model_name = "persiannlp/mt5-base-parsinlu-opus-translation_fa_en"
tokenizer = MT5Tokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)
def run_model(input_string, **generator_args):
    # tokenize the Persian input, generate a translation, and decode it back to text
    input_ids = tokenizer.encode(input_string, return_tensors="pt")
    res = model.generate(input_ids, **generator_args)
    output = tokenizer.batch_decode(res, skip_special_tokens=True)
    print(output)
    return output
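# generator_args is forwarded unchanged to model.generate(), so decoding options such as
# max_length or num_beams can be supplied per call (see the sketch after the outputs below).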
run_model("من در ریاضیات خوب هستم") # your query
run_model("ستایش خدای را که پروردگار جهانیان است.")
run_model("در هاید پارک کرنر بر گلدانی ایستاده موعظه می‌کند؛")
run_model("وی از تمامی بلاگرها، سازمان‌ها و افرادی که از وی پشتیبانی کرده‌اند، تشکر کرد.")
run_model("مشابه سال ۲۰۰۱، تولید آمونیاک بی آب در ایالات متحده در سال ۲۰۰۰ تقریباً ۱۷،۴۰۰،۰۰۰ تن (معادل بدون آب) با مصرف ظاهری ۲۲،۰۰۰،۰۰۰ تن و حدود ۴۶۰۰۰۰۰ با واردات خالص مواجه شد. ")
run_model("می خواهم دکترای علوم کامپیوتر راجع به شبکه های اجتماعی را دنبال کنم، چالش حل نشده در شبکه های اجتماعی چیست؟")
# this gives me the following:
# ['I am well in math']
# ['the admiration of God, which is the Lord of the world.']
# ['At the Ford Park, the Crawford Park stands on a vase;']
# ['He thanked all the bloggers, the organizations, and the people who supported him']
# ['similar to the year 2001, the economy of ammonia in the United States in the']
# ['I want to follow the computer experts on social networks, what is the unsolved problem in']
# if I change the size to "base", I get the following:
# ["I'm good at math."]
# ['Adoration of God, the Lord of the world.']
# ['At the High End of the Park, Conrad stands on a vase preaching;']
# ['She thanked all the bloggers, organizations, and men who had supported her.']
# ['In 2000, the lack of water ammonia in the United States was almost']
# ['I want to follow the computer science doctorate on social networks. What is the unsolved challenge']
# if I change the size to "large", I get the following:
# ["I'm good at math"]
# ['the praise of God, the Lord of the world.']
# ['At the Hyde Park Corner, Carpenter is preaching on a vase;']
# ['He thanked all the bloggers, organizations, and people who had supported him.']
# ['Similarly in 2001, the production of waterless ammonia in the United States was']
# ['I want to pursue my degree in Computer Science on social networks, what is the']
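A minimal sketch of switching model sizes, assuming the ParsiNLU checkpoints follow the persiannlp/mt5-<size>-parsinlu-opus-translation_fa_en naming pattern used above; max_length is passed through generator_args so the longer translations are not cut off mid-sentence:

size = "large"  # or "small" / "base"
model_name = f"persiannlp/mt5-{size}-parsinlu-opus-translation_fa_en"
tokenizer = MT5Tokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)
run_model("می خواهم دکترای علوم کامپیوتر راجع به شبکه های اجتماعی را دنبال کنم، چالش حل نشده در شبکه های اجتماعی چیست؟", max_length=128)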
@humanely

Where is the language marker? How does the model know the input and output language?
