@manmohan24nov
Last active January 1, 2022 15:27
>>> import re
>>> from transformers import T5Tokenizer, T5ForConditionalGeneration
2020-11-03 15:26:26.375782: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
>>> model = T5ForConditionalGeneration.from_pretrained('t5-base')
>>> tokenizer = T5Tokenizer.from_pretrained('t5-base')
>>> # tweet_data is assumed to be a list of tweet strings loaded earlier in the session
>>> text = " ".join(tweet_data)
>>> TEXT_CLEANING_RE = r"@\S+|https?:\S+|http?:\S|[^A-Za-z0-9]+"
>>> text = re.sub(TEXT_CLEANING_RE, ' ', str(text).lower()).strip()
>>> Preprocessed_text = "summarize: " + text
>>> tokens_input = tokenizer.encode(Preprocessed_text, return_tensors="pt", max_length=512, truncation=True)
>>> summary_ids = model.generate(tokens_input,
... min_length=60,
... max_length=180,
... length_penalty=4.0)
>>>
>>> summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
>>> print(summary)
sahara reporters will be covering the uselections2020 on tuesday. a nigerian journalist will be presenting a live coverage of the uselections2020.
a nigerian journalist will be presenting a live coverage of the uselections2020.
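
The cleaning step in the session above can be tried on its own, without loading the model. The following is a minimal sketch that uses only the standard library; the `clean_text` helper and the sample tweets are hypothetical and invented for illustration, they are not part of the original session or its data.

```python
import re

# Same pattern as in the session above, written as a raw string.
TEXT_CLEANING_RE = r"@\S+|https?:\S+|http?:\S|[^A-Za-z0-9]+"


def clean_text(text):
    """Lowercase the text and replace mentions, URLs, and runs of
    non-alphanumeric characters with a single space per match."""
    return re.sub(TEXT_CLEANING_RE, " ", str(text).lower()).strip()


# Hypothetical sample tweets (assumption: tweet_data is a list of strings).
sample_tweets = [
    "@user Live coverage starts now! https://example.com/live #USElections2020",
    "Polls close at 7pm...",
]

cleaned = clean_text(" ".join(sample_tweets))
print(cleaned.split())
```

Adjacent matches each produce their own space, so the cleaned string can contain runs of spaces. If that matters, `" ".join(cleaned.split())` collapses them before the "summarize: " prefix is added.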