>>> import re
>>> from transformers import T5Tokenizer, T5ForConditionalGeneration
2020-11-03 15:26:26.375782: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
>>> model = T5ForConditionalGeneration.from_pretrained('t5-base')
>>> tokenizer = T5Tokenizer.from_pretrained('t5-base')
>>> # tweet_data is assumed to be an iterable of tweet strings loaded earlier (not shown here)
>>> text = " ".join(tweet_data)
>>> # Replace @mentions, links, and runs of non-alphanumeric characters with a space
>>> TEXT_CLEANING_RE = r"@\S+|https?:\S+|http?:\S|[^A-Za-z0-9]+"
>>> text = re.sub(TEXT_CLEANING_RE, ' ', str(text).lower()).strip()
>>> # T5 selects the task from a text prefix; "summarize: " requests a summary
>>> preprocessed_text = "summarize: " + text
>>> tokens_input = tokenizer.encode(preprocessed_text, return_tensors="pt", max_length=512, truncation=True)
>>> summary_ids = model.generate(tokens_input,
...                              min_length=60,
...                              max_length=180,
...                              length_penalty=4.0)
>>>
>>> # Drop special tokens such as padding and end-of-sequence markers from the decoded text
>>> summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
>>> print(summary)
sahara reporters will be covering the uselections2020 on tuesday. a nigerian journalist will be presenting a live coverage of the uselections2020.
a nigerian journalist will be presenting a live coverage of the uselections2020. | |
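
The cleaning step can be tried on its own, without downloading the model. The sketch below wraps the same regular expression in a helper and applies it to two made-up tweets. The function name `clean_tweets` and the sample tweets are assumptions for illustration only; they are not part of the original session.

```python
import re

# Same pattern as in the session above: @mentions, links, and
# runs of non-alphanumeric characters.
TEXT_CLEANING_RE = r"@\S+|https?:\S+|http?:\S|[^A-Za-z0-9]+"


def clean_tweets(tweets):
    """Join tweets into one string, lowercase it, and replace every
    match of TEXT_CLEANING_RE with a single space."""
    text = " ".join(tweets)
    return re.sub(TEXT_CLEANING_RE, " ", text.lower()).strip()


# Hypothetical sample data standing in for tweet_data.
sample_tweets = [
    "@user Live coverage of #USElections2020 starts Tuesday!",
    "Read more: https://example.com/story",
]

cleaned = clean_tweets(sample_tweets)
print("summarize: " + cleaned)
```

Because the text is lowercased and punctuation is removed before summarization, the model sees no casing or sentence boundaries from the original tweets, which is why the summary above comes back in lowercase.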