@epwalsh
Last active May 5, 2021 18:19
Running T5-11B predictions in AllenNLP
# Requires allennlp>=2.4.0, allennlp_models>=2.4.0
from allennlp_models.generation.predictors import Seq2SeqPredictor
ARTICLE_TO_SUMMARIZE = '''
summarize: The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building,
and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side.
During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest
man-made structure in the world, a title it held for 41 years until the Chrysler Building in
New York City was finished in 1930. It was the first structure to reach a height of 300 metres.
Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller
than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is
the second tallest free-standing structure in France after the Millau Viaduct.
'''.strip().replace(
"\n", " "
)
BEAM_SIZE = 5
MAX_STEPS = 5
# Takes ~20 minutes to download the weights on a fast internet connection, and another 2-5 minutes to load them into memory.
predictor = Seq2SeqPredictor.pretrained_t5_for_generation("t5-11b")
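# Override the beam search settings on the underlying T5 module.
# Note: `_model` is a private attribute, so this may break in other versions.
predictor._model.t5.beam_search.beam_size = BEAM_SIZE
predictor._model.t5.beam_search.max_steps = MAX_STEPS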
# Takes ~15 seconds to generate 5 tokens (MAX_STEPS = 5).
output = predictor.predict(ARTICLE_TO_SUMMARIZE)
print(output)
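
The timings quoted in the comments above can be checked with a small helper. This is only a sketch: `timed` and `fake_predict` are hypothetical names that are not part of AllenNLP, and `fake_predict` is a stand-in so the example runs without downloading the 11B weights.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical stand-in for predictor.predict; it only reports the input length.
def fake_predict(source: str) -> dict:
    return {"source_length": len(source)}


output, seconds = timed(fake_predict, "summarize: some text")
print(f"{seconds:.3f}s -> {output}")
```

With the real model loaded, the same helper would be called as `timed(predictor.predict, ARTICLE_TO_SUMMARIZE)`.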