Sam Shleifer sshleifer

@sshleifer
sshleifer / summarize.py
Created February 21, 2020 15:02
Example Fairseq bart-large-cnn summary
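# pip install fairseq
import torch

# Load the CNN/Daily Mail summarization checkpoint through torch.hub.
bart = torch.hub.load('pytorch/fairseq', 'bart.large.cnn')
bart.eval()  # disable dropout for inference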
article = '(CNN)The Palestinian Authority officially became the 123rd member of the International Criminal Court on Wednesday, a step that gives the court jurisdiction over alleged crimes in Palestinian territories. The formal accession was marked with a ceremony at The Hague, in the Netherlands, where the court is based. The Palestinians signed the ICC\'s founding Rome Statute in January, when they also accepted its jurisdiction over alleged crimes committed "in the occupied Palestinian territory, including East Jerusalem, since June 13, 2014." Later that month, the ICC opened a preliminary examination into the situation in Palestinian territories, paving the way for possible war crimes investigations against Israelis. As members of the court, Palestinians may be subject to counter-charges as well. Israel and the United States, neither of which is an ICC member, opposed the Palestinians\' efforts to join the body.
torch_device = 'cuda'
FRANCE_ARTICLE = ' Marseille, France (CNN)The French prosecutor leading an investigation into the crash of Germanwings Flight 9525 insisted Wednesday that he was not aware of any video footage from on board the plane. Marseille prosecutor Brice Robin told CNN that "so far no videos were used in the crash investigation." He added, "A person who has such a video needs to immediately give it to the investigators." Robin\'s comments follow claims by two magazines, German daily Bild and French Paris Match, of a cell phone video showing the harrowing final seconds from on board Germanwings Flight 9525 as it crashed into the French Alps. All 150 on board were killed. Paris Match and Bild reported that the video was recovered from a phone at the wreckage site. The two publications described the supposed video, but did not post it on their websites. The publications said that they watched the video, which was found by a source close to the investigation. "One can hear cries of \'My God\' in seve
sshleifer / bart_tweet.py
Created March 8, 2020 18:26
Bart Tweet Carbon
from transformers.example_data import LONG_BORING_ARTICLE_ABOUT_TENNIS
from transformers import *
bart = BartForMaskedLM.from_pretrained('bart-large-cnn')
SUMMARY = bart.generate(LONG_BORING_ARTICLE_ABOUT_TENNIS)
# ->
"""
Mark Selby, John Higgins and Ding Junhui were among a number of players who moved effortlessly into the last 16 of the China Open on Wednesday. Selby continued to defy a neck injury to sweep aside fellow Englishman Elliot Slessor with a break of 126 in frame four of their second-round clash. Ding, the home favourite and reigning champion in Beijing, had two breaks of 86 in a convincing 5-1 victory against Mark Davis.
"""
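
The tweet-sized snippet above passes the raw article straight to `generate`. In practice the model consumes token IDs, so the text is encoded first and the output IDs are decoded afterwards. The helper below is a minimal sketch of that round trip; `summarize` and `max_input_tokens` are illustrative names, and `num_beams=4` is an assumed setting rather than the one used to produce the summary above.

```python
def summarize(text, model, tokenizer, max_input_tokens=1024):
    """Encode `text`, run beam search, and decode the first hypothesis."""
    # generate() works on token IDs, so the article is tokenized first
    # and truncated to the assumed maximum input length.
    batch = tokenizer(
        [text],
        max_length=max_input_tokens,
        truncation=True,
        return_tensors="pt",
    )
    # num_beams=4 is an assumption for illustration only.
    summary_ids = model.generate(batch["input_ids"], num_beams=4)
    # Drop special tokens such as <s> and </s> from the decoded string.
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

With a recent transformers release, `model` and `tokenizer` would typically come from `BartForConditionalGeneration.from_pretrained(...)` and `BartTokenizer.from_pretrained(...)`; the exact checkpoint identifier depends on the library version.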

Bart Checklist (Completed):

  • add model/configuration/tokenization classes
  • add conversion scripts
  • add tests
  • finalize
  • copy the Python files from the present folder to the main folder and rename them, replacing xxx with your model name
  • edit the files to replace XXX (with various casing) with your model name
  • copy-paste or create a simple configuration class for your model in the configuration_... file
  • copy-paste or create the code for your model in the modeling_... files (PyTorch and TF 2.0)
### Preamble
from transformers import * # on your pr branch
import torch
torch_device = 'cuda'
tokenizer = BartTokenizer.from_pretrained('bart-large')
FRANCE_ARTICLE = ' Marseille, France (CNN)The French prosecutor leading an investigation into the crash of Germanwings Flight 9525 insisted Wednesday that he was not aware of any video footage from on board the plane. Marseille prosecutor Brice Robin told CNN that "so far no videos were used in the crash investigation." He added, "A person who has such a video needs to immediately give it to the investigators." Robin\'s comments follow claims by two magazines, German daily Bild and French Paris Match, of a cell phone video showing the harrowing final seconds from on board Germanwings Flight 9525 as it crashed into the French Alps. All 150 on board were killed. Paris Match and Bild reported that the video was recovered from a phone at the wreckage site. The two publications described the supposed video, but did not post it on their websites. The publications s
sshleifer / memory_changes.md
Last active March 23, 2020 20:46
Summary of Bart memory improvement workstream

Summary of Impact

All experiments were run using BartForConditionalGeneration on a batch of 6 long CNN articles of uneven length, so some were padded to length 1024.

  • transformers/master
    • forward pass: 6.8 GB
    • generate (9 steps): 7.982 GB
  • fairseq/master
    • forward pass: 5.0 GB
    • generate (9 steps): 5.3 GB
  • transformers/after_changes
    • forward pass: 4.8 GB
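
To put the forward-pass figures above in relative terms, the short calculation below works out the saving against both baselines. It is only arithmetic on the numbers listed; `relative_saving` is a helper name introduced here for illustration.

```python
# Forward-pass memory in GB, copied from the list above.
forward_gb = {
    "transformers/master": 6.8,
    "fairseq/master": 5.0,
    "transformers/after_changes": 4.8,
}


def relative_saving(before_gb, after_gb):
    """Fraction of memory saved when going from `before_gb` to `after_gb`."""
    return (before_gb - after_gb) / before_gb


after = forward_gb["transformers/after_changes"]
vs_master = relative_saving(forward_gb["transformers/master"], after)
vs_fairseq = relative_saving(forward_gb["fairseq/master"], after)

print(f"vs transformers/master: {vs_master:.1%}")  # 29.4%
print(f"vs fairseq/master: {vs_fairseq:.1%}")      # 4.0%
```

On these numbers the forward pass after the changes uses roughly 29% less memory than transformers/master and slightly less than fairseq/master.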

BartModel (@sshleifer)

Bart is one of the first Seq2Seq models in the library and achieves state-of-the-art results on text generation tasks such as abstractive summarization. Three sets of pretrained weights are released:

  • bart-large: the pretrained base model
  • bart-large-cnn: the base model finetuned on the CNN/Daily Mail Abstractive Summarization Task
  • bart-large-mnli: the base model finetuned on the MNLI classification task.
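
As a rough guide to which model class pairs with each set of weights, the sketch below maps the three checkpoint names to a class in transformers. The mapping, the `load_checkpoint` helper, and the `hub_prefix` default are assumptions made for illustration: recent releases host these weights under a `facebook/` namespace, while early 2020 releases used the bare names.

```python
# Assumed pairing of each released checkpoint with a transformers class.
CHECKPOINT_HEADS = {
    "bart-large": "BartModel",                          # pretrained base model
    "bart-large-cnn": "BartForConditionalGeneration",   # summarization
    "bart-large-mnli": "BartForSequenceClassification",  # MNLI classification
}


def load_checkpoint(name, hub_prefix="facebook/"):
    """Load `name` with its matching class (hypothetical helper).

    `hub_prefix` is an assumption; pass "" on releases that expect
    the bare checkpoint names.
    """
    # Imported lazily so the mapping can be inspected without transformers.
    import transformers

    model_cls = getattr(transformers, CHECKPOINT_HEADS[name])
    return model_cls.from_pretrained(hub_prefix + name)
```

For example, `load_checkpoint("bart-large-cnn")` would download the summarization weights and return a `BartForConditionalGeneration` instance, assuming network access and an installed transformers.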

Related:

LONG_TENNIS_ARTICLE = """
Andy Murray came close to giving himself some extra preparation time for his wedding next week before ensuring that he still has unfinished tennis business to attend to. The world No 4 is into the semi-finals of the Miami Open, but not before getting a scare from 21 year-old Austrian Dominic Thiem, who pushed him to 4-4 in the second set before going down 3-6 6-4, 6-1 in an hour and three quarters. Murray was awaiting the winner from the last eight match between Tomas Berdych and Argentina's Juan Monaco. Prior to this tournament Thiem lost in the second round of a Challenger event to soon-to-be new Brit Aljaz Bedene. Andy Murray pumps his first after defeating Dominic Thiem to reach the Miami Open semi finals
EN_DE_CONFIG = {
"bert-train-type-embeddings": "true",
"bert-type-vocab-size": "2",
"dec-cell": "gru",
"dec-cell-base-depth": "2",
"dec-cell-high-depth": "1",
"dec-depth": 6,
"dim-emb": "512",
"dim-rnn": "1024", #IGNORE