Model Name:

RoBERTa + CSPT (single model)

Model Description:

We first train a generation model to generate synthetic data from ConceptNet. We then build the commonsense pre-trained model by fine-tuning the RoBERTa-large model on the synthetic data and the Open Mind Common Sense (OMCS) corpus. The final model is fine-tuned from the commonsense pre-trained model on CSQA.
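As a minimal sketch of the final stage, the snippet below shows how CSQA's five-way multiple-choice format could be handled with Hugging Face `transformers` (an assumed toolkit; the gist does not name one). The checkpoint path `cspt-roberta-large` is a hypothetical name for the commonsense pre-trained model from the previous stage, and the question shown is a CSQA-style example, not necessarily a dataset item.

```python
import torch
from transformers import RobertaForMultipleChoice, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
# "cspt-roberta-large" is a hypothetical local path to the commonsense
# pre-trained checkpoint produced by the pre-training stage.
model = RobertaForMultipleChoice.from_pretrained("cspt-roberta-large")

# A CSQA-style five-way multiple-choice question.
question = "Where would you expect to find a pizzeria while shopping?"
choices = ["chicago", "street", "little italy", "food court", "capital cities"]
label = torch.tensor([3])  # index of the gold answer ("food court")

# Pair the question with each candidate answer; RobertaForMultipleChoice
# expects tensors of shape (batch_size, num_choices, seq_len).
encoding = tokenizer(
    [question] * len(choices),
    choices,
    padding=True,
    truncation=True,
    return_tensors="pt",
)
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}  # batch of 1

# One illustrative fine-tuning step; a real run would loop over the CSQA
# training set with an optimizer and learning-rate scheduler.
outputs = model(**inputs, labels=label)
outputs.loss.backward()
print(outputs.logits)  # shape (1, 5): one score per answer choice
```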

Experiment Details:

Commonsense Pre-training:

- epochs: 5
- maximum sequence length: 35
- learning rate: 3e-5
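
Below is a minimal sketch of how these hyperparameters might be wired into the pre-training run, again assuming Hugging Face `transformers` and a masked-LM objective (an assumption; the gist does not state the training objective). The corpus file and output directory are hypothetical names.

```python
from transformers import (
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    RobertaForMaskedLM,
    RobertaTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
model = RobertaForMaskedLM.from_pretrained("roberta-large")

# One synthetic/OMCS sentence per line, truncated to the 35-token maximum.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="synthetic_plus_omcs.txt",  # hypothetical corpus file
    block_size=35,  # maximum sequence length
)

args = TrainingArguments(
    output_dir="cspt-roberta-large",  # checkpoint consumed by the CSQA stage
    num_train_epochs=5,
    learning_rate=3e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True),
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("cspt-roberta-large")
```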