Model Name:

RoBERTa + CSPT (single model)

Model Description:

We first train a generation model to generate synthetic data from ConceptNet. We then build the commonsense pre-trained model by fine-tuning the RoBERTa-large model on the synthetic data and the Open Mind Common Sense (OMCS) corpus. The final model is fine-tuned from the commonsense pre-trained model on CSQA.
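As a minimal sketch of the final stage, the snippet below shows how CSQA's five-way multiple-choice format could be handled with Hugging Face `transformers` (an assumed toolkit; the gist does not name one). The checkpoint path `cspt-roberta-large` is a hypothetical name for the commonsense pre-trained model from the previous stage, and the question shown is a CSQA-style example, not necessarily a dataset item.

```python
import torch
from transformers import RobertaForMultipleChoice, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
# "cspt-roberta-large" is a hypothetical local path to the commonsense
# pre-trained checkpoint produced by the pre-training stage.
model = RobertaForMultipleChoice.from_pretrained("cspt-roberta-large")

# A CSQA-style five-way multiple-choice question.
question = "Where would you expect to find a pizzeria while shopping?"
choices = ["chicago", "street", "little italy", "food court", "capital cities"]
label = torch.tensor([3])  # index of the gold answer ("food court")

# Pair the question with each candidate answer; RobertaForMultipleChoice
# expects tensors of shape (batch_size, num_choices, seq_len).
encoding = tokenizer(
    [question] * len(choices),
    choices,
    padding=True,
    truncation=True,
    return_tensors="pt",
)
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}  # batch of 1

# One illustrative fine-tuning step; a real run would loop over the CSQA
# training set with an optimizer and learning-rate scheduler.
outputs = model(**inputs, labels=label)
outputs.loss.backward()
print(outputs.logits)  # shape (1, 5): one score per answer choice
```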

Experiment Details:

Commonsense Pre-training:

- epochs: 5
- maximum sequence length: 35
- learning rate: 3e-5
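
Below is a minimal sketch of how these hyperparameters might be wired into the pre-training run, again assuming Hugging Face `transformers` and a masked-LM objective (an assumption; the gist does not state the training objective). The corpus file and output directory are hypothetical names.

```python
from transformers import (
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    RobertaForMaskedLM,
    RobertaTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
model = RobertaForMaskedLM.from_pretrained("roberta-large")

# One synthetic/OMCS sentence per line, truncated to the 35-token maximum.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="synthetic_plus_omcs.txt",  # hypothetical corpus file
    block_size=35,  # maximum sequence length
)

args = TrainingArguments(
    output_dir="cspt-roberta-large",  # checkpoint consumed by the CSQA stage
    num_train_epochs=5,
    learning_rate=3e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True),
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("cspt-roberta-large")
```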