- Design a procedure for evaluating models and implement it as a script
- Look for more datasets to evaluate the models on
- WikiQA [Ranking/Regression]
- QuoraQP [Binary Classification]
- The Stanford Natural Language Inference (SNLI) Corpus [Multi-Class Classification]
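
One possible starting point for the evaluation script is sketched below. It dispatches on the task type attached to each dataset above. The metric choices (mean average precision for ranking, accuracy and F1 for binary classification, accuracy for multi-class classification) are assumptions and not something these notes specify. All function names are hypothetical, and the sketch assumes gold labels and model outputs are already loaded as plain Python lists.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if not y_true:
        return 0.0
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_f1(y_true, y_pred, positive=1):
    """F1 score for the positive class; 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_average_precision(query_ids, labels, scores):
    """MAP over queries; queries without a relevant candidate are skipped."""
    by_query = defaultdict(list)
    for query_id, label, score in zip(query_ids, labels, scores):
        by_query[query_id].append((score, label))

    ap_values = []
    for candidates in by_query.values():
        # Rank candidates for this query from highest to lowest score.
        ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
        hits = 0
        precisions = []
        for rank, (_, label) in enumerate(ranked, start=1):
            if label == 1:
                hits += 1
                precisions.append(hits / rank)
        if precisions:
            ap_values.append(sum(precisions) / len(precisions))
    return sum(ap_values) / len(ap_values) if ap_values else 0.0


def evaluate(task, y_true, y_pred, query_ids=None):
    """Return a dict of metrics for one task type (assumed task names)."""
    if task == "ranking":
        if query_ids is None:
            raise ValueError("ranking evaluation needs query_ids")
        # For ranking, y_pred holds relevance scores rather than labels.
        return {"map": mean_average_precision(query_ids, y_true, y_pred)}
    if task == "binary":
        return {
            "accuracy": accuracy(y_true, y_pred),
            "f1": binary_f1(y_true, y_pred),
        }
    if task == "multiclass":
        return {"accuracy": accuracy(y_true, y_pred)}
    raise ValueError(f"unknown task type: {task}")
```

Under these assumptions, a WikiQA-style run would call `evaluate("ranking", labels, scores, query_ids=question_ids)`, while QuoraQP and SNLI runs would call `evaluate("binary", ...)` and `evaluate("multiclass", ...)` with predicted labels.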