- Introduces fastText, a simple and highly efficient approach for text classification.
- On par with deep learning models in terms of accuracy while being orders of magnitude faster to train and evaluate (a usage sketch follows the links below).
- Link to the paper
- Link to code
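A minimal usage sketch of this kind of classifier via the `fasttext` Python package; the package choice, the tiny `train.txt` file, its `__label__` format, and the hyperparameter values are illustrative assumptions rather than details taken from the summary.

```python
import fasttext

# Write a tiny training file in fastText's supervised format (made-up data).
with open("train.txt", "w") as f:
    f.write("__label__positive this movie was great\n")
    f.write("__label__positive wonderful acting and a clever script\n")
    f.write("__label__negative terrible plot and acting\n")
    f.write("__label__negative boring from start to finish\n")

# Linear classifier over bag-of-n-gram features; hyperparameters are illustrative.
model = fasttext.train_supervised(
    input="train.txt",
    lr=0.5,         # learning rate
    epoch=25,       # passes over the (tiny) training set
    wordNgrams=2,   # include word bigrams as features
)

# Predict the label and its probability for a new piece of text.
labels, probs = model.predict("the acting was wonderful")
print(labels, probs)
```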
- Introduces a new global log-bilinear regression model which combines the benefits of both global matrix factorization and local context window methods.
- Global matrix factorization methods decompose large matrices (e.g., of word co-occurrence counts) into low-rank approximations.
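The wording of the summary matches the GloVe model; assuming that, here is a numpy sketch of its weighted least-squares objective, which penalizes the squared difference between the dot product of two word vectors (plus biases) and the log co-occurrence count, weighted by f(X_ij). The toy matrix and vector dimensions are made up; `x_max=100` and `alpha=0.75` are the values usually quoted for the model.

```python
import numpy as np

def glove_loss(X, W, W_tilde, b, b_tilde, x_max=100.0, alpha=0.75):
    """Sum of f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2 over
    nonzero co-occurrence counts X_ij."""
    loss = 0.0
    for i, j in zip(*np.nonzero(X)):
        weight = min((X[i, j] / x_max) ** alpha, 1.0)          # f(X_ij)
        diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
        loss += weight * diff ** 2
    return loss

# Toy setup: 4-word vocabulary, 5-dimensional vectors (purely illustrative).
rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(4, 4)).astype(float)   # co-occurrence counts
W, W_tilde = rng.normal(size=(4, 5)), rng.normal(size=(4, 5))
b, b_tilde = rng.normal(size=4), rng.normal(size=4)
print(glove_loss(X, W, W_tilde, b, b_tilde))
```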
- Algorithm to compute the similarity between two nodes of a graph (or of a graph derived from any other kind of dataset); a sketch of the iteration follows below.
- Link to the paper
- Input: A directed graph G = (V, E) where V represents vertices and E represents edges.
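The description reads like SimRank; under that assumption, a minimal sketch of the iteration: every node is maximally similar to itself, and two distinct nodes are similar to the extent that their in-neighbours are similar, damped by a decay constant. The toy graph and the constant are illustrative.

```python
import itertools
from collections import defaultdict

def simrank(nodes, edges, C=0.8, iters=10):
    """Iterative pairwise similarity on a directed graph G = (V, E)."""
    in_nbrs = defaultdict(list)
    for u, v in edges:
        in_nbrs[v].append(u)
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iters):
        new_sim = {}
        for a, b in itertools.product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
            elif in_nbrs[a] and in_nbrs[b]:
                total = sum(sim[(u, v)] for u in in_nbrs[a] for v in in_nbrs[b])
                new_sim[(a, b)] = C * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
            else:
                new_sim[(a, b)] = 0.0   # no in-neighbours to compare
        sim = new_sim
    return sim

# Toy graph (made up): two nodes pointed at by partly overlapping sources.
nodes = ["UnivA", "UnivB", "ProfX", "ProfY"]
edges = [("UnivA", "ProfX"), ("UnivB", "ProfX"), ("UnivA", "ProfY")]
print(simrank(nodes, edges)[("ProfX", "ProfY")])
```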
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
- The paper explores the strengths and weaknesses of different evaluation metrics for end-to-end dialogue systems (in an unsupervised setting).
- Link to the paper
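Word-overlap metrics such as BLEU are among the unsupervised metrics a study like this examines; for reference, this is how a sentence-level BLEU score for a generated response is typically computed. The use of NLTK and the example sentences are illustrative choices, not prescriptions from the paper.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Toy dialogue turn (made up): reference response vs. model response.
reference = "i am doing well thanks for asking".split()
hypothesis = "i am fine thank you".split()

# Smoothing keeps short responses from scoring zero when higher-order
# n-grams never overlap with the reference.
score = sentence_bleu([reference], hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```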
- Task of translating natural language queries into regular expressions without using domain-specific knowledge.
- Proposes a methodology for collecting a large corpus of natural language and regular expression pairs (an illustrative pair is shown below).
- Reports a performance gain of 19.6% over state-of-the-art models.
- Link to the paper
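A hypothetical (description, regular expression) pair of the kind such a corpus would contain, checked with Python's `re` module; the pair is made up for illustration and is not drawn from the paper's dataset.

```python
import re

description = "lines containing the word 'dog' somewhere before the word 'cat'"
regex = r"\bdog\b.*\bcat\b"     # a candidate translation of the description

for line in ["the dog chased the cat", "the cat chased the dog"]:
    matched = re.search(regex, line) is not None
    print(f"{line!r} -> {matched}")
```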
- Large-scale natural language understanding task: predict text values given a knowledge base.
- Accompanied by a large dataset generated using Wikipedia.
- Link to the paper
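The summary does not spell out the instance format, so the following is a purely hypothetical sketch of what a (document, property, value) prediction instance for such a task could look like; the field names and content are assumptions.

```python
# Hypothetical instance: read the document and the knowledge-base property,
# then produce the text value as the prediction target.
instance = {
    "document": "Folkart Towers are twin skyscrapers in the Turkish city of Izmir.",
    "property": "country",
    "answer": "Turkey",
}
print(instance["property"], "->", instance["answer"])
```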
- Presents WikiQA - a publicly available set of question and sentence pairs for open-domain question answering.
- Link to the paper
- 3047 questions sampled from Bing query logs.
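Presumably each question is paired with candidate answer sentences and a relevance judgement (an assumption here); a hypothetical example of pairs in that shape, not quoted from the dataset.

```python
# Made-up question/sentence pairs with binary relevance labels.
pairs = [
    ("how are glaciers formed",
     "Glaciers form when snow accumulates faster than it melts.", 1),
    ("how are glaciers formed",
     "Glaciers are found on every continent except Australia.", 0),
]
for question, sentence, label in pairs:
    print(f"[{label}] {question} -> {sentence}")
```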
- Builds a supervised reading comprehension dataset from a news corpus (a toy instance sketch follows the link below).
- Compares the performance of neural models against state-of-the-art natural language processing models on the reading comprehension task.
- Link to the paper
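One common way to build such a dataset from news articles is to pair an article with one of its summary points and blank out an entity to form a cloze-style query; whether this matches the paper's exact construction is an assumption here. A toy sketch of that idea, with made-up text:

```python
# Turn (article, highlight) into a cloze-style (context, query, answer) instance
# by removing an entity from the highlight.
article = ("The spacecraft launched from the Kennedy Space Center on Monday "
           "and is expected to reach orbit within hours.")
highlight = "The spacecraft launched from the Kennedy Space Center."
answer = "Kennedy Space Center"

instance = {
    "context": article,
    "query": highlight.replace(answer, "@placeholder"),
    "answer": answer,
}
print(instance["query"])
```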
- The paper presents a suite of benchmark tasks to evaluate end-to-end dialogue systems such that performing well on the tasks is a necessary (but not sufficient) condition for a fully functional dialogue agent.
- Link to the paper
- Created using large-scale, real-world sources: OMDB (Open Movie Database), MovieLens, and Reddit.
- The paper explains how to apply dropout to LSTMs and how it could reduce overfitting in tasks like language modelling, speech recognition, image caption generation and machine translation.
- Link to the paper
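The recipe amounts to applying dropout only to the non-recurrent connections (embedding output, between stacked layers, and before the decoder), never to the hidden-to-hidden transition. A sketch of that pattern in PyTorch, where the `dropout` argument of `nn.LSTM` acts only between stacked layers; the vocabulary size, layer sizes, and dropout rate are illustrative choices.

```python
import torch
import torch.nn as nn

class RegularizedLSTM(nn.Module):
    """Language-model-style LSTM with dropout on non-recurrent connections only."""
    def __init__(self, vocab_size=10000, embed_dim=200, hidden_dim=200,
                 num_layers=2, p_drop=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.drop = nn.Dropout(p_drop)
        # This dropout is applied between stacked LSTM layers, not on the
        # recurrent (hidden-to-hidden) connections.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers,
                            dropout=p_drop, batch_first=True)
        self.decoder = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        x = self.drop(self.embed(tokens))      # dropout on the input side
        out, _ = self.lstm(x)
        return self.decoder(self.drop(out))    # dropout before the decoder

# Toy forward pass with random token ids (shapes are illustrative).
model = RegularizedLSTM()
logits = model(torch.randint(0, 10000, (4, 35)))
print(logits.shape)   # (batch=4, seq_len=35, vocab=10000)
```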
- Regularisation method that drops out (or temporarily removes) units in a neural network.
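A minimal numpy sketch of the mechanism: each unit is kept with probability `p_keep` and the survivors are rescaled so the expected activation is unchanged (the "inverted dropout" convention, a common implementation choice rather than something stated in the summary).

```python
import numpy as np

def dropout(activations, p_keep=0.5, rng=None):
    """Zero each unit with probability 1 - p_keep and rescale the rest."""
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) < p_keep
    return activations * mask / p_keep

h = np.ones((2, 8))                  # toy layer activations
print(dropout(h, p_keep=0.5, rng=np.random.default_rng(0)))
```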