Notes for "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks" Paper

Introduction

The paper presents a framework and a set of 20 synthetic toy tasks (classified by the skill they test) for analyzing the performance of different machine learning algorithms.

Tasks

  • Single/Two/Three Supporting Facts: Questions where one (or more) supporting facts provide the answer. The more supporting facts required, the harder the task (see the sample instance after this list).
  • Two/Three Argument Relations: Requires differentiating between the subjects and objects of a relation.
  • Yes/No Questions: True/False questions.
  • Counting/List/Set Questions: Requires the ability to count or list objects with a certain property.
  • Simple Negation and Indefinite Knowledge: Tests the ability to handle negation constructs and model sentences that describe a possibility and not a certainty.
  • Basic Coreference, Conjunctions, and Compound Coreference: Requires the ability to handle different levels of coreference.
  • Time Reasoning: Requires understanding the use of time expressions in sentences.
  • Basic Deduction and Induction: Tests basic deduction and induction via inheritance of properties.
  • Position and Size Reasoning: Requires reasoning about the relative positions and sizes of objects.
  • Path Finding: Requires finding a path between locations.
  • Agent's Motivation: Asks why an agent performs an action, i.e., what the state of the agent is.
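
For concreteness, an instance of the Two Supporting Facts task looks like this (in the style of the paper's examples):

```
John is in the playground.
Bob is in the office.
John picked up the football.
Bob went to the kitchen.
Where is the football?  A: playground  (supporting facts: 1 and 3)
```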

Dataset

  • The dataset is available here and the source code to generate the tasks is available here.
  • The different tasks are independent of each other.
  • For supervised training, the set of relevant (supporting) statements is provided along with the questions and answers; the file format is shown below.
  • The tasks are available in English, in Hindi, and in a version with the English words shuffled (so that models cannot exploit prior knowledge of English).
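
The released files use a simple line-numbered format: statement lines carry an ID that resets to 1 at the start of each story, and question lines carry the question, the answer, and the IDs of the supporting facts, separated by tabs. A minimal Python parser for this layout (a sketch, not the official tooling):

```python
def parse_babi(path):
    """Parse a bAbI task file into (story, question, answer, support) tuples."""
    samples, story = [], []
    with open(path) as f:
        for line in f:
            idx, _, text = line.strip().partition(" ")
            if int(idx) == 1:      # IDs reset to 1 at the start of a new story
                story = []
            if "\t" in text:       # question line: question \t answer \t supporting IDs
                question, answer, supports = text.split("\t")
                support_ids = [int(i) for i in supports.split()]
                samples.append((list(story), question, answer, support_ids))
            else:                  # plain statement: add to the running story
                story.append(text)
    return samples
```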

Data Simulation

  • The simulated world consists of entities of various types (locations, objects, persons, etc.) and of actions that operate on these entities.
  • These entities have their internal state and follow certain rules as to how they interact with other entities.
  • Basic simulations take the form of internal commands, e.g. Bob go school, which are turned into natural language with a simple grammar (a sketch follows this list).
  • To add variation, synonyms are used for entities and actions.
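
A minimal sketch of the command-to-text pipeline described above; the entities, the action set, and the grammar here are illustrative placeholders, not the paper's actual simulator:

```python
import random

ACTORS = ["Bob", "Mary", "John"]                    # person entities (illustrative)
PLACES = ["school", "kitchen", "garden"]            # location entities (illustrative)
GO_VERBS = ["went to", "moved to", "travelled to"]  # synonyms for the 'go' action

def simulate(num_steps):
    """Sample internal commands like ('go', 'Bob', 'school') and track
    the world state (who is where) after each step."""
    state, commands = {}, []
    for _ in range(num_steps):
        actor, place = random.choice(ACTORS), random.choice(PLACES)
        commands.append(("go", actor, place))
        state[actor] = place
    return commands, state

def render(command):
    """Turn an internal command into a natural-language statement."""
    _, actor, place = command
    return f"{actor} {random.choice(GO_VERBS)} the {place}."

commands, state = simulate(3)
print("\n".join(render(c) for c in commands))
actor = commands[-1][1]
print(f"Where is {actor}?  ->  {state[actor]}")
```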
Experiments

    Methods

    • N-gram classifier baseline (a generic sketch appears after this list)
    • LSTMs
    • Memory Networks (MemNNs)
    • Structured SVM incorporating externally labeled data
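
A generic sketch of an N-gram classifier baseline using scikit-learn; the feature set here (bag of 1- to 3-grams over the concatenated story and question) is a simplification, not the paper's exact features:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy (story + question) -> answer-word pairs; real training would use
# the parsed bAbI samples from above.
train_texts = [
    "Mary went to the kitchen. John went to the garden. Where is Mary?",
    "Mary went to the kitchen. John went to the garden. Where is John?",
    "John went to the school. Mary went to the garden. Where is John?",
]
train_answers = ["kitchen", "garden", "school"]

# Bag of 1- to 3-grams feeding a linear classifier over answer words.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_answers)
print(model.predict(["Mary went to the school. Where is Mary?"]))
```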

    Extensions to Memory Networks

    • Adaptive Memories - learn the number of memory hops to perform instead of using a fixed value of 2 hops (see the schematic after this list).
    • N-grams - Use a bag of 3-grams instead of a bag-of-words.
    • Nonlinearity - apply a 2-layer neural network with a tanh nonlinearity in the matching function.
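
A schematic of the adaptive-memory idea in NumPy: a special STOP entry is appended to the memories, and the model keeps hopping until it is selected. The dot-product scoring and additive query update are simplified stand-ins for the paper's learned matching function:

```python
import numpy as np

def adaptive_hops(query, memories, stop_vec, max_hops=10):
    """Retrieve supporting memories one hop at a time until the special
    STOP entry scores highest, instead of always doing a fixed 2 hops.

    query, stop_vec: (d,) vectors; memories: (n, d) matrix of embedded facts.
    """
    retrieved = []
    candidates = np.vstack([memories, stop_vec])  # last row acts as STOP
    for _ in range(max_hops):
        scores = candidates @ query               # match each candidate to the query
        best = int(np.argmax(scores))
        if best == len(memories):                 # STOP chosen: enough facts gathered
            break
        retrieved.append(best)
        query = query + memories[best]            # fold the retrieved fact into the query
    return retrieved
```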

    Structured SVM

    • Uses coreference resolution and semantic role labeling (SRL), which are themselves trained on a large amount of external data.
    • First trained with strong supervision to find the supporting statements; a similar SVM is then used to find the response.

    Results

    • Standard MemNNs outperform the N-gram and LSTM baselines but still fail on a number of tasks.
    • MemNNs with adaptive memory improve performance on the multiple-supporting-facts tasks and the basic induction task.
    • MemNNs with N-gram modeling improve results on tasks where word order matters.
    • MemNNs with nonlinearity perform well on the Yes/No questions task and the indefinite knowledge tasks.
    • The structured SVM outperforms vanilla MemNNs but is not as good as MemNNs with the modifications above.
    • The structured SVM performs very well on the path finding task thanks to its non-greedy search over supporting facts.