Notes on the paper "A Neural Conversational Model"
- Neural Conversational Model (NCM)
- A Recurrent Neural Network (RNN) reads the input sentence one token at a time and predicts the output sequence one token at a time (a minimal encoder-decoder sketch follows these notes).
- Learns by backpropagation.
- The model is trained to maximise the log-probability of the correct sequence given its context, i.e. to minimise the per-token cross-entropy (see the training step below).
- Inference is greedy: each predicted output token is fed back as input to predict the next output token (see greedy_decode below).
- IT HelpDesk dataset of conversations about computer-related issues.
- OpenSubtitles dataset containing movie conversations.
- The paper reports sample conversations between a human actor and the NCM.
- The NCM achieves lower perplexity than an n-gram baseline.
- The NCM outperforms CleverBot in a subjective test in which human evaluators graded the two systems.
- End-to-end training without handcrafted rules.
- The underlying architecture (the sequence-to-sequence framework) can be leveraged for machine translation, question answering, etc.
- The responses are simple, short and at times inconsistent.
- The objective function of the sequence-to-sequence framework (next-token prediction) is not designed to capture the actual objective of a conversational model.
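
A minimal PyTorch sketch of the encoder-decoder idea above. This is an illustrative assumption rather than the paper's actual implementation (the paper uses large single- or two-layer LSTMs); the module layout, names, and dimensions here are placeholders:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encoder-decoder RNN: the encoder reads the input one token at a time;
    the decoder predicts the reply one token at a time, conditioned on the
    encoder's final hidden state."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt_in):
        # Encode the whole input turn; keep only the final (h, c) state.
        _, state = self.encoder(self.embed(src))
        # Decode with teacher forcing: tgt_in is the gold reply shifted right.
        dec_out, _ = self.decoder(self.embed(tgt_in), state)
        return self.out(dec_out)  # (batch, tgt_len, vocab_size) logits
```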
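Training backpropagates the token-level cross-entropy between the decoder's predictions and the correct reply, i.e. it maximises sum_t log p(y_t | y_<t, context); perplexity is the exponential of the mean loss. The batch below is random toy data, and the hyperparameters are arbitrary:

```python
vocab_size = 1000
model = Seq2Seq(vocab_size)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Toy batch of random token ids; real inputs would be tokenised dialogue turns.
src = torch.randint(0, vocab_size, (8, 12))   # input turns
tgt = torch.randint(0, vocab_size, (8, 10))   # gold replies
tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]     # shift for teacher forcing

logits = model(src, tgt_in)
loss = criterion(logits.reshape(-1, vocab_size), tgt_out.reshape(-1))
loss.backward()                               # backpropagation
optimizer.step()

perplexity = loss.exp()  # perplexity = exp(mean per-token cross-entropy)
```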
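Greedy inference as described in the notes: the arg-max token at each step is fed back as the next decoder input until an end-of-sequence token or a length cap. bos_id and eos_id are assumed special-token ids:

```python
@torch.no_grad()
def greedy_decode(model, src, bos_id, eos_id, max_len=20):
    """Feed each predicted token back in as the next decoder input."""
    _, state = model.encoder(model.embed(src))
    token = torch.full((src.size(0), 1), bos_id, dtype=torch.long)
    reply = []
    for _ in range(max_len):
        dec_out, state = model.decoder(model.embed(token), state)
        token = model.out(dec_out).argmax(dim=-1)  # most likely next token
        reply.append(token)
        if (token == eos_id).all():                # every sequence finished
            break
    return torch.cat(reply, dim=1)                 # (batch, reply_len)
```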