Notes for paper titled A Neural Conversational Model

A Neural Conversational Model

Introduction

Model

  • Neural Conversational Model (NCM)
  • A Recurrent Neural Network (RNN) reads the input sentence, one token at a time, and predicts the output sequence, one token at a time.
  • The model is trained by backpropagation.
  • Training maximises the log-probability of the correct sequence given its context (equivalently, it minimises the cross entropy).
  • Inference is greedy: each predicted output token is fed back as input to predict the next output token.
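
The training objective and the greedy inference loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `toy_next_token_probs` is a hypothetical lookup table standing in for the trained RNN, and the `<eos>` marker and `max_len` cap are assumptions made for the example.

```python
import math

EOS = "<eos>"  # assumed end-of-sequence marker

def toy_next_token_probs(context, generated):
    """Hypothetical stand-in for the trained RNN.

    Returns a probability distribution over the next token. A real model
    would condition on `context`; this toy table only looks at the tokens
    generated so far.
    """
    table = {
        (): {"i": 0.6, "hello": 0.4},
        ("i",): {"am": 0.7, "is": 0.3},
        ("i", "am"): {"fine": 0.8, EOS: 0.2},
        ("i", "am", "fine"): {EOS: 0.9, "fine": 0.1},
    }
    return table.get(tuple(generated), {EOS: 1.0})

def greedy_decode(context, next_token_probs, max_len=20):
    """Pick the most probable token at each step and feed it back in."""
    generated = []
    for _ in range(max_len):
        probs = next_token_probs(context, generated)
        token = max(probs, key=probs.get)
        if token == EOS:
            break
        generated.append(token)
    return generated

def sequence_cross_entropy(context, target, next_token_probs):
    """Training loss: summed negative log-probability of the correct tokens."""
    loss = 0.0
    for t, token in enumerate(target):
        probs = next_token_probs(context, target[:t])
        loss -= math.log(probs[token])
    return loss

reply = greedy_decode(["how", "are", "you"], toy_next_token_probs)
loss = sequence_cross_entropy(
    ["how", "are", "you"], ["i", "am", "fine", EOS], toy_next_token_probs
)
```

Minimising `sequence_cross_entropy` over the training pairs is the same as maximising the probability the model assigns to the correct reply, which is why the two phrasings of the objective are interchangeable.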

Dataset

  • IT HelpDesk dataset of conversations about computer-related issues.
  • OpenSubtitles dataset containing movie conversations.

Results

  • The paper reports sample conversations between a human and the NCM.
  • NCM achieves lower perplexity than an n-gram model.
  • NCM outperforms CleverBot in a subjective test in which human evaluators grade the two systems.
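
For reference, perplexity is the exponential of the average negative log-probability that a model assigns to the tokens actually observed, so lower is better. The sketch below shows the calculation on made-up probabilities; the function name and inputs are illustrative and are not taken from the paper.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability of the observed tokens)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)

# A model that is always unsure between 4 equally likely tokens
# has a perplexity of 4.
uniform_ppl = perplexity([0.25, 0.25, 0.25, 0.25])

# A model that assigns higher probability to the observed tokens
# has lower perplexity.
confident_ppl = perplexity([0.9, 0.8, 0.95])
```

Intuitively, a perplexity of k means the model is about as uncertain as if it were choosing uniformly among k tokens at every step.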

Strengths

  • Domain-agnostic.
  • End-To-End training without handcrafted rules.
  • Underlying architecture (Sequence To Sequence Framework) can be leveraged for machine translation, question answering etc.

Weaknesses

  • The responses are simple, short, and at times inconsistent.
  • The objective function of the Sequence To Sequence Framework (next-token prediction) is not designed to capture the actual goal of a conversation, which is typically longer-term and based on exchanging information.