@aneesh-joshi
Last active July 19, 2018 15:02

Resources on QA-Transfer Model

QA-Transfer Model uses:

  • SQUAD-T dataset
  • BiDAF model (with end layers changed)

The BiDAF model has 3 open source implementations:

I am currently working on getting a working BiDAF, hopefully in Keras.

The original code

  • in tf 0.12.1
  • trains very slowly (6 seconds per iteration) without GPU
  • version mismatch of tf 0.11 and recent CUDA drivers
  • tf code is difficult to read and maintain (look at code in the link above to get an idea)

AllenAI code

  • is part of the DeepQA toolkit (now archived/deprecated/closed)
  • uses a lot of internal code (custom layers, models, etc.)
  • very well documented

PyTorch code

  • small and contained within itself
  • written in PyTorch, so hard to read and maintain (look at code in the link above to get an idea)
  • I have less familiarity with PyTorch

Currently, I am trying to port the original code from tf 0.12.1 to tf 1.3.0. Some functions have been removed in 1.3.0, which makes porting difficult. If this looks like it will fail, I will move to the AllenAI code, and then to the PyTorch code.
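As a rough aid for the port, the removed calls could be located with a small scanner like the one below. This is only a sketch: the note above does not say which functions are actually causing trouble, so the table assumes they are among the renames listed in the TensorFlow 1.0 release notes, and `find_renamed_calls` is a hypothetical helper name. It uses plain Python and does not import TensorFlow.

```python
import re

# Old name -> new name, taken from the TensorFlow 1.0 release notes.
# Assumption: the functions blocking the port are among these renames.
RENAMED = {
    "tf.mul": "tf.multiply",
    "tf.sub": "tf.subtract",
    "tf.neg": "tf.negative",
    "tf.pack": "tf.stack",
    "tf.unpack": "tf.unstack",
    "tf.select": "tf.where",
    "tf.batch_matmul": "tf.matmul",
    "tf.initialize_all_variables": "tf.global_variables_initializer",
}

# Word boundaries keep "tf.sub" from matching inside "tf.subtract".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMED) + r")\b"
)


def find_renamed_calls(source):
    """Return (line_number, old_name, new_name) for each outdated name."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMED[old]))
    return hits


if __name__ == "__main__":
    sample = "a = tf.mul(x, y)\nb = tf.subtract(x, y)\nc = tf.pack([a, b])\n"
    for lineno, old, new in find_renamed_calls(sample):
        print("line %d: %s -> %s" % (lineno, old, new))
```

The scanner only reports names; it does not rewrite anything. Some changes in 1.0 are not pure renames (for example, `tf.concat` changed its argument order), so each hit would still need a manual check.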

My Goal

To get a working QA-Transfer model before the meeting on 23 July, 2018.

Links to papers
