Emaad Manzoor emaadmanzoor

emaadmanzoor / ExpandEdinburghFSDCorpus.md
Last active October 31, 2020 20:30
Expand the Edinburgh Twitter FSD corpus

Expand The Edinburgh Twitter FSD Corpus

The attached Python scripts take care of the following tedious work, so you can quickly get started on real work with the corpus:

  • Respect the Twitter API rate limits and throttle API hits.
  • Don't hit the API for tweet IDs that have already been expanded, so you can resume tweet expansion after stopping midway.
  • Parse the API response and dump it into the correct column in the sqlite3 database.
  • Gracefully handle exceptions while acquiring tweets from the API.
  • Wrap version 1.1 of the Twitter API.
  • Start from a specified tweet ID, assuming the input file is sorted in increasing order of tweet ID.
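
To make the behaviours listed above concrete, here is a minimal sketch of a resumable, throttled expansion loop. It is not the code from the attached scripts: the function `expand_tweets`, the `fetch_tweet` callable, the `tweets` table layout, and the `min_interval` parameter are all hypothetical names chosen for illustration. The real API call is left to the caller, so the sketch makes no claims about the Twitter API itself.

```python
import sqlite3
import time


def expand_tweets(tweet_ids, fetch_tweet, conn, start_id=0,
                  min_interval=0.0, sleep=time.sleep):
    """Expand tweet IDs into database rows, skipping IDs already stored.

    fetch_tweet is a caller-supplied function (hypothetical) that takes a
    tweet ID and returns a dict with a "text" key, or raises on failure.
    Returns the number of API calls attempted in this run.
    """
    # Illustrative schema only; the real database layout may differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "tweet_id INTEGER PRIMARY KEY, text TEXT, error TEXT)"
    )
    attempted = 0
    for tweet_id in tweet_ids:
        if tweet_id < start_id:
            continue  # start from the specified tweet ID
        seen = conn.execute(
            "SELECT 1 FROM tweets WHERE tweet_id = ?", (tweet_id,)
        ).fetchone()
        if seen:
            continue  # already expanded in an earlier run
        if attempted:
            sleep(min_interval)  # throttle between consecutive API hits
        try:
            payload = fetch_tweet(tweet_id)
            row = (tweet_id, payload.get("text"), None)
        except Exception as exc:
            # Record the failure instead of aborting the whole run.
            row = (tweet_id, None, str(exc))
        conn.execute("INSERT INTO tweets VALUES (?, ?, ?)", row)
        conn.commit()  # commit per tweet so an interrupted run loses nothing
        attempted += 1
    return attempted
```

One design choice worth noting: this sketch stores failed lookups alongside successful ones, so a resumed run does not retry them. That suits tweets that have been deleted, but transient errors would need separate handling if retries are wanted.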