@edsu
Last active December 1, 2016 15:49
#!/bin/bash
#
# Before running this you'll want to
#
# 0. find a server you can run a process on for a while
# 1. pip install twarc
# 2. twarc configure
# 3. mkdir ids
# 4. download the files from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi%3A10.7910%2FDVN%2FPDI7IN into the ids directory
# 5. mkdir tweets
# 6. start a screen session so you can close your terminal
# 7. ./hydrate.sh
#
# Then watch twarc.log to see what's going on, and expect to wait a few weeks for it to finish.
#
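# Loop over the id files with a glob rather than parsing ls, skipping the README.
for path in ids/*; do
    [ -e "$path" ] || continue
    file=$(basename "$path")
    case "$file" in *README*) continue ;; esac
    echo "hydrating $file"
    twarc hydrate "$path" | gzip > "tweets/$file.gz"
done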
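
Since the run takes weeks, it may be worth making the loop safe to restart. The following is only a sketch of one way to do that, not part of the original script: the function name `hydrate_all` and the `.part` suffix are made up for illustration, and it assumes `twarc hydrate` exits non-zero when it fails. Files that already have a non-empty output are skipped, and each output is written to a temporary name and renamed only after the pipeline succeeds.

```bash
#!/bin/bash
# Sketch of a restartable variant of hydrate.sh (hypothetical, see notes above).

# Make a pipeline fail if any command in it fails, not just the last one.
set -o pipefail

# hydrate_all IDS_DIR OUT_DIR
# Hydrates every id file in IDS_DIR into OUT_DIR/<name>.gz, skipping
# README files and anything that was already hydrated on an earlier run.
hydrate_all() {
    local ids_dir=$1 out_dir=$2
    local path file out
    for path in "$ids_dir"/*; do
        [ -f "$path" ] || continue
        file=$(basename "$path")
        case "$file" in *README*) continue ;; esac
        out="$out_dir/$file.gz"
        if [ -s "$out" ]; then
            echo "skipping $file (already hydrated)"
            continue
        fi
        echo "hydrating $file"
        # Write to a temporary name first, so an interrupted run never
        # leaves a partial file that looks complete.
        if twarc hydrate "$path" | gzip > "$out.part"; then
            mv "$out.part" "$out"
        else
            echo "failed on $file, will retry on the next run" >&2
        fi
    done
}
```

To use it in place of the loop above, call `hydrate_all ids tweets` at the end of the script. If the process dies partway through, running it again picks up at the first file without a finished `.gz`.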