duner/README.md

Last active Aug 2, 2020
Twitter Archive to JSON

If you download your personal Twitter archive, you don't quite get the data as JSON, but as a series of .js files, one for each month. (These are meant to replicate the Twitter API responses for the front-end part of the downloadable archive.)

But if you want to use the data in those files, which is far richer than the CSV data, for some analysis or an app, just run this script.

Run sh ./twitter-archive-to-json.sh in the same directory as the /tweets folder that comes with the archive download, and you'll get two files:

  • tweets.json — a JSON list of the objects
  • tweets_dict.json — a JSON dictionary where each Tweet's key is its id_str

You'll also get a /json-tweets directory which has the individual JSON files for each month of tweets.
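
As a rough illustration of how the two outputs differ, the sketch below builds a tiny stand-in for tweets.json, derives the dictionary form with the same jq filter the script uses, and looks one tweet up by its id_str. The two sample tweets, their fields, and the temporary directory are made up for the example, and jq is assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch only: the sample tweets below are invented for illustration.
workdir=$(mktemp -d)
cat > "$workdir/tweets.json" <<'EOF'
[ {"id_str": "101", "text": "first"}, {"id_str": "102", "text": "second"} ]
EOF

if command -v jq >/dev/null 2>&1; then
  # Same filter as the script: turn the list into an object keyed by id_str.
  jq 'map({"key": .id_str | tostring, "value": .}) | from_entries' \
    "$workdir/tweets.json" > "$workdir/tweets_dict.json"

  # List form: count the tweets.
  tweet_count=$(jq 'length' "$workdir/tweets.json")
  # Dictionary form: fetch one tweet's text directly by its id.
  tweet_text=$(jq -r '."102".text' "$workdir/tweets_dict.json")
  echo "count=$tweet_count text=$tweet_text"
else
  echo "jq not found; install it to run this example" >&2
fi
```

The list form suits iteration and counting, while the dictionary form avoids scanning the whole list when you already know a tweet's id.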

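#!/usr/bin/env bash
# Convert the monthly tweets/*.js files of a Twitter archive into JSON.

# jq is needed for the final step; fail early if it is missing.
command -v jq >/dev/null 2>&1 || { echo "jq is required but not installed" >&2; exit 1; }

# -p keeps mkdir from failing when the script is run a second time.
mkdir -p json-tweets .tmp-json-tweets
# Start from empty output files.
: > tweets.json
: > .tmp-tweets.json

echo "Processing Tweet.js files..."
for f in tweets/*.js; do
  # Drop the first line, which is not JSON, and keep the rest.
  # "${f%.js}" is tweets/<name>, so the output lands in json-tweets/<name>.json.
  tail -n +2 "$f" > json-"${f%.js}".json
done

echo "Creating tweets.json..."
echo "[ {" >> .tmp-tweets.json
for f in json-tweets/*.json; do
  # Strip each month's first and last line (the list delimiters),
  # then append a separator between months.
  tail -n +2 "$f" | sed '$d' > .tmp-"$f"
  echo "}, {" >> .tmp-"$f"
  cat .tmp-"$f" >> .tmp-tweets.json
  rm .tmp-"$f"
done
rmdir .tmp-json-tweets

# Remove the trailing separator and close the list.
sed '$d' .tmp-tweets.json > tweets.json
echo "} ]" >> tweets.json
rm .tmp-tweets.json

# Build a dictionary keyed by each tweet's id_str.
jq 'map({"key": .id_str | tostring, "value": .}) | from_entries' tweets.json > tweets_dict.json
echo "DONE"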

@vineethjose vineethjose commented Feb 14, 2019

Can someone help with this?

Processing Tweet.js files...
tail: tweets/*.js: No such file or directory
Creating tweets.json...
/Users/xxx/Downloads/twitter/twitter-archive-to-json.sh: line 27: jq: command not found
DONE

@edsu edsu commented Apr 25, 2019

You will want to install jq
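
A quick way to confirm whether jq is available before running the script might look like the sketch below. The helper name have_cmd is hypothetical, and the install commands in the messages are the usual Homebrew and Debian/Ubuntu ones; other systems will differ.

```shell
#!/usr/bin/env bash
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd jq; then
  echo "jq found: $(jq --version)"
else
  # Suggested install commands; adjust for your package manager.
  echo "jq is missing; install it first, e.g. 'brew install jq' on macOS" >&2
  echo "or 'sudo apt-get install jq' on Debian/Ubuntu." >&2
fi
```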

@thibaultmol thibaultmol commented Jul 4, 2019

Is it just me or does twitter now export in a single .js file instead?

@thibaultmol thibaultmol commented Jul 4, 2019

nvm, ignore that

@amandabee amandabee commented Nov 21, 2019

It is a giant json these days. You just have to strip off the leading window.YTD.tweet.part0 = to make it valid JSON

@tsuliwaensis tsuliwaensis commented May 12, 2020

Running the script just returns a zero-byte file for me.

@amandabee amandabee commented May 12, 2020

@tsuliwaensis You shouldn't need to run it anymore. Your archived tweets are already JSON, and once you edit the file to remove window.YTD.tweet.part0 = it will be valid JSON.
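
For anyone who would rather not edit the file by hand, the prefix can probably be stripped with sed, as in this minimal sketch. The file names tweet.js and tweet.json and the sample contents are assumptions for illustration; it also assumes the assignment sits at the very start of the first line, so check your own archive first.

```shell
#!/usr/bin/env bash
# Sketch only: this sample file stands in for the archive's tweet file.
workdir=$(mktemp -d)
cat > "$workdir/tweet.js" <<'EOF'
window.YTD.tweet.part0 = [ {
  "id_str" : "101"
} ]
EOF

# Remove the JavaScript assignment from the first line; what remains is JSON.
sed '1s/^window\.YTD\.tweet\.part0 = //' "$workdir/tweet.js" > "$workdir/tweet.json"

first_line=$(head -n 1 "$workdir/tweet.json")
echo "$first_line"
```

Writing to a new file rather than editing in place keeps the original export untouched if the pattern turns out not to match.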

@find-evil find-evil commented Aug 2, 2020

It is a giant json these days. You just have to strip off the leading window.YTD.tweet.part0 = to make it valid JSON

Thank you!
