If you download your personal Twitter archive, you don't quite get the data as JSON, but as a series of .js
files, one for each month (these are meant to replicate the Twitter API responses for the front-end part of the downloadable archive).
But if you want to use the data in those files, which is far richer than the CSV data, for some analysis or app, just run this script.
Run sh ./twitter-archive-to-json.sh in the same directory as the /tweets folder that comes with the archive download, and you'll get two files:
- tweets.json: a JSON list of the objects
- tweets_dict.json: a JSON dictionary where each Tweet's key is its id_str
You'll also get a /json-tweets directory which has the individual JSON files for each month of tweets.
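The per-month conversion can be sketched in a few lines of shell. This is a rough illustration, not the script's actual code. It assumes that the first line of each monthly .js file is a JavaScript assignment (something like Grailbird.data.tweets_2013_01 =) and that the rest of the file is a plain JSON array. The function name js_to_json and its two arguments are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper, not part of twitter-archive-to-json.sh.
# Assumption: line 1 of each .js file starts with a "name = " assignment
# and everything after that prefix is a JSON array.
js_to_json() {
    src_dir=$1   # directory holding the monthly .js files
    out_dir=$2   # directory to write the .json files into
    mkdir -p "$out_dir"
    for f in "$src_dir"/*.js; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        # On line 1 only, delete everything up to and including the first "=".
        sed '1s/^[^=]*=[[:space:]]*//' "$f" > "$out_dir/$(basename "$f" .js).json"
    done
}

# Example call: js_to_json ./tweets ./json-tweets
```

Because only the first line is touched, any = characters inside tweet text further down the file are left alone.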
A batch job for creating json digests from the js archive distribution of the Twitter archive from within the data directory could look like:

You can then dig into your data at will:
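For instance, here is a minimal sketch of two queries, assuming jq is installed and that tweets.json and tweets_dict.json have the shapes described above. The helper names count_tweets and tweet_text_by_id are hypothetical, and the text field is an assumption about what each tweet object contains.

```shell
#!/bin/sh
# Hypothetical helpers for exploring the generated files; they require jq.

# Print the number of tweets in a JSON list such as tweets.json.
count_tweets() {
    jq 'length' "$1"
}

# Print the text of one tweet from a dictionary keyed by id_str.
# $1 = path to tweets_dict.json, $2 = the id_str to look up
# Assumption: each tweet object has a "text" field.
tweet_text_by_id() {
    jq -r --arg id "$2" '.[$id].text' "$1"
}

# Example calls:
#   count_tweets tweets.json
#   tweet_text_by_id tweets_dict.json 1234567890
```

Passing the id through --arg keeps it a string, which matches the id_str keys and avoids the precision loss that large numeric ids can suffer.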
Please note that the batch job will update the file modification times for the *.js files from the ones provided by the archive to the moment of running the command, due to the -I (--ignore-times) switch, which makes rsync copy every file over itself.

Adapted from https://unix.stackexchange.com/a/527037/79223