@tomnomnom
Last active November 23, 2024 23:31
Plotting tweets over time

Plotting tweets with gnuplot

Guide to producing a chart like this one.

This is a bit of a hack job, natch.

Grab your Twitter archive and extract it. You need to find data/tweet-headers.js and make a copy of it:

cp data/tweet-headers.js tweets.json

The data is valid JSON once you remove the variable name and equals sign at the top of tweets.json:

window.YTD.tweet_headers.part0 = [
  {
    "tweet" : {
      "tweet_id" : "1858951780102443094",
      "user_id" : "17440273",
      "created_at" : "Tue Nov 19 19:13:36 +0000 2024"
    }
  },
  // ...

Make it look like this:

[
  {
    "tweet" : {
      "tweet_id" : "1858951780102443094",
      "user_id" : "17440273",
      "created_at" : "Tue Nov 19 19:13:36 +0000 2024"
    }
  },
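
If you'd rather not edit the file by hand, the same step can be sketched in Python. This is a minimal sketch, not part of the original scripts: the function names strip_js_assignment and convert_file are made up for illustration, and it assumes the prefix before the array contains no "[" and that nothing follows the closing "]".

```python
import json


def strip_js_assignment(text: str) -> list:
    """Drop everything before the first '[' and parse the rest as JSON.

    Assumes the JavaScript prefix (e.g. "window.YTD.tweet_headers.part0 = ")
    contains no '[' and that the file ends with the closing ']'.
    """
    start = text.index("[")
    return json.loads(text[start:])


def convert_file(src: str, dst: str) -> None:
    """Read the archive's .js file and write plain JSON to dst."""
    with open(src, encoding="utf-8") as f:
        tweets = strip_js_assignment(f.read())
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(tweets, f, indent=2)
```

Calling convert_file("data/tweet-headers.js", "tweets.json") would then replace both the cp and the manual edit.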

We want to parse that created_at field and get just the year-month-day and time of day into a file for gnuplot to read. I've written a little Python script to do that (tz-adjust.py), for which you may need to pip3 install pytz.

Sorry, I don't really know python so I might have done something silly in this bit, but it seemed like a good choice at the time. You might want to change the timezone in the script to match your own.

You'll need jq too for this next bit:

jq -r '.[].tweet.created_at' < tweets.json | python3 tz-adjust.py > tweet-times.txt

This uses jq to extract the created time and the Python script to parse it, adjust the timezone, and print one line per tweet (e.g. 2024-11-19 19:13). The shell redirects that output into tweet-times.txt.
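
If you don't have jq handy, the whole pipeline can be approximated in Python alone. This is a rough sketch rather than the script from this gist: format_created_at and tweet_times are hypothetical names, and the timezone is passed in as any tzinfo object instead of being hard-coded with pytz.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def format_created_at(created_at: str, tz: tzinfo) -> str:
    """Parse Twitter's created_at string and format it in the given timezone."""
    dt = datetime.strptime(created_at.strip(), "%a %b %d %H:%M:%S %z %Y")
    return dt.astimezone(tz).strftime("%Y-%m-%d %H:%M")


def tweet_times(tweets: list, tz: tzinfo) -> list:
    """Turn the parsed tweets.json list into 'YYYY-MM-DD HH:MM' lines."""
    return [format_created_at(t["tweet"]["created_at"], tz) for t in tweets]
```

For a named zone with daylight saving, something like zoneinfo.ZoneInfo("Europe/London") (standard library from Python 3.9, needs timezone data on the system) can be passed as tz. Load tweets.json with json.load and write the returned lines to tweet-times.txt, one per line.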

Now we can run the gnuplot script (you'll need to install gnuplot, obv):

gnuplot tweets.gnuplot

That should produce tweet-times.png

You'll probably want to adjust the title etc. so it doesn't say "Tweets by TomNomNom", the y-axis label if you changed the timezone, and the date range (set xrange ["2008-01-01":"2025-01-01"]) to match your own data.
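
To pick a date range without eyeballing the data, something like the following could work out the set xrange line from tweet-times.txt. It's only a sketch: xrange_line is a made-up helper, and it assumes every non-empty line starts with a four-digit year, as in the output of tz-adjust.py.

```python
def xrange_line(lines) -> str:
    """Build a gnuplot 'set xrange' line spanning whole years of the data."""
    # Each line looks like "2024-11-19 19:13"; the first four characters are the year.
    years = [int(line.strip()[:4]) for line in lines if line.strip()]
    # Start at Jan 1 of the earliest year, end at Jan 1 after the latest year.
    return 'set xrange ["%d-01-01":"%d-01-01"]' % (min(years), max(years) + 1)
```

Open tweet-times.txt, pass the file object to xrange_line, and paste the result into tweets.gnuplot in place of the existing set xrange line.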

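tweets.gnuplot:

# Input lines look like "2024-11-19 19:13": date in column 1, time in column 2
set datafile separator " "
set terminal pngcairo size 1200,800
set output "tweet-times.png"
set title "Tweets by TomNomNom"
set xlabel "Year"
set ylabel "Time of day (GMT)"
# x axis: dates. This timefmt is used to parse the xrange strings below.
set xdata time
set timefmt "%Y-%m-%d"
set format x "%Y"
set xrange ["2008-01-01":"2025-01-01"]
# y axis: time of day. timefmt is changed here, so keep it after the xrange line.
set ydata time
set timefmt "%H:%M"
set format y "%H:%M"
set yrange ["00:00":"23:59"]
set grid
set style data points
set pointsize 0.5
set key off
# Column 1 is parsed with an explicit date format; column 2 uses the current timefmt
plot 'tweet-times.txt' using (timecolumn(1,"%Y-%m-%d")):2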

tz-adjust.py:

import sys
import pytz
from datetime import datetime

output_tz = pytz.timezone('Europe/London')

for line in sys.stdin:
    dt = datetime.strptime(line.rstrip(), '%a %b %d %H:%M:%S %z %Y')
    local = dt.astimezone(output_tz)
    print(local.strftime('%Y-%m-%d %H:%M'))