@bluebossa63
Created June 28, 2021 00:26
delete old tweets from archive
#!/usr/bin/env python3
# Largely copied from http://www.mathewinkson.com/2015/03/delete-old-tweets-selectively-using-python-and-tweepy
# However, Mathew's script cannot delete tweets older than roughly a year (those tweets are not available from the Twitter API).
# This script complements it on first use: it reads your Twitter archive to find the ids of the old tweets to delete.
# How to use it:
# - download and extract your Twitter archive (tweet.js contains all your tweets with their dates and ids)
# - put this script in the extracted directory
# - fill in the secrets used to access Twitter's API on your behalf and, if needed, change days_to_keep
# - delete the few junk characters at the beginning of tweet.js, up to the first '[' (they crashed my json parser)
# - review the script! It has not been thoroughly tested and may behave unexpectedly
# - run this script
# - forget this script; you can now use Mathew's script for your future deletions
#
# License : Unlicense http://unlicense.org/
import tweepy
import json
from datetime import datetime, timedelta, timezone
from dateutil.parser import parse
import os
from os import path
# Fill in your own credentials here; never publish real secrets
consumer_key = 'YOUR_CONSUMER_KEY'
consumer_secret = 'YOUR_CONSUMER_SECRET'
access_token = 'YOUR_ACCESS_TOKEN'
access_token_secret = 'YOUR_ACCESS_TOKEN_SECRET'
days_to_keep = 7
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
cutoff_date = datetime.now(timezone.utc) - timedelta(days=days_to_keep)
print(cutoff_date)
data_dir = path.join(os.getcwd(), 'data')
tweets = []
with open(path.join(data_dir, "tweet.js"), 'r', encoding='UTF-8') as fp:
    js_file = fp.read()
    contents = json.loads(js_file)
    tweets.extend(contents)
for tweet in tweets:
    t = tweet['tweet']
    d = parse(t['created_at'])
    # Only tweets older than the cutoff are deleted
    if d < cutoff_date:
        try:
            api.destroy_status(t['id_str'])
            print(t['created_at'] + " " + t['id_str'])
        # TweepError is the Tweepy 3.x exception; Tweepy 4 renamed it to TweepyException
        except tweepy.TweepError as e:
            print("failed to delete " + t['id_str'])
            print(e.api_code)
            print(e.reason)
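
The header asks you to hand-edit tweet.js and to review the script before running it. A minimal sketch of both steps, using only the standard library, might look like the block below: it drops everything before the first '[' in code instead of by hand, then lists the ids that would be deleted without calling the API. The helper names parse_archive and ids_older_than, the sample prefix "window.YTD.tweet.part0 = ", and the timestamp layout in CREATED_AT_FORMAT are assumptions for illustration, not part of the original script; check them against your own archive.

```python
import json
from datetime import datetime, timezone

# Assumed layout of created_at in the archive, e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_archive(text):
    """Return the list stored in the raw text of tweet.js (hypothetical helper).

    Everything before the first '[' is dropped, which replaces the manual
    "delete the junk characters" step.
    """
    start = text.find("[")
    if start == -1:
        raise ValueError("no JSON array found in archive text")
    return json.loads(text[start:])


def ids_older_than(entries, cutoff):
    """Return id_str of every tweet created before cutoff (hypothetical helper).

    cutoff must be timezone-aware, like cutoff_date in the script.
    """
    old_ids = []
    for entry in entries:
        t = entry["tweet"]
        created = datetime.strptime(t["created_at"], CREATED_AT_FORMAT)
        if created < cutoff:
            old_ids.append(t["id_str"])
    return old_ids


# Made-up sample standing in for the contents of data/tweet.js.
sample = (
    'window.YTD.tweet.part0 = ['
    '{"tweet": {"id_str": "1", "created_at": "Wed Oct 10 20:19:24 +0000 2018"}},'
    '{"tweet": {"id_str": "2", "created_at": "Fri Jun 25 08:00:00 +0000 2021"}}'
    ']'
)
cutoff = datetime(2021, 6, 21, tzinfo=timezone.utc)
print(ids_older_than(parse_archive(sample), cutoff))  # prints ['1']
```

Printing the ids first is a dry run: once the list looks right, the same ids can be passed to api.destroy_status as in the script above.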
Defying commented Mar 20, 2022

You need to run sudo pip3 install python-dateutil :)