Delete (very) old tweets obtained from a twitter archive
|# Largely copied from http://www.mathewinkson.com/2015/03/delete-old-tweets-selectively-using-python-and-tweepy|
|# However, Mathew's script cannot delete tweets older than about a year (these tweets are not available from the twitter API)|
|# This script complements it on first use: it reads your twitter archive to find the ids of the old tweets to delete|
|# How to use it:|
|# - download and extract your twitter archive (tweet.js will contain all your tweets with dates and ids)|
|# - put this script in the extracted directory|
|# - complete the secrets to access twitter's API on your behalf and, possibly, modify days_to_keep|
|# - delete the few junk characters at the beginning of tweet.js, until the first '[' (it crashed my json parser)|
|# - review the script !!!! It has not been thoroughly tested, it may have some unexpected behaviors...|
|# - run this script|
|# - forget this script, you can now use Mathew's script for your future deletions|
|# License : Unlicense http://unlicense.org/|
|import json|
|import tweepy|
|from datetime import datetime, timedelta, timezone|
|consumer_key = ''|
|consumer_secret = ''|
|access_token = ''|
|access_token_secret = ''|
|days_to_keep = 365|
|auth = tweepy.OAuthHandler(consumer_key, consumer_secret)|
|auth.set_access_token(access_token, access_token_secret)|
|api = tweepy.API(auth)|
|cutoff_date = datetime.now(timezone.utc) - timedelta(days=days_to_keep)|
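|# read the archive as UTF-8 so non-ASCII tweets do not break decoding|
|with open("tweet.js", "r", encoding="utf-8") as fp:|
|    myjson = json.load(fp)|
|for tweet in myjson:|
|    d = datetime.strptime(tweet['created_at'], "%a %b %d %H:%M:%S %z %Y")|
|    if d < cutoff_date:|
|        print(tweet['created_at'] + " " + tweet['id_str'])|
|        # delete the tweet through the API|
|        api.destroy_status(tweet['id_str'])|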
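Before running the deletion for real, the date filter can be checked on its own. The sketch below is a dry run under assumptions: the helper `ids_older_than` and the sample records are made up for illustration, and only the `created_at` format string comes from the script above. It makes no API calls.

```python
from datetime import datetime, timezone

DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"  # same format string as the script


def ids_older_than(tweets, cutoff_date):
    """Return the id_str of every tweet created before cutoff_date."""
    old_ids = []
    for tweet in tweets:
        created = datetime.strptime(tweet["created_at"], DATE_FORMAT)
        if created < cutoff_date:
            old_ids.append(tweet["id_str"])
    return old_ids


# Hypothetical sample records, shaped like the entries the script reads.
sample = [
    {"id_str": "1", "created_at": "Mon Feb 25 10:00:00 +0000 2013"},
    {"id_str": "2", "created_at": "Tue Feb 26 10:00:00 +0000 2019"},
]
cutoff = datetime(2018, 1, 1, tzinfo=timezone.utc)
print(ids_older_than(sample, cutoff))
```

Printing the ids first and only then enabling the delete call is a cheap way to review what would be removed.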
Feb 25, 2019
I've got the same problem... I don't know how to fix this.
Feb 26, 2019
Ok, I removed this part of the tweet.js file: "window.YTD.tweet.part0 =" and it worked.
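The same fix can be done in code rather than by hand. This is a minimal sketch, assuming the file starts with an assignment such as `window.YTD.tweet.part0 =` followed by a single JSON array and nothing after it; the helper name `load_archive_json` is hypothetical and not part of the script.

```python
import json


def load_archive_json(text):
    """Parse archive text, skipping any JavaScript prefix before the first '['."""
    start = text.find("[")
    if start == -1:
        raise ValueError("no JSON array found in archive text")
    return json.loads(text[start:])


# Made-up sample mimicking the prefix described in the comment above.
sample = 'window.YTD.tweet.part0 = [ {"id_str": "1", "created_at": "Mon Feb 25 10:00:00 +0000 2019"} ]'
tweets = load_archive_json(sample)
print(tweets[0]["id_str"])
```

With this, the script could read the whole file as text and pass it to the helper instead of calling `json.load(fp)` directly.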
Apr 9, 2019
I ran into the following error on first run:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 53148: character maps to undefined
Solved it by modifying line 37 to include the encoding:
fp = open("tweet.js","r", encoding='UTF-8')
Thanks for the script. Very useful!
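A small illustration of why the explicit encoding helps, assuming the platform default was cp1252 (which the 'charmap' message suggests, though the comment does not say so). Byte 0x9d appears in the UTF-8 encoding of a curly closing quote but has no mapping in cp1252.

```python
# A curly closing quote (U+201D), encoded the way the archive stores it.
data = "\u201d".encode("utf-8")
print(data)  # the last byte is 0x9d

try:
    data.decode("cp1252")  # what open() does implicitly with a cp1252 default
except UnicodeDecodeError as err:
    print("cp1252 failed:", err.reason)

# Decoding as UTF-8 round-trips cleanly.
print(data.decode("utf-8") == "\u201d")
```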
Jul 29, 2019
@prasket, if it's helpful to you I updated my fork to account for the new js format: https://gist.github.com/AnilRedshift/536d32b9388675d7c98b019d524983a5
I downloaded my twitter archive to try this out. After unarchiving I could not find a "tweet.js" file, but under data/js I do see a tweet_index.js file, which maps to data/js/tweets, a directory with many files in year_month.js format.
I tried dumping the contents of each file, minus everything in the top line up to the [ as in the original instructions, but it just throws errors.
myjson = json.load(fp)
File "/usr/lib/python3.6/json/__init__.py", line 299, in load
  parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
  return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 342, in decode
  raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 23 column 2 (char 642)
Has anyone else run into this?
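One possible cause of "Extra data" is handing the parser more than one JSON array at once, for example after concatenating several monthly files. Below is a hedged sketch that parses each file separately, assuming every file under data/js/tweets consists of a JavaScript assignment followed by a single JSON array. The function name `load_monthly_files` is hypothetical, and the field names and date format inside each record may differ from what the script expects.

```python
import glob
import json
import os


def load_monthly_files(directory):
    """Load every *.js file in directory, parsing one JSON array per file."""
    tweets = []
    for path in sorted(glob.glob(os.path.join(directory, "*.js"))):
        with open(path, "r", encoding="utf-8") as fp:
            text = fp.read()
        start = text.find("[")  # skip the JavaScript assignment before the array
        if start == -1:
            continue  # no array in this file; nothing to load
        tweets.extend(json.loads(text[start:]))
    return tweets


if __name__ == "__main__":
    all_tweets = load_monthly_files(os.path.join("data", "js", "tweets"))
    print(len(all_tweets), "tweets loaded")
```

The combined list could then stand in for `myjson` in the script, once the record layout has been checked by hand.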