@edsu /replies.py
Last active Nov 23, 2018

Try to get replies to a particular set of tweets, recursively.
#!/usr/bin/env python

"""
Twitter's API doesn't allow you to get replies to a particular tweet. Strange
but true. But you can use Twitter's Search API to search for tweets that are
directed at a particular user, and then search through the results to see if
any are replies to a given tweet. You probably are also interested in the
replies to any replies as well, so the process is recursive. The big caveat
here is that the search API only returns results for the last 7 days. So
you'll want to run this sooner rather than later.

replies.py will read a line oriented JSON file of tweets and look for replies
using the above heuristic. Any replies that are discovered will be written as
line oriented JSON to stdout:

    ./replies.py tweets.json > replies.json

It also writes a log to replies.log if you are curious what it is doing...which
can be handy since it will sleep for periods of time to work within the
Twitter API quotas.

PS. you'll need to:

    pip install python-twitter

and then set the following environment variables for it to work:

  - CONSUMER_KEY
  - CONSUMER_SECRET
  - ACCESS_TOKEN
  - ACCESS_TOKEN_SECRET
"""

import sys
import json
import time
import logging
import twitter
import urllib.parse

from os import environ as e

t = twitter.Api(
    consumer_key=e["CONSUMER_KEY"],
    consumer_secret=e["CONSUMER_SECRET"],
    access_token_key=e["ACCESS_TOKEN"],
    access_token_secret=e["ACCESS_TOKEN_SECRET"],
    sleep_on_rate_limit=True
)


def tweet_url(t):
    return "https://twitter.com/%s/status/%s" % (t.user.screen_name, t.id)


def get_tweets(filename):
    for line in open(filename):
        yield twitter.Status.NewFromJsonDict(json.loads(line))


def get_replies(tweet):
    user = tweet.user.screen_name
    tweet_id = tweet.id
    max_id = None
    logging.info("looking for replies to: %s" % tweet_url(tweet))
    while True:
        # Put every parameter in the raw query itself: python-twitter's docs
        # describe raw_query as overriding any other arguments to GetSearch.
        params = {"q": "to:%s" % user, "since_id": tweet_id, "count": 100}
        if max_id is not None:
            params["max_id"] = max_id
        q = urllib.parse.urlencode(params)
        try:
            replies = t.GetSearch(raw_query=q)
        except twitter.error.TwitterError as err:
            logging.error("caught twitter api error: %s", err)
            time.sleep(60)
            continue
        for reply in replies:
            logging.info("examining: %s" % tweet_url(reply))
            if reply.in_reply_to_status_id == tweet_id:
                logging.info("found reply: %s" % tweet_url(reply))
                yield reply
                # recursive magic to also get the replies to this reply
                for reply_to_reply in get_replies(reply):
                    yield reply_to_reply
            # max_id is inclusive, so step past the tweet just examined
            max_id = reply.id - 1
        if len(replies) != 100:
            break


if __name__ == "__main__":
    logging.basicConfig(filename="replies.log", level=logging.INFO)
    tweets_file = sys.argv[1]
    for tweet in get_tweets(tweets_file):
        for reply in get_replies(tweet):
            print(reply.AsJsonString())
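
The matching step in the script can be tried offline, without credentials. The following is a minimal sketch and not part of replies.py: it assumes tweets are plain dicts carrying `id` and `in_reply_to_status_id`, and the helper name `find_replies` is hypothetical. It walks a pre-fetched list in the same depth-first way that `get_replies` walks search results.

```python
def find_replies(tweet_id, candidates):
    """Yield every tweet in candidates that replies to tweet_id,
    followed by the replies to that reply (depth first)."""
    for tweet in candidates:
        if tweet.get("in_reply_to_status_id") == tweet_id:
            yield tweet
            yield from find_replies(tweet["id"], candidates)


# Made-up sample data: 2 and 3 reply to 1, 4 replies to 2, 5 is unrelated.
sample = [
    {"id": 2, "in_reply_to_status_id": 1},
    {"id": 3, "in_reply_to_status_id": 1},
    {"id": 4, "in_reply_to_status_id": 2},
    {"id": 5, "in_reply_to_status_id": None},
]
reply_ids = [t["id"] for t in find_replies(1, sample)]
print(reply_ids)
```
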
@rwest202

rwest202 commented Oct 28, 2017

Thanks!

@Allen-Qiu

Allen-Qiu commented Nov 26, 2017

Could you give an example of tweets_file?
I'm not sure what should be written in the file.

@garinthengineer

garinthengineer commented Feb 21, 2018

Same deal: how do you prepare the tweets file in the first place?
Please provide an example.

@JuanSierra

JuanSierra commented Mar 5, 2018

This line in tweets.json worked for me:

{"user":{"screen_name": "HumanoidHistory"},"id": 970447777053462528}
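
Building on the line above, here is a minimal sketch of preparing the input file. It assumes, as that example suggests, that `user.screen_name` and `id` are the only fields the script reads; the helper name `write_tweets_file` and the temporary file location are illustrative.

```python
import json
import os
import tempfile


def write_tweets_file(path, tweets):
    """Write (screen_name, tweet_id) pairs as line-oriented JSON."""
    with open(path, "w") as fh:
        for screen_name, tweet_id in tweets:
            record = {"user": {"screen_name": screen_name}, "id": tweet_id}
            fh.write(json.dumps(record) + "\n")


path = os.path.join(tempfile.mkdtemp(), "tweets.json")
write_tweets_file(path, [("HumanoidHistory", 970447777053462528)])

# Read the file back to show what replies.py would receive, one tweet per line.
with open(path) as fh:
    lines = fh.read().splitlines()
print(lines[0])
```
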

@Robertoez

Robertoez commented Mar 7, 2018

How would this be done in reverse? As in, you have a certain reply and want to find the ID of the original tweet it was in reply to.
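
For the reverse direction, each tweet object in Twitter's v1.1 API carries `in_reply_to_status_id` and `in_reply_to_screen_name`, which are null when the tweet is not a reply. A minimal sketch, assuming the reply is available as a plain dict; the helper name `parent_of` is hypothetical.

```python
def parent_of(reply):
    """Return (screen_name, tweet_id) of the tweet being replied to,
    or None when the tweet is not a reply."""
    parent_id = reply.get("in_reply_to_status_id")
    if parent_id is None:
        return None
    return reply.get("in_reply_to_screen_name"), parent_id


# Made-up reply for illustration.
reply = {
    "id": 3,
    "in_reply_to_status_id": 1,
    "in_reply_to_screen_name": "example_user",
}
print(parent_of(reply))
```

With python-twitter, the parent could then be fetched with `t.GetStatus(parent_id)`, provided the original tweet is still available.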

@lakshadvani

lakshadvani commented Mar 8, 2018

I keep getting a KeyError when I put in my consumer_key. Any workarounds?

@chandanprsharma

chandanprsharma commented Mar 10, 2018

I am using the Twitter4J implementation. Can you tell me how to get replies to tweets there?

@zakigatez

zakigatez commented Mar 28, 2018

it doesn't work ???

@mvarnold

mvarnold commented Apr 20, 2018

@lakshadvani If you define the environment variables before they are used in line 45, it works for me. It looks like this:

from os import environ as e

e["consumer_key"] = "YOUR_CONSUMER_KEY"
e["consumer_secret"] = "YOUR_CONSUMER_SECRET"
e["access_token_key"] = "YOUR_ACCESS_TOKEN"
e["access_token_secret"] = "YOUR_ACCESS_TOKEN_SECRET"

t = twitter.Api(
    consumer_key=e["consumer_key"],
    consumer_secret=e["consumer_secret"],
    access_token_key=e["access_token_key"],
    access_token_secret=e["access_token_secret"],
    sleep_on_rate_limit=True
)
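
The KeyError discussed above most likely comes from lookups such as `e["CONSUMER_KEY"]` when a variable is not set in the environment. An alternative to hardcoding credentials in the script is to check for the variables up front and fail with a readable message. A minimal sketch; the helper name `missing_credentials` is hypothetical.

```python
import os

REQUIRED = ("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def missing_credentials(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


# Example with a made-up environment instead of the real one.
example_env = {"CONSUMER_KEY": "placeholder", "ACCESS_TOKEN": "placeholder"}
missing = missing_credentials(example_env)
if missing:
    print("please set: " + ", ".join(missing))
```
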
@linuxandchill

linuxandchill commented May 7, 2018

Hi!
What kind of argument are you passing to the get_replies(tweet) function?
Is it supposed to be a string, a tweet ID, or something else?

@mrmrn

mrmrn commented Jun 18, 2018

How can I get a user's replies to other tweets together with the original tweets?
For example, assume user A replied to user B and to user C.
I want to retrieve all of A's replies, each paired with its original: B's tweet with A's reply to it, and C's tweet with A's reply to it.
Thanks

@saverymax

saverymax commented Jul 20, 2018

thanks for this, was going to start from scratch, but I always appreciate a template!

@Michelle170

Michelle170 commented Jul 23, 2018

Thanks for this. However, it seems replies.log lists many replies that are not to this particular tweet ID but to the user's other tweets.

@leeyamkeng

leeyamkeng commented Jul 23, 2018

Thank you for your code, but I can't get 100 replies per tweet with the following call, even though count is set to 100.

replies = t.GetSearch(raw_query=q, since_id=tweet_id, max_id=max_id, count=100)

I can only get the 15 most recent replies to the candidate. Is there any way for me to scrape 100 replies by tweet ID?
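
One possible explanation, offered as an assumption rather than a confirmed diagnosis: python-twitter's `GetSearch` docstring describes `raw_query` as overriding any other parameters passed, in which case `count=100` would never reach Twitter and the search API's default page size of 15 would apply. A minimal sketch that puts every parameter into the raw query string itself; the helper name `build_query` is hypothetical.

```python
import urllib.parse


def build_query(user, since_id, max_id=None, count=100):
    """Build a search query string that carries its own paging parameters."""
    params = {"q": "to:%s" % user, "since_id": since_id, "count": count}
    if max_id is not None:
        params["max_id"] = max_id
    return urllib.parse.urlencode(params)


q = build_query("HumanoidHistory", 970447777053462528)
print(q)
# The string would then be passed on its own: t.GetSearch(raw_query=q)
```

Even with the full page size, the standard search API only covers roughly the last 7 days, so older replies stay out of reach.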

@santiag080

santiag080 commented Aug 21, 2018

The replies.log came out empty :/

@arnolem

arnolem commented Oct 7, 2018

The standard search API searches against a sampling of recent Tweets published in the past 7 days.

@yoshi9696

yoshi9696 commented Oct 18, 2018

I don't know what "tweets_file" is,
and I would like to get the "replies.log" file.

@MerleLiuKun

MerleLiuKun commented Nov 23, 2018

I used this method, but the number of replies it returns does not match what twitter.com shows: the count is short by 1. Could max_id or since_id be causing this?
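
On the `max_id` question: Twitter documents `max_id` as inclusive (tweets with an ID less than or equal to it) and `since_id` as exclusive. Setting `max_id` to the last ID seen therefore returns that tweet again on the next page, so paging usually continues from the lowest ID seen minus one. Whether this accounts for the count being off by one is not established here. A minimal offline sketch, with `fake_search` standing in for the search API and both function names hypothetical.

```python
def fake_search(ids, since_id, max_id, count):
    """Mimic search paging: newest first, since_id exclusive, max_id inclusive."""
    hits = [i for i in sorted(ids, reverse=True)
            if i > since_id and (max_id is None or i <= max_id)]
    return hits[:count]


def fetch_all(ids, since_id, count=2):
    """Page through results without fetching any tweet twice."""
    seen, max_id = [], None
    while True:
        page = fake_search(ids, since_id, max_id, count)
        seen.extend(page)
        if len(page) < count:
            return seen
        max_id = min(page) - 1  # step past the oldest tweet already seen


result = fetch_all([11, 12, 13, 14, 15], since_id=10)
print(result)
```
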
