Get all tweets between two dates from a specified Twitter user
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import tweepy
import datetime
import xlsxwriter
import sys
# credentials from https://apps.twitter.com/
consumerKey = "CONSUMER_KEY"
consumerSecret = "CONSUMER_SECRET"
accessToken = "ACCESS_TOKEN"
accessTokenSecret = "ACCESS_TOKEN_SECRET"
auth = tweepy.OAuthHandler(consumerKey, consumerSecret)
auth.set_access_token(accessToken, accessTokenSecret)
api = tweepy.API(auth)
username = sys.argv[1]
startDate = datetime.datetime(2014, 6, 1, 0, 0, 0)
endDate = datetime.datetime(2015, 1, 1, 0, 0, 0)
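tweets = []
# fetch the newest page of the timeline (20 tweets by default)
tmpTweets = api.user_timeline(screen_name=username)
while tmpTweets:
    for tweet in tmpTweets:
        # drop tzinfo so naive and timezone-aware created_at values both compare cleanly
        created = tweet.created_at.replace(tzinfo=None)
        if startDate < created < endDate:
            tweets.append(tweet)
    # stop once the oldest tweet on this page is not newer than the start date
    if tmpTweets[-1].created_at.replace(tzinfo=None) <= startDate:
        break
    print("Last Tweet @", tmpTweets[-1].created_at, " - fetching some more")
    # max_id is inclusive, so subtract 1 to avoid fetching the last tweet again
    tmpTweets = api.user_timeline(screen_name=username, max_id=tmpTweets[-1].id - 1)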
workbook = xlsxwriter.Workbook(username + ".xlsx")
worksheet = workbook.add_worksheet()
row = 0
for tweet in tweets:
    worksheet.write_string(row, 0, str(tweet.id))
    worksheet.write_string(row, 1, str(tweet.created_at))
    worksheet.write(row, 2, tweet.text)
    worksheet.write_string(row, 3, str(tweet.in_reply_to_status_id))
    row += 1
workbook.close()
print("Excel file ready")
@rorz

commented Mar 22, 2017

This was really helpful, thanks!

👍

@eledroos

commented Apr 2, 2017

Why do you do tmpTweets[-1].id for setting the max_id?

@richardtwang

commented Apr 22, 2017

Does this get past the 3.2k tweet limit?

@mohir

commented Nov 26, 2017

Will it work with old dates?
I read that Twitter only returns results from the last 7 days.

@inigojpuente

commented Feb 8, 2018

Two suggestions:
1.- The default count in API.user_timeline is 20 tweets. You can reduce the number of API calls (while-loop iterations) by increasing the value of count, up to the maximum of 200 per request.
2.- Passing max_id = tmpTweets[-1].id duplicates tweets. According to the Tweepy API Reference documentation, max_id "Returns only statuses with an ID less than (that is, older than) or equal to the specified ID", so the last tweet of one page comes back as the first tweet of the next. A simple solution is to check, inside the for loop, whether the current tweet's ID is already present in the tweets results list.
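
The two suggestions above can be sketched together. This is a minimal sketch, not the gist author's code: `collect_between` and `fetch_page` are hypothetical names, and `fetch_page(max_id)` is assumed to return one page of tweets, newest first, with IDs no greater than `max_id` (or the newest page when `max_id` is `None`). It steps `max_id` to one below the oldest ID seen and also keeps a set of seen IDs as a guard against duplicates.

```python
import datetime


def collect_between(fetch_page, start, end):
    """Collect tweets with start < created_at < end, without duplicates.

    fetch_page(max_id) is a hypothetical callable returning one page of
    tweets, newest first, with IDs <= max_id (newest page if max_id is None).
    """
    collected = []
    seen_ids = set()
    max_id = None
    while True:
        page = fetch_page(max_id)
        if not page:
            break  # nothing older is available
        for tweet in page:
            # strip tzinfo so naive and aware datetimes compare the same way
            created = tweet.created_at.replace(tzinfo=None)
            if start < created < end and tweet.id not in seen_ids:
                seen_ids.add(tweet.id)
                collected.append(tweet)
        oldest = page[-1]
        if oldest.created_at.replace(tzinfo=None) <= start:
            break  # everything older falls outside the window
        # max_id is inclusive, so step one below the oldest ID on this page
        max_id = oldest.id - 1
    return collected
```

With Tweepy, `fetch_page` could be a small wrapper that calls `api.user_timeline(screen_name=username, count=200)` and adds `max_id` only when it is not `None`. A larger `count` means fewer requests for the same date range.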

Cheers,

@teja58

commented Mar 3, 2018

I'm getting this error. Can anybody help me solve it?
I'm enclosing a link to a picture of the error below.

Thank you
[image: tweet upd]

@saipardhu

commented Mar 12, 2018

@teja58, it looks like your API tokens have expired. Regenerate them from your Twitter developer console and you should be fine. Let me know.

@josessandeep

commented Jun 12, 2018

Is there a way to get tweets for a specific hashtag rather than for a specific username?

@chinmaydas96

commented Sep 30, 2018

> Is there a way to get tweets for a specific hashtag rather than for a specific username?

Just do it by placing the hashtag in the search query in tweepy.
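
As a rough sketch of that suggestion: the helper below is hypothetical (`search_hashtag` is not part of Tweepy) and assumes an authenticated `api` object like the one in the gist. Tweepy 4.x names the standard search method `API.search_tweets`, while 3.x named it `API.search`, so the sketch uses whichever one exists. Keep in mind that the standard search endpoint only covers roughly the last 7 days, so it cannot reach an old date range the way the user timeline can.

```python
def search_hashtag(api, hashtag, count=100):
    """Return one page of recent tweets containing the given hashtag.

    `api` is assumed to be an authenticated tweepy.API instance.
    """
    # Normalise "python" and "#python" to the same query string
    query = "#" + hashtag.lstrip("#")
    # Tweepy 4.x calls the method search_tweets; 3.x called it search
    search = getattr(api, "search_tweets", None) or getattr(api, "search")
    return list(search(q=query, count=count))
```

For example, `search_hashtag(api, "python")` would request up to 100 recent tweets matching `#python`; the results can then be filtered by `created_at` in the same way as the timeline tweets.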
