A Python script to download all the tweets of a hashtag into a CSV file
import csv

import tweepy

# Enter your Twitter API credentials here
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

# United Airlines
# Open (or create) the file in append mode. Leaving the with block closes
# the file, which flushes any buffered rows to disk.
with open('ua.csv', 'a', newline='', encoding='utf-8') as csvFile:
    csvWriter = csv.writer(csvFile)
    # api.search is the Tweepy 3.x name; Tweepy 4 renamed it to api.search_tweets
    for tweet in tweepy.Cursor(api.search, q="#unitedAIRLINES", count=100,
                               lang="en",
                               since="2017-04-03").items():
        print(tweet.created_at, tweet.text)
        # On Python 3, write the text as str; bytes would be stored as b'...'
        csvWriter.writerow([tweet.created_at, tweet.text])

bellegis commented Aug 29, 2017

thank you!

vitospinelli commented Jan 3, 2018

When I run this script on Python 2.7 (on a Windows 10 machine), nothing happens and no error is returned... Can you please help?

impshum commented Jan 30, 2018

@vitospinelli When you see print(things), with parentheses, rather than print things, you're dealing with Python 3. Drop Python 2 unless you really have to use it. You have 2 years to get used to 3: https://pythonclock.org

sushovan1 commented Mar 21, 2018

I ran the same code on 2018-03-21, hoping to fetch tweets as old as 2018-02-01, but it didn't return that many tweets. Any idea why?

streetratonascooter commented Mar 27, 2018

@sushovan1 The Twitter API only lets you go back approximately 2 weeks.
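
A rough sketch of what that limit means for the `since` argument. The exact window is an assumption here: Twitter's documentation for the standard search endpoint gives about 7 days, somewhat shorter than the estimate above. The helper `clamp_since` and the constant `SEARCH_WINDOW_DAYS` are hypothetical names introduced only for illustration; they are not part of Tweepy.

```python
from datetime import date, timedelta

# Assumption: the standard search index covers roughly the past 7 days
SEARCH_WINDOW_DAYS = 7


def clamp_since(requested, today, window_days=SEARCH_WINDOW_DAYS):
    """Return the earliest date the search can plausibly reach.

    If `requested` is older than the search window, the start of the
    window is returned instead.
    """
    earliest = today - timedelta(days=window_days)
    return max(requested, earliest)


# Asking on 2018-03-21 for tweets since 2018-02-01
print(clamp_since(date(2018, 2, 1), date(2018, 3, 21)))
```

Under that assumption, a `since` value older than the window does not fail; the search simply returns nothing from before the window starts.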

shruti18196 commented May 4, 2018

Thank you very much

qrnazyhah commented May 10, 2018

I want to fetch tweets as old as 2018-01-01. Can you help me, please?

kamalikap commented May 23, 2018

Hi, I want to extract the hashtags from the tweets and store them in a file. Is it possible?
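
One possible approach, sketched under the assumption that a simple regular expression over the tweet text is good enough. The helpers `extract_hashtags` and `save_hashtags` are hypothetical names, not Tweepy functions, and the output layout (one hashtag per line) is just one choice.

```python
import re

# Matches a '#' followed by word characters (letters, digits, underscore)
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags in a tweet's text, without the leading '#'."""
    return HASHTAG_RE.findall(text)


def save_hashtags(texts, path):
    """Write every hashtag found in `texts` to `path`, one per line."""
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            for tag in extract_hashtags(text):
                f.write(tag + "\n")
```

In the script, this could be called as `save_hashtags((tweet.text for tweet in cursor_items), 'hashtags.txt')`, where `cursor_items` stands for the `tweepy.Cursor(...).items()` iterator. Tweepy's Status objects also carry the parsed hashtags in `tweet.entities['hashtags']`, which avoids the regex.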

kahiin commented May 24, 2018

Hi, I want to save the tweets that I obtain into an array. Is it possible? Thanks.
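
A minimal sketch of collecting the results into a Python list instead of a file. The function name `collect_rows` is hypothetical; it only assumes each item has the `created_at` and `text` attributes the script already uses.

```python
from itertools import islice


def collect_rows(tweets, limit=None):
    """Collect (created_at, text) pairs from an iterable of tweets.

    `limit` caps how many tweets are read; None reads them all.
    """
    rows = []
    for tweet in islice(tweets, limit):
        rows.append((tweet.created_at, tweet.text))
    return rows
```

With the script's cursor this might look like `rows = collect_rows(tweepy.Cursor(api.search, q="#unitedAIRLINES").items(), limit=500)`.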

carlvlewis commented Jun 3, 2018

Running this script w/ Python 3.6, it's working just fine, outputting the data and creating the CSV file, but the CSV file appears empty when I open it. Any ideas?

4lexLammers commented Jun 11, 2018

Thanks, the script runs fine on Python 3.5.2. I would just add .encode('utf-8') to the print command on line 22; without it, I got an error when printing tweets to the console.

jculligan commented Jun 13, 2018

How would I go about adding in location(e.g. geo_id or coordinates) and user_id? I've been going through the Tweepy documentation and Twitter API documentation, but can't find any information to add arguments like tweet.text and tweet.created_at.

Update: After some more digging, I found this dump of a tweet's JSON, which shows which attributes can be accessed: https://gist.github.com/dev-techmoe/ef676cdd03ac47ac503e856282077bf2

So, I learned that I can call geo, place, and coordinates (tweet.geo, tweet.place, tweet.coordinates), but it doesn't appear to do well for historical data. I maybe pulled 3 out of several thousand so far :/

But it's a handy reference for things like tweet.user.id or tweet.user.screen_name!

I'm still looking for a way to determine whether a tweet is a retweet (I'd like to remove those from my analysis), other than checking whether tweet.text begins with "b'RT @", and whether it is an advertisement (e.g. a 'Buy 3 for 2' promotion kind of thing). If anyone has any advice on those, I'd greatly appreciate it!
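
A hedged sketch for the retweet part of the question. It assumes the Status objects Tweepy returns carry a `retweeted_status` attribute when the tweet is a retweet, and falls back to the "RT @" text prefix otherwise. The function name `is_retweet` is hypothetical.

```python
def is_retweet(tweet):
    """Guess whether `tweet` is a retweet.

    Checks for the `retweeted_status` attribute first, then falls back
    to the conventional "RT @" prefix in the text.
    """
    if hasattr(tweet, "retweeted_status"):
        return True
    return tweet.text.startswith("RT @")
```

Retweets can also be excluded at the source by adding the search operator `-filter:retweets` to the query, e.g. `q="#unitedAIRLINES -filter:retweets"`. Spotting advertisements has no equivalent flag and would need some text-based heuristic.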

rglukins commented Jun 13, 2018

@carlvlewis I had the same problem initially. I changed line 15 from append ('a') to write ('w') and it works just fine.
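
An empty CSV is also consistent with the file never being closed, in which case buffered rows may not reach the disk, for example when the script is interrupted while waiting on the rate limit. A minimal sketch, assuming that is the cause; `write_rows` is a hypothetical helper:

```python
import csv


def write_rows(path, rows, mode="a"):
    """Write `rows` to a CSV file and close it on exit.

    The with block closes the file even if an exception is raised,
    so buffered rows are flushed to disk.
    """
    with open(path, mode, newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
```

Both `'a'` and `'w'` work here; `'w'` simply starts from an empty file on each run, while `'a'` keeps rows from earlier runs.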

Balajigentela commented Jun 18, 2018

It's working fine, but no data is stored in the file.
Please help me.

acapshaw commented Jun 25, 2018

Having the same issue: nothing is being stored in the CSV, and I can't seem to find the cause.

chuchovalbuena commented Jun 26, 2018

Thank you so much!

GabrielDvt commented Jul 16, 2018

Hello.

I'm getting this error:

File "C:\Users\Gabriel\Desktop\tweet\crawling.py", line 22, in
print (tweet.created_at, tweet.text)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 46-46: Non-BMP character not supported in Tk

Can someone help me?
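
The mention of Tk suggests the script was run inside IDLE, whose console in some Python versions cannot display characters outside the Basic Multilingual Plane, such as many emoji. A possible workaround, sketched under that assumption, is to replace those characters before printing. `strip_non_bmp` is a hypothetical helper name.

```python
def strip_non_bmp(text, replacement="\ufffd"):
    """Replace characters above U+FFFF (e.g. most emoji) with `replacement`."""
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)
```

In the script, only the print call would change, e.g. `print(tweet.created_at, strip_non_bmp(tweet.text))`; the CSV can still receive the unmodified text. Running the script from a regular terminal instead of IDLE may also avoid the error.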

ogheneinfinitea commented Jul 20, 2018

Hello, how do I get the number of times a user posted a tweet using a particular hashtag?
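
One way to do this, sketched with `collections.Counter` over the `tweet.user.screen_name` attribute mentioned earlier in the thread. The function name `tweets_per_user` is hypothetical.

```python
from collections import Counter


def tweets_per_user(tweets):
    """Count how many of the given tweets each user posted."""
    return Counter(tweet.user.screen_name for tweet in tweets)
```

Passing the iterator from `tweepy.Cursor(api.search, q="#unitedAIRLINES").items()` gives per-user counts for that hashtag, and `counts.most_common(10)` lists the most active users. The counts only cover tweets the search actually returns.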

arfkrnwan commented Jul 21, 2018

If I want to add another attribute to the streaming results, where can I find a tutorial? Thank you.

rogeriocolonna commented Sep 4, 2018

Thank you! Works fine!

arushiyadav commented Oct 1, 2018

I'm getting an error:
TweepError Traceback (most recent call last)
in ()
18
19 for tweet in tweepy.Cursor(api.search,q="#indianairways",count=100,
---> 20 lang="en").items():
21 print (tweet.created_at, tweet.text)
22 csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8')])

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\cursor.py in __next__(self)
47
48 def __next__(self):
---> 49 return self.next()
50
51 def next(self):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\cursor.py in next(self)
195 if self.current_page is None or self.page_index == len(self.current_page) - 1:
196 # Reached end of current page, get the next page...
--> 197 self.current_page = self.page_iterator.next()
198 self.page_index = -1
199 self.page_index += 1

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\cursor.py in next(self)
106
107 if self.index >= len(self.results) - 1:
--> 108 data = self.method(max_id=self.max_id, parser=RawParser(), *self.args, **self.kargs)
109
110 if hasattr(self.method, 'self'):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\binder.py in _call(*args, **kwargs)
248 return method
249 else:
--> 250 return method.execute()
251
252 # Set pagination mode

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\binder.py in execute(self)
232 raise RateLimitError(error_msg, resp)
233 else:
--> 234 raise TweepError(error_msg, resp, api_code=api_error_code)
235
236 # Parse the response payload

TweepError: Twitter error response: status code = 401
Please help me with this.
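
Status code 401 is the HTTP "Unauthorized" response, so the credentials are the first thing to check; the script ships with all four left as empty strings. A small pre-flight check, sketched with a hypothetical helper `missing_credentials` (not part of Tweepy):

```python
def missing_credentials(**credentials):
    """Return the names of credentials that are empty or only whitespace."""
    return [name for name, value in credentials.items() if not value.strip()]


missing = missing_credentials(
    consumer_key="abc",
    consumer_secret="",
    access_token="  ",
    access_token_secret="xyz",
)
print(missing)
```

If nothing is missing, calling `api.verify_credentials()` before the search is a quick way to confirm that Twitter accepts the keys. Other causes, such as regenerated keys or a badly skewed system clock, are possible and not covered by this check.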

navneetkaur08 commented Oct 7, 2018

In line 23, instead of CsvWriter it should be csvFile.
