A Python script to download all the tweets of a hashtag into a csv
import tweepy
import csv
import pandas as pd
# Input your credentials here
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
# United Airlines
# Open/create a file to append data
csvFile = open('ua.csv', 'a')
# Use csv writer
csvWriter = csv.writer(csvFile)

for tweet in tweepy.Cursor(api.search, q="#unitedAIRLINES", count=100,
                           lang="en",
                           since="2017-04-03").items():
    print(tweet.created_at, tweet.text)
    csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8')])

# Close the file so that buffered rows are flushed to disk
csvFile.close()

bellegis commented Aug 29, 2017

thank you!

vitospinelli commented Jan 3, 2018

When I run this script on Python 2.7 (on a Windows 10 machine), nothing happens and no error is returned. Can you please help?

impshum commented Jan 30, 2018

@vitospinelli When you see print(things) with parentheses, rather than print things, you're dealing with Python 3. Drop 2 unless you really have to use it. You have 2 years to get used to 3: https://pythonclock.org

sushovan1 commented Mar 21, 2018

I ran the same code on 2018-03-21, hoping to fetch tweets as old as 2018-02-01, but it did not return tweets that far back. Any idea why?

streetratonascooter commented Mar 27, 2018

@sushovan1 The Twitter standard search API only lets you go back a short time: its index covers roughly the last 7 days, not weeks or months.
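
Because the search endpoint only covers a short recent window, a `since` date older than that window does not bring back older tweets. The following is a minimal sketch for clamping a requested start date to the window. The helper `clamp_since` and the constant `WINDOW_DAYS = 7` are hypothetical names and an assumed window length; adjust the window for other API access tiers.

```python
from datetime import date, timedelta

# Assumption: the standard search index covers roughly the last 7 days.
WINDOW_DAYS = 7


def clamp_since(requested, today, window_days=WINDOW_DAYS):
    """Return the later of the requested start date and the oldest searchable date."""
    oldest = today - timedelta(days=window_days)
    return max(requested, oldest)


# Example: asking on 2018-03-21 for tweets since 2018-02-01
since = clamp_since(date(2018, 2, 1), today=date(2018, 3, 21))
print(since.isoformat())  # 2018-03-14
```

The resulting string could then be passed as the `since` argument in the script's `tweepy.Cursor(...)` call.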

shruti18196 commented May 4, 2018

Thank you very much

qrnazyhah commented May 10, 2018

I want to fetch tweets as old as 2018-01-01. Can you help me, please?

kamalikap commented May 23, 2018

Hi, I want to extract the hashtags from the tweets and store them in a file. Is it possible?
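
One possible approach, sketched under the assumption that each tweet exposes an `entities` dict shaped like the search API's JSON (a `hashtags` list whose items carry a `text` key). The helper name `extract_hashtags` is hypothetical, and the sample dict is a stand-in rather than real data.

```python
def extract_hashtags(entities):
    """Return hashtag texts (without the leading '#') from a tweet's entities dict."""
    return [tag["text"] for tag in entities.get("hashtags", [])]


# Stand-in for tweet.entities as it appears in the tweet JSON
sample = {
    "hashtags": [
        {"text": "unitedAIRLINES", "indices": [0, 15]},
        {"text": "travel", "indices": [16, 23]},
    ]
}
print(extract_hashtags(sample))  # ['unitedAIRLINES', 'travel']

# Inside the script's loop, the row could then become something like:
# csvWriter.writerow([tweet.created_at, " ".join(extract_hashtags(tweet.entities))])
```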

kahiin commented May 24, 2018

Hi, I want to save the tweets that I obtain into an array. Is it possible? Thanks
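
A minimal sketch of collecting tweets into a Python list instead of (or in addition to) writing them to CSV. `collect_tweets` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the Status objects the cursor yields.

```python
from types import SimpleNamespace


def collect_tweets(tweets):
    """Collect (created_at, text) pairs from any iterable of tweet-like objects."""
    return [(t.created_at, t.text) for t in tweets]


# Stand-ins for tweepy Status objects; in the script, pass
# tweepy.Cursor(...).items() (ideally with a limit) instead.
fake_tweets = [
    SimpleNamespace(created_at="2018-05-24 10:00:00", text="first"),
    SimpleNamespace(created_at="2018-05-24 10:05:00", text="second"),
]
rows = collect_tweets(fake_tweets)
print(len(rows))  # 2
```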

carlvlewis commented Jun 3, 2018

Running this script w/ Python 3.6, it's working just fine, outputting the data and creating the CSV file, but the CSV file appears empty when I open it. Any ideas?

4lexLammers commented Jun 11, 2018

Thanks, the script runs fine on Python 3.5.2. I would just add .encode('utf-8') to the print call on line 22; otherwise I get an error when printing tweets to the console.

jculligan commented Jun 13, 2018

How would I go about adding location (e.g. geo_id or coordinates) and user_id? I've been going through the Tweepy documentation and the Twitter API documentation, but can't find how to add fields alongside tweet.text and tweet.created_at.

Update: After some more digging, I found this dump of the tweet JSON, which shows which attributes are available: https://gist.github.com/dev-techmoe/ef676cdd03ac47ac503e856282077bf2

So, I learned that I can call geo, place, and coordinates (tweet.geo, tweet.place, tweet.coordinates), but they are rarely populated in historical data. Only about 3 out of several thousand tweets so far had them :/

But it's a handy reference for things like tweet.user.id or tweet.user.screen_name!
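
As a sketch of how those attributes could be added to each CSV row: `tweet_row` is a hypothetical helper, and the stand-in object below only mimics the attribute names mentioned above. Missing values become `None`, since location fields are often absent.

```python
from types import SimpleNamespace


def tweet_row(tweet):
    """Build a CSV row with user and location fields; missing values become None."""
    user = getattr(tweet, "user", None)
    return [
        tweet.created_at,
        getattr(user, "id", None),
        getattr(user, "screen_name", None),
        getattr(tweet, "coordinates", None),
        tweet.text,
    ]


# Stand-in for a tweepy Status object (made-up values)
fake = SimpleNamespace(
    created_at="2018-06-13 09:00:00",
    text="delayed again",
    user=SimpleNamespace(id=42, screen_name="example_user"),
    coordinates=None,
)
print(tweet_row(fake))
```

In the script's loop, `csvWriter.writerow(tweet_row(tweet))` would then replace the two-column row.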

I'm still looking for a way to determine whether a tweet is a retweet (I'd like to remove those from my analysis), other than checking whether tweet.text begins with "b'RT @", and whether it is an advertisement (e.g. a 'Buy 3 for 2 promotion' kind of thing). If anyone has any advice on those, I'd be greatly appreciative!
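
For the retweet part, one common approach is to check for a `retweeted_status` attribute, on the assumption that it is present only when the tweet JSON describes a retweet. The helper names below are hypothetical and the objects are stand-ins. This sketch does not address detecting advertisements.

```python
from types import SimpleNamespace


def is_retweet(tweet):
    """Treat a tweet as a retweet if it carries a retweeted_status attribute."""
    return hasattr(tweet, "retweeted_status")


def drop_retweets(tweets):
    """Keep only tweets that are not retweets."""
    return [t for t in tweets if not is_retweet(t)]


# Stand-ins for tweepy Status objects
original = SimpleNamespace(text="my own words")
retweet = SimpleNamespace(text="RT @someone: my own words",
                          retweeted_status=original)
kept = drop_retweets([original, retweet])
print(len(kept))  # 1
```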

rglukins commented Jun 13, 2018

@carlvlewis I had the same problem initially. I changed line 15 from append ('a') to write ('w') and it works just fine.
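
Another possible explanation for an empty file (an assumption, not confirmed in this thread) is buffering: rows may not reach disk while the file handle is still open, for example when the CSV is opened mid-run. A `with` block closes the file as soon as writing is done. `write_rows` is a hypothetical helper and the rows are made-up sample data.

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    """Write rows to a CSV file; the with-block closes (and flushes) the file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


# Write to a temporary location so the sketch leaves the working directory alone
path = os.path.join(tempfile.mkdtemp(), "ua.csv")
write_rows(path, [["2018-06-13 09:00:00", "hello"],
                  ["2018-06-13 09:01:00", "second tweet"]])

with open(path, newline="", encoding="utf-8") as f:
    print(len(list(csv.reader(f))))  # 2
```

Passing `newline=""` is what the `csv` module documentation recommends on Python 3, and writing `str` values rather than `.encode('utf-8')` bytes avoids `b'...'` prefixes in the output.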

Balajigentela commented Jun 18, 2018

It's working fine, but no data is stored in the file.
Please help me.

acapshaw commented Jun 25, 2018

Having the same issue: nothing is being stored in the CSV, and I can't seem to find the cause.

chuchovalbuena commented Jun 26, 2018

Thank you so much!

GabrielDvt commented Jul 16, 2018

Hello.

I'm getting this error:

  File "C:\Users\Gabriel\Desktop\tweet\crawling.py", line 22, in <module>
    print (tweet.created_at, tweet.text)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 46-46: Non-BMP character not supported in Tk

Can someone help me?
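
The mention of Tk in the message suggests the script is running in a Tk-based console such as IDLE (an assumption), which at the time could not display characters outside the Basic Multilingual Plane, such as many emoji. One workaround is to replace those characters before printing. `strip_non_bmp` is a hypothetical helper; the CSV row can still receive the original text.

```python
def strip_non_bmp(text, replacement="\ufffd"):
    """Replace characters above U+FFFD's plane (code points > 0xFFFF) for display."""
    return "".join(c if ord(c) <= 0xFFFF else replacement for c in text)


# ascii() is used here only so the demo prints safely on any console
print(ascii(strip_non_bmp("flight delayed \U0001F620")))  # 'flight delayed \ufffd'

# In the script's loop, the print call could become:
# print(tweet.created_at, strip_non_bmp(tweet.text))
```

Running the script from a regular terminal instead of IDLE may also avoid the error.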

ogheneinfinitea commented Jul 20, 2018

Hello, how do I get the number of times a user posted a tweet using a particular hashtag?
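
Since the script already searches for one hashtag, counting tweets per author over the results is one way to get this. A minimal sketch with `collections.Counter`; `count_by_user` is a hypothetical helper and the objects are stand-ins for the Status objects returned by the cursor.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_user(tweets):
    """Count how many of the given tweets each screen name posted."""
    return Counter(t.user.screen_name for t in tweets)


def fake_tweet(screen_name):
    """Build a stand-in for a tweepy Status object (made-up data)."""
    return SimpleNamespace(user=SimpleNamespace(screen_name=screen_name))


counts = count_by_user([fake_tweet("alice"), fake_tweet("bob"), fake_tweet("alice")])
print(counts.most_common(1))  # [('alice', 2)]
```

The counts only cover tweets the search actually returned, so they are bounded by the search window and any rate limits.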

arfkrnwan commented Jul 21, 2018

If I want to add another attribute to the streaming results, where can I find a tutorial? Thank you.

rogeriocolonna commented Sep 4, 2018

Thank you! Works fine!

arushiyadav commented Oct 1, 2018

I'm getting this error:

TweepError                                Traceback (most recent call last)
in ()
     18
     19 for tweet in tweepy.Cursor(api.search,q="#indianairways",count=100,
---> 20                            lang="en").items():
     21     print (tweet.created_at, tweet.text)
     22     csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8')])

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\cursor.py in __next__(self)
     47
     48     def __next__(self):
---> 49         return self.next()
     50
     51     def next(self):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\cursor.py in next(self)
    195         if self.current_page is None or self.page_index == len(self.current_page) - 1:
    196             # Reached end of current page, get the next page...
--> 197             self.current_page = self.page_iterator.next()
    198             self.page_index = -1
    199         self.page_index += 1

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\cursor.py in next(self)
    106
    107         if self.index >= len(self.results) - 1:
--> 108             data = self.method(max_id=self.max_id, parser=RawParser(), *self.args, **self.kargs)
    109
    110             if hasattr(self.method, '__self__'):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\binder.py in _call(*args, **kwargs)
    248             return method
    249         else:
--> 250             return method.execute()
    251
    252     # Set pagination mode

~\AppData\Local\Continuum\anaconda3\lib\site-packages\tweepy\binder.py in execute(self)
    232                     raise RateLimitError(error_msg, resp)
    233                 else:
--> 234                     raise TweepError(error_msg, resp, api_code=api_error_code)
    235
    236             # Parse the response payload

TweepError: Twitter error response: status code = 401

Please help me with this.

navneetkaur08 commented Oct 7, 2018

In line 23, instead of CsvWriter it should be csvFile.

sittinurul commented Dec 13, 2018

> it working fine...but no data is stored in the file....
> please help me...

@Balajigentela You should run it from cmd.

humbleself commented Feb 8, 2019

Please help me, it is giving me this error:

TweepError                                Traceback (most recent call last)
in ()
      1 for tweet in tweepy.Cursor(api.search,q="#APC",count=100,
      2                            lang="en",
----> 3                            since="2017-04-03").items():
      4     print (tweet.created_at, tweet.text)

D:\anaconda3\lib\site-packages\tweepy\cursor.py in __next__(self)
     47
     48     def __next__(self):
---> 49         return self.next()
     50
     51     def next(self):

D:\anaconda3\lib\site-packages\tweepy\cursor.py in next(self)
    195         if self.current_page is None or self.page_index == len(self.current_page) - 1:
    196             # Reached end of current page, get the next page...
--> 197             self.current_page = self.page_iterator.next()
    198             self.page_index = -1
    199         self.page_index += 1

D:\anaconda3\lib\site-packages\tweepy\cursor.py in next(self)
    106
    107         if self.index >= len(self.results) - 1:
--> 108             data = self.method(max_id=self.max_id, parser=RawParser(), *self.args, **self.kargs)
    109
    110             if hasattr(self.method, '__self__'):

D:\anaconda3\lib\site-packages\tweepy\binder.py in _call(*args, **kwargs)
    248             return method
    249         else:
--> 250             return method.execute()
    251
    252     # Set pagination mode

D:\anaconda3\lib\site-packages\tweepy\binder.py in execute(self)
    232                     raise RateLimitError(error_msg, resp)
    233                 else:
--> 234                     raise TweepError(error_msg, resp, api_code=api_error_code)
    236             # Parse the response payload

TweepError: Twitter error response: status code = 400

jodmoreira commented Feb 12, 2019

It works fine for me! Thanks!

Prashant-PS commented Feb 16, 2019

@humbleself

Regenerate your API keys and tokens. It should work fine after that.
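
Status codes 400 and 401 from the search call are often authentication problems (an assumption here, since the tracebacks above do not include the API's error message). Because the script ships with empty credential strings, a quick check before calling the API can rule out the simplest case. `missing_credentials` is a hypothetical helper and the values below are placeholders.

```python
def missing_credentials(credentials):
    """Return the names of credentials that are empty or whitespace-only."""
    return [name for name, value in credentials.items()
            if not (value or "").strip()]


# Placeholder values for illustration only
creds = {
    "consumer_key": "",
    "consumer_secret": "abc",
    "access_token": " ",
    "access_token_secret": "xyz",
}
print(missing_credentials(creds))  # ['consumer_key', 'access_token']
```

An empty result only shows that the values are filled in, not that Twitter accepts them.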

ElTheCatto commented Mar 19, 2019

I am having trouble with this code; it raises an error that I am not sure how to solve:

Traceback (most recent call last):
  File "c:/Users/Evdru/Documents/School Work/Science/Program Files/main.py", line 1, in <module>
    import tweepy
  File "C:\Users\Evdru\AppData\Local\Programs\Python\Python37\lib\site-packages\tweepy\__init__.py", line 17, in <module>
    from tweepy.streaming import Stream, StreamListener
  File "C:\Users\Evdru\AppData\Local\Programs\Python\Python37\lib\site-packages\tweepy\streaming.py", line 355
    def _start(self, async):
                     ^
SyntaxError: invalid syntax
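
This SyntaxError is consistent with `async` having become a reserved keyword in Python 3.7 (the path above shows Python37), while the installed Tweepy release still uses it as a parameter name. Upgrading Tweepy, for example with `pip install --upgrade tweepy`, is the usual fix. The small check below only illustrates the keyword change; `async_is_reserved` is a hypothetical helper.

```python
import keyword
import sys


def async_is_reserved():
    """Report whether 'async' is a reserved keyword in the running interpreter."""
    return keyword.iskeyword("async")


# On Python 3.7 and later this reports True, which is why
# `def _start(self, async):` no longer parses there.
print(sys.version_info[:2], async_is_reserved())
```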
