@vickyqian
Last active July 23, 2023 16:52
A Python script to download all the tweets of a hashtag into a csv
import tweepy
import csv
import pandas as pd

#### input your credentials here
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

##### United Airlines
# Open/Create a file to append data
csvFile = open('ua.csv', 'a')
# Use csv Writer
csvWriter = csv.writer(csvFile)

for tweet in tweepy.Cursor(api.search, q="#unitedAIRLINES", count=100,
                           lang="en",
                           since="2017-04-03").items():
    print(tweet.created_at, tweet.text)
    csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8')])

@Geethanjali8989

It's working fine, but no data is stored in the file.
Please help me.

Hey, I am also getting the same error. Did you solve it?

@barthuin

Hello,

It's my first time with Python and this library, and I have a strange error. My query doesn't return all the tweets, only the last 25-30. But if I do the same search on Twitter, I get a lot of tweets in the results. Does anyone know why?

@this-is-shashank

Running this script w/ Python 3.6, it's working just fine, outputting the data and creating the CSV file, but the CSV file appears empty when I open it. Any ideas?

I am also facing the same issue. Can you tell me how you fixed it?

@this-is-shashank

ua.csv is an empty file. Can you please guide me on how to get the data into ua.csv?
Thanks a lot in advance, please do help!

I am also facing the same issue. Can you tell me how you fixed it?

@quintendewilde

Hi!

Is it possible to make this "live", so that every time a tweet is posted with #cat it fetches it? Or so that the code just looks at the last minute every other minute?

I hope this is clear?

@Stellaxu19

Can anyone help me find a way to obtain this data for a longer period of time, for example by using GetOldTweets3? Thank you very much.

@ojkoort

ojkoort commented Jul 14, 2020

just signed up to github to say thank you

@vedantkalan777

Hello, thank you for the above code. I want to download data by hashtag and location, for example tweets with "#coronavirus" located in "India" and posted between "2020-04-01" and "2020-04-30". Can anyone please help me with that? It is really important for my research.

@Harish-uthravalli

Harish-uthravalli commented Aug 25, 2020

How can I retrieve the name of the user who tweeted?

@AmbiTyga

How can I get only the tweets that contain all the keywords I mention, regardless of their position? I tried stream.filter(track=["good","bank","@NYC"]), but sometimes I get tweets containing only one or two of the keywords. I want tweets that contain all three of them, regardless of position.

@mandarthosar

I am a novice here. The code didn't work for me at first. However, it worked after I added one line at the end: csvFile.close()
I'm not sure whether I am the only one who faced this, or whether it is such a trivial problem that everyone else has already fixed it in their own code.
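
The fix above works because rows written through csv.writer sit in a buffer until the file is flushed or closed. A minimal sketch of the same idea with a `with` block, which closes the file automatically. The `save_rows` helper and the sample rows are hypothetical stand-ins for the tweet loop, and writing the text as `str` rather than `.encode('utf-8')` bytes is an assumption that suits Python 3:

```python
import csv
import os
import tempfile


def save_rows(path, rows):
    """Append rows to a CSV file; the with block flushes and closes it."""
    # newline='' avoids blank lines between rows on Windows;
    # encoding='utf-8' lets the text be written as str, not bytes.
    with open(path, 'a', newline='', encoding='utf-8') as csv_file:
        writer = csv.writer(csv_file)
        for created_at, text in rows:
            writer.writerow([created_at, text])


# Hypothetical sample data standing in for (tweet.created_at, tweet.text).
sample_rows = [
    ("2017-04-03 10:00:00", "First tweet #unitedAIRLINES"),
    ("2017-04-03 10:05:00", "Second tweet, with a comma"),
]
demo_path = os.path.join(tempfile.mkdtemp(), "ua.csv")
save_rows(demo_path, sample_rows)

# Read the file back to confirm the rows were actually stored.
with open(demo_path, newline='', encoding='utf-8') as csv_file:
    saved = list(csv.reader(csv_file))
print(saved)
```

In the original script, the loop over `tweepy.Cursor(...)` would sit inside the `with` block in place of the sample rows.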

@eshraongithub

eshraongithub commented Nov 30, 2020

This code works. Thank you!
How can I increase the character limit that is currently forced on the tweets extracted?

Thank you!

@McHarold404

What is the maximum value I can set count to?

@mindyng

mindyng commented Jan 5, 2021

Thanks! This is short and sweet. Worked for me using: Python 3.7.6

also, for those wondering how to get output into a Pandas DF, I did this (added two lists):

created_at = []
text = []
for tweet in tweepy.Cursor(api.search,q=["#covidnurse"],count=100,
                           lang="en",
                           since="2020-03-15").items():
    print (tweet.created_at, tweet.text)
    csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8')])
    created_at.append(tweet.created_at)
    text.append(tweet.text)

df =  pd.DataFrame(
    {'created_at': created_at,
     'tweet': text
    })
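df  # RT (retweets) are duplicates since Twitter is an echo chamber; will need to get rid of them

# Retweets start with "RT @"; matching only that prefix avoids dropping tweets that merely contain "RT"
df_clean = df[~df.tweet.str.startswith("RT @")]
df_clean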

@lcontini

lcontini commented Feb 5, 2021

Great work!!
I have a question: will it return any response if there is only one tweet, dated Feb 19, 2015, for the hashtag LexusEnformRemote?
I have read that Tweepy only returns tweets from the past 7 days.
One more thing: how can I search for multiple hashtags in a single loop?

Answering my own question about getting tweets from beyond the past 7 days: I used the GetOldTweets3 library by @Jefferson-Henrique and it's working fine for me. We can get tweets from multiple user accounts in one go, but search doesn't take two hashtags.
Link

@abhishek-negi to search for multiple hashtags you could perform a loop from a list.

Something like this:

list_tweets_all_hashtags = []
hashtags_list = ["#tag1", "#tag2", "#tagN"]

for hashtag in hashtags_list:
    # Collect up to 100 recent tweets for this hashtag
    list_by_tag = [tweet for tweet in tweepy.Cursor(api.search, q=hashtag, result_type="recent").items(100)]
    list_tweets_all_hashtags.extend(list_by_tag)

print(list_tweets_all_hashtags)
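
An alternative to looping, assuming Twitter's standard search `OR` operator behaves as documented, is to join the hashtags into one query string. The `build_or_query` helper below is hypothetical and only builds the string; the result would be passed as `q` to the same `tweepy.Cursor` call:

```python
def build_or_query(hashtags):
    """Join hashtags into a single search query using the OR operator."""
    cleaned = []
    for tag in hashtags:
        tag = tag.strip()
        if not tag:
            continue
        # Add the leading '#' if the caller left it off.
        cleaned.append(tag if tag.startswith("#") else "#" + tag)
    return " OR ".join(cleaned)


query = build_or_query(["#tag1", "tag2", " #tagN "])
print(query)
```

A single query means one pass through the results, but it no longer tells you which hashtag matched each tweet, so the loop is still the better fit if you need that.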

@ashex08

ashex08 commented Feb 17, 2021

I am getting the below error at this line: csvFile = open('askgretawhy.csv', 'a')

Traceback (most recent call last):
File "", line 1, in
PermissionError: [Errno 13] Permission denied: 'askgretawhy.csv'

Please help me understand where the issue is.

@lcontini

It’s most likely that you don’t have the appropriate permissions to write to or overwrite the file askgretawhy.csv, or to write to the folder where the program runs.

If the file already exists, I suggest running chmod 777 askgretawhy.csv and trying again. If not, try giving less strict permissions to the current folder, if you can.
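
One way to narrow this down is to check whether the target location is writable before opening the file. A minimal sketch using only the standard library; the `writable_csv_path` helper and the fallback to the home directory are assumptions for illustration, not part of the original script:

```python
import os
import tempfile
from pathlib import Path


def writable_csv_path(filename, folder="."):
    """Return a path for filename in folder, or in the home directory
    if the folder (or an existing file in it) is not writable."""
    target = Path(folder) / filename
    if target.exists():
        ok = os.access(target, os.W_OK)
    else:
        ok = os.access(folder, os.W_OK)
    return target if ok else Path.home() / filename


# A temporary folder stands in for the script's working directory.
demo_folder = tempfile.mkdtemp()
path = writable_csv_path("askgretawhy.csv", demo_folder)
print(path)
```

On Windows the same PermissionError also appears when the file is open in another program such as Excel, and this check does not detect that case.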

@ashex08

ashex08 commented Feb 17, 2021

How do I change the permissions? I am working in a basic Python editor and do not know where I can create the file so that Python can read it.

@riiyasharmaa

File "C:/Users/hp/Documents/1.py", line 5
consumer_key ='249QYJU8UqoS364VqIOoWmpJQ'
^
IndentationError: unexpected indent

I'm getting this error. Please help ASAP!

@mindyng

mindyng commented May 21, 2021

@riiyasharmaa you might need to unindent (tab to the left)

@fakemaria

Many thanks for the crawler, mate. I was looking for one and didn't have time to spend on it.

cheers

@Nathalie91

I get this error from Python:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 22-23: truncated \UXXXXXXXX escape
Any help, please!

@Nathalie91

Hello,
How can we add geolocation (for example, France) and more filters to the search in this script?

@coder-nooby

AttributeError Traceback (most recent call last)
C:\Users\ADMINI~1\AppData\Local\Temp/ipykernel_11192/3895188609.py in
19 #Use csv Writer
20 csvWriter = csv.writer(csvFile)
---> 21 for tweet in tweepy.Cursor(api.search, q="#unitedAIRLINES",count=100, lang="en", since="2021-09-14").items():
22 print (tweet.created_at, tweet.text)
23 csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8')])

AttributeError: 'API' object has no attribute 'search'

I am using the latest version of Anaconda Navigator 2.10 and jupyter Notebook 6.4.4.

I have regenerated the API tokens. How should I fix this?

@ansegura7

I am getting the exact error reported above.

I had to upgrade Anaconda and then installed tweepy 4.1.0; since then, the following command no longer works for me:

tweepy.Cursor(api.search, q=ht, since=since_date_str).items()

Any idea how to solve it? Do I have to go back to a previous version of tweepy? If so, which version?
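
Tweepy 4.0 renamed `API.search` to `API.search_tweets`, which is why the attribute is missing after the upgrade. A minimal sketch that works with either version by looking the method up at run time; the `get_search_method` helper and the `FakeApi` classes are hypothetical and exist only so the example runs without credentials:

```python
def get_search_method(api):
    """Return api.search_tweets (Tweepy >= 4.0) or api.search (older)."""
    method = getattr(api, "search_tweets", None)
    if method is None:
        method = getattr(api, "search", None)
    if method is None:
        raise AttributeError("API object has no search method")
    return method


class FakeApiV3:
    """Stand-in for a Tweepy 3.x API object."""
    def search(self, q):
        return "v3:" + q


class FakeApiV4:
    """Stand-in for a Tweepy 4.x API object."""
    def search_tweets(self, q):
        return "v4:" + q


print(get_search_method(FakeApiV3())("#unitedAIRLINES"))
print(get_search_method(FakeApiV4())("#unitedAIRLINES"))
```

With a real client this would be used as `tweepy.Cursor(get_search_method(api), q=...)`. Note that `since` is not among the documented parameters of `search_tweets`, so a date filter would probably have to go into the query string itself (for example `q="#unitedAIRLINES since:2021-09-14"`); treat that as an assumption to check against the Tweepy documentation. Installing a 3.x release of Tweepy is the other option if the script must stay unchanged.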
