@scrapehero
Last active June 2, 2021 07:02
How to scrape Historical Tweet Data from Twitter using Web Scraper Extension
{
  "_id": "twitter_feed",
  "startUrl": [
    "https://twitter.com/search?l=&q=web%20scraping%20since%3A2018-10-01%20until%3A2018-10-05&src=typd&lang=en"
  ],
  "selectors": [
    {
      "id": "tweet",
      "type": "SelectorElementScroll",
      "parentSelectors": [
        "_root"
      ],
      "selector": "div.tweet",
      "multiple": true,
      "delay": "2000"
    },
    {
      "id": "handle",
      "type": "SelectorText",
      "parentSelectors": [
        "tweet"
      ],
      "selector": "span.username",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "name",
      "type": "SelectorText",
      "parentSelectors": [
        "tweet"
      ],
      "selector": "strong.fullname",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "content",
      "type": "SelectorText",
      "parentSelectors": [
        "tweet"
      ],
      "selector": ".tweet-text",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "replies",
      "type": "SelectorText",
      "parentSelectors": [
        "tweet"
      ],
      "selector": "div.ProfileTweet-action.ProfileTweet-action--reply span.ProfileTweet-actionCountForPresentation",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "retweets",
      "type": "SelectorText",
      "parentSelectors": [
        "tweet"
      ],
      "selector": "div.ProfileTweet-action.ProfileTweet-action--retweet button.ProfileTweet-actionButton span.ProfileTweet-actionCountForPresentation",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "favorites",
      "type": "SelectorText",
      "parentSelectors": [
        "tweet"
      ],
      "selector": "div.ProfileTweet-action.ProfileTweet-action--favorite button.ProfileTweet-actionButton span.ProfileTweet-actionCountForPresentation",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "unix_timestamp",
      "type": "SelectorElementAttribute",
      "parentSelectors": [
        "tweet"
      ],
      "selector": "span._timestamp",
      "multiple": false,
      "extractAttribute": "data-time-ms",
      "delay": 0
    },
    {
      "id": "published_date",
      "type": "SelectorElementAttribute",
      "parentSelectors": [
        "tweet"
      ],
      "selector": ".time a.tweet-timestamp",
      "multiple": false,
      "extractAttribute": "title",
      "delay": 0
    },
    {
      "id": "url",
      "type": "SelectorElementAttribute",
      "parentSelectors": [
        "tweet"
      ],
      "selector": "a.tweet-timestamp",
      "multiple": false,
      "extractAttribute": "href",
      "delay": 0
    }
  ]
}
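
The `startUrl` in the sitemap above hard-codes one keyword and one date range. Below is a minimal Python sketch for generating that URL for other keywords and dates, assuming the search page still accepts the same query layout (`since:` and `until:` operators inside `q`, plus the `l`, `src`, and `lang` parameters seen above). The helper names `build_start_url` and `with_date_range` are hypothetical and are not part of the Web Scraper extension. The sketch only rebuilds the URL and leaves the CSS selectors untouched.

```python
from urllib.parse import quote

# Base of the search URL, copied from the sitemap's startUrl entry.
SEARCH_BASE = "https://twitter.com/search"


def build_start_url(keywords: str, since: str, until: str, lang: str = "en") -> str:
    """Return a search URL shaped like the sitemap's startUrl entry.

    `since` and `until` are YYYY-MM-DD strings, as in the original URL.
    """
    query = f"{keywords} since:{since} until:{until}"
    # safe="" makes quote() percent-encode ':' as %3A; spaces become %20.
    encoded = quote(query, safe="")
    return f"{SEARCH_BASE}?l=&q={encoded}&src=typd&lang={lang}"


def with_date_range(sitemap: dict, keywords: str, since: str, until: str) -> dict:
    """Return a shallow copy of a sitemap dict with startUrl replaced."""
    updated = dict(sitemap)
    updated["startUrl"] = [build_start_url(keywords, since, until)]
    return updated


if __name__ == "__main__":
    # Reproduces the URL used in the sitemap above.
    print(build_start_url("web scraping", "2018-10-01", "2018-10-05"))
```

To use it, load the sitemap with `json.load`, pass the resulting dict to `with_date_range`, and write the result back out with `json.dumps` before importing it into the extension.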
@JagosRadovic

This doesn't work with the new Twitter layout. Please update! Thanks.

@adit-negi

Could you please update this?

@silviamosquera

It's not working. Could you please update it? Thanks!

@afghanpasha

Hi,

It's not working, please update the code.

Thanks

@lsciutto

Hello. I have tried the code and it is not working. A scrape file is not being generated.

@codeBlooded1997

Hi,

It is not working, please update it.

Thanks.

@serdarselimys

Has anybody found an updated version? If so, can you please share it?
