@lobstrio
Last active January 8, 2021 14:13
Really simple web scraping Python script for the first Tweets of Donald Trump, using Requests and lxml
#!/usr/bin/python3
# coding: utf-8
import requests
from lxml import html


def extract():
    """
    Print the Tweets found on the first page of @realDonaldTrump
    """
    # initialise the HTTP session
    r = requests.session()
    # fetch the page source
    response = r.get(url='https://twitter.com/realDonaldTrump')
    # parse the page
    page = html.fromstring(response.text)
    tweets = page.xpath("//li[contains(@class, 'js-stream-item stream-item stream-item')]")
    for tweet in tweets:
        text = tweet.xpath(".//p[contains(@class, 'TweetTextSize TweetTextSize--normal js-tweet-text tweet-text')]/text()")
        date = tweet.xpath(".//small[@class='time']/a/@title")
        if text:
            # the date lookup can come back empty, so avoid an IndexError
            print('Dated: {}'.format(date[0] if date else 'unknown'))
            print('Text: {}'.format(text[0]))
            print('\n')


# run the function
extract()
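
The parsing step of the script can be tried without any network access. The sketch below is not part of the original gist: the `parse_tweets` helper and the `SAMPLE` markup are hypothetical, written only to mimic the class names the script targets, and the shorter `contains()` patterns are an assumed simplification of the original XPath expressions. It returns `(date, text)` pairs rather than printing them, and uses `None` when an item has no date.

```python
from lxml import html

# Hypothetical sample markup that mimics the class names targeted by the script.
SAMPLE = """
<html><body><ol>
  <li class="js-stream-item stream-item stream-item">
    <small class="time"><a title="10:00 AM - 1 Jan 2020">1 Jan</a></small>
    <p class="TweetTextSize TweetTextSize--normal js-tweet-text tweet-text">First tweet</p>
  </li>
  <li class="js-stream-item stream-item stream-item">
    <p class="TweetTextSize TweetTextSize--normal js-tweet-text tweet-text">No date here</p>
  </li>
  <li class="js-stream-item stream-item stream-item">
    <small class="time"><a title="11:00 AM - 2 Jan 2020">2 Jan</a></small>
  </li>
</ol></body></html>
"""


def parse_tweets(source):
    """Return a list of (date, text) tuples found in an HTML string."""
    page = html.fromstring(source)
    # Assumed simplification: match on a single class token instead of the full string.
    items = page.xpath("//li[contains(@class, 'js-stream-item')]")
    results = []
    for item in items:
        text = item.xpath(".//p[contains(@class, 'tweet-text')]/text()")
        date = item.xpath(".//small[@class='time']/a/@title")
        if not text:
            # Items without a text node are skipped, as in the script above.
            continue
        results.append((date[0] if date else None, text[0].strip()))
    return results


if __name__ == "__main__":
    for date, text in parse_tweets(SAMPLE):
        print(date, text)
```

Keeping the parsing in a function that takes a string makes it possible to check the XPath expressions against saved HTML before pointing the script at a live page.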