@ebergam
Created January 9, 2021 12:13
Indexer for ECB press releases
import requests
from lxml import html

links = []
# The ECB publishes one index page of press conferences per year.
for year in range(1998, 2020 + 1):
    url = 'https://www.ecb.europa.eu/press/pressconf/{}/html/index_include.en.html'.format(year)
    r = requests.get(url)
    tree = html.fromstring(r.content)
    # Take the href of every <a> that wraps a <span> labelled 'English'.
    links_y = tree.xpath("//a/span[contains(text(),'English')]/../@href")
    links.extend(links_y)
    # Report the count for this year only; the running total is printed at the end.
    print(str(year) + ": found {} links".format(len(links_y)))
print("In total {} links found".format(len(links)))
print("Economic Emergency solved!!")
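The script collects raw `href` values, which may be site-relative paths rather than full URLs. This is an assumption about the index pages, not something the gist states. A minimal sketch of resolving them against the site root and dropping duplicates before fetching each page could look like this; `BASE_URL`, `to_absolute`, and the sample paths are hypothetical names and values introduced for illustration.

```python
from urllib.parse import urljoin

# Assumed site root, taken from the index URL used by the script.
BASE_URL = "https://www.ecb.europa.eu"

def to_absolute(hrefs, base=BASE_URL):
    """Resolve each href against base and drop duplicates, keeping order."""
    seen = set()
    result = []
    for href in hrefs:
        full = urljoin(base, href)  # leaves already-absolute URLs unchanged
        if full not in seen:
            seen.add(full)
            result.append(full)
    return result

# Hypothetical sample in the shape the index pages might return.
sample = [
    "/press/pressconf/2020/html/example1.en.html",
    "/press/pressconf/2020/html/example1.en.html",
    "https://www.ecb.europa.eu/press/pressconf/2020/html/example2.en.html",
]
absolute_links = to_absolute(sample)
```

In the script, `to_absolute(links)` would be called once after the loop, so that the list can be passed straight to `requests.get`.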