What is scraping?
Every webpage contains data. Sometimes, this data proves useful in a context other than the user interface of the webpage itself. Happily, with the right tools, we can extract data from nearly any webpage. Indeed, we might think of a webpage as a “really bad” API (credit: Forest Gregg), with which we can interact and collect information.
At DataMade, we use the lxml library to work with webpages. lxml parses HTML into a tree of elements that we can search and process. It’s well-documented, expansive, and popular.
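To make the “really bad API” idea concrete, here is a minimal sketch of pulling structured data out of HTML with lxml. The page content, the table id "members", and the helper name extract_members are all made up for illustration; a real scraper would first fetch the page source over the network and would need selectors matched to that page's actual markup.

```python
from lxml import html

# Hypothetical page content, inlined so the sketch runs without a network call.
PAGE = """
<html>
  <body>
    <h1>City Council Members</h1>
    <table id="members">
      <tr><th>Name</th><th>Ward</th></tr>
      <tr><td>Ada Example</td><td>1</td></tr>
      <tr><td>Ben Sample</td><td>2</td></tr>
    </table>
  </body>
</html>
"""


def extract_members(page_source):
    """Return one dict per data row of the (hypothetical) members table."""
    # Parse the HTML string into an element tree.
    tree = html.fromstring(page_source)

    # Select only rows that contain <td> cells, which skips the header row.
    rows = tree.xpath('//table[@id="members"]//tr[td]')

    return [
        {
            "name": row.xpath("td[1]/text()")[0].strip(),
            "ward": int(row.xpath("td[2]/text()")[0]),
        }
        for row in rows
    ]


members = extract_members(PAGE)
for member in members:
    print(member["name"], member["ward"])
```

The XPath expressions do the work a well-designed API would do for us: they name the part of the page we care about and return it in a form we can use elsewhere.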