This script downloads web pages and writes the title, URL, and text of each to a separate file.
Usage:
- Create a file named articles containing a newline-separated list of URLs to download
- Run: python extract.py
Prerequisite: the newspaper library is used for text extraction. Install: pip install newspaper (on Python 3 the package is published as newspaper3k, so use pip install newspaper3k)
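
For orientation, here is a minimal sketch of how a script following the usage above might be put together. It is not the actual contents of extract.py. The function names, the output file name output.txt, and the record layout are all assumptions made for illustration; only the newspaper calls (Article, download, parse, title, text) come from the library itself.

```python
import sys


def read_urls(path):
    """Return the non-empty lines of `path`, stripped, as a list of URLs."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def format_record(title, url, text):
    """Join title, URL, and text into one plain-text record.

    The layout (title, URL, blank line, body) is an assumption.
    """
    return f"{title}\n{url}\n\n{text}\n"


def fetch_article(url):
    """Download and parse one page with newspaper; return (title, text)."""
    # Imported lazily so the helpers above can be used without newspaper installed.
    from newspaper import Article

    article = Article(url)
    article.download()
    article.parse()
    return article.title, article.text


def main(url_file="articles", out_file="output.txt"):
    # "output.txt" is a hypothetical name; the real output path is not documented here.
    with open(out_file, "w", encoding="utf-8") as out:
        for url in read_urls(url_file):
            try:
                title, text = fetch_article(url)
            except Exception as exc:
                # A failed download or parse is reported and the URL is skipped.
                print(f"skipping {url}: {exc}", file=sys.stderr)
                continue
            out.write(format_record(title, url, text))
            out.write("\n")
```

Calling main() from the directory that holds the articles file would process every listed URL in order. Skipping failed URLs instead of aborting is a design choice of this sketch, so that one dead link does not discard the rest of the batch.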