@spyros12
spyros12 / crawler.py
Created August 30, 2020 15:19 — forked from AO8/crawler.py
Crawl a website and gather all internal links with Python and BeautifulSoup.
# Adapted from example in Ch.3 of "Web Scraping With Python, Second Edition" by Ryan Mitchell
import re
import requests
from bs4 import BeautifulSoup

pages = set()

def get_links(page_url):
    global pages
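
The preview above stops after the first line of get_links, so the gist's actual crawling logic is not visible here. As a rough illustration of what "gather all internal links" can look like, here is a minimal sketch that is not the gist's code. It assumes "internal" means "same host as the starting URL", and the names extract_internal_links, fetch_html, crawl, and the max_pages limit are all hypothetical. It also uses urllib.parse rather than the re module the gist imports.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def extract_internal_links(html, base_url):
    """Return the set of absolute URLs in html that share base_url's host."""
    host = urlparse(base_url).netloc
    links = set()
    soup = BeautifulSoup(html, "html.parser")
    for anchor in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, anchor["href"]))
        parts = urlparse(absolute)
        # Assumption: "internal" means same host over http(s).
        if parts.scheme in ("http", "https") and parts.netloc == host:
            links.add(absolute)
    return links


def fetch_html(url):
    """Hypothetical helper: download a page and return its HTML text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def crawl(start_url, max_pages=50, fetch=fetch_html):
    """Breadth-first crawl; returns every internal URL discovered."""
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except requests.RequestException:
            continue  # Skip pages that fail to download.
        fetched += 1
        for link in extract_internal_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

Calling crawl("https://example.com/") would return a set of same-host URLs reachable from that page, fetching at most max_pages pages. Keeping the visited URLs in a local set instead of a module-level global, and passing the fetch function as a parameter, makes the crawler easier to test without network access.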
@spyros12
spyros12 / crawler.py
Created August 30, 2020 15:18 — forked from AO8/crawler.py