@bradyjiang
Created July 21, 2019 15:52
Solution 2: BeautifulSoup
from bs4 import BeautifulSoup

# 20190720, from: https://stackoverflow.com/questions/30565404/remove-all-style-scripts-and-html-tags-from-an-html-page
# str_html is the raw HTML string to clean.
soup = BeautifulSoup(str_html, "html.parser")
# Remove every <head> element together with its contents.
for s in soup(["head"]):
    s.decompose()
cleaned_html = str(soup)
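
The snippet above leaves str_html undefined, so here is a minimal self-contained sketch of the same approach that can be run as-is. The function name strip_head and the sample markup are hypothetical, added only for illustration, and the choice of "html.parser" (Python's built-in parser) is an assumption rather than something the original specifies.

```python
from bs4 import BeautifulSoup


def strip_head(str_html: str) -> str:
    """Return str_html with every <head> element and its contents removed.

    strip_head is a hypothetical helper name, not part of the original gist.
    """
    # Assumption: use Python's built-in "html.parser"; other parsers
    # (lxml, html5lib) may normalize the markup differently.
    soup = BeautifulSoup(str_html, "html.parser")
    # Calling soup([...]) is shorthand for soup.find_all([...]).
    for tag in soup(["head"]):
        tag.decompose()  # removes the tag and everything inside it
    return str(soup)


# Hypothetical sample input for illustration.
sample = (
    "<html><head><title>t</title><style>p{}</style></head>"
    "<body><p>Hello</p></body></html>"
)
cleaned = strip_head(sample)
print(cleaned)
```

To drop scripts and styles that appear outside the head as well, the list passed to soup(...) could be extended, for example ["head", "script", "style"].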