Last active
January 26, 2016 23:53
Timing execution of BeautifulSoup's prettify and lxml.tostring()
I used bs4 when I started out scraping and quickly switched to lxml; there is nothing better than lxml. I just don't like how hard it is to install, that's my only complaint about lxml.
BeautifulSoup falls back to Python's built-in html.parser unless another parser such as lxml is installed or requested explicitly.
It is possible to use lxml as @Justin42 said.
http://www.crummy.com/software/BeautifulSoup/bs4/doc/#you-need-a-parser
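For anyone who wants to pin the parser rather than rely on whatever is installed, here is a minimal sketch. It assumes bs4 is installed and lxml may or may not be; the `make_soup` helper and the sample `HTML` string are hypothetical names used only for illustration.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample markup, only for illustration.
HTML = "<html><body><p>Hello <b>world</b></p></body></html>"


def make_soup(markup, preferred="lxml"):
    """Parse with the preferred parser, falling back to the stdlib one."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # bs4 raises this when the requested parser is not installed.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup(HTML)
print(soup.b.get_text())
```

Passing the parser name as the second argument keeps the behaviour the same across machines, which matters when comparing timings.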
I believe most people writing scrapers in Python regard lxml as the fastest parser.
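The speed claim is easy to check locally. This is a rough timing sketch, not the gist's original script: the sample `HTML`, the helper names `build_candidates` and `time_candidates`, and the repeat count are all assumptions, and any library that is not installed is simply skipped.

```python
import timeit

# Hypothetical sample document; real pages will give different numbers.
HTML = "<html><body>" + "<p>Hello <b>world</b></p>" * 200 + "</body></html>"


def build_candidates():
    """Return {label: zero-argument callable}, skipping missing libraries."""
    candidates = {}
    try:
        from bs4 import BeautifulSoup
        candidates["bs4 html.parser + prettify"] = (
            lambda: BeautifulSoup(HTML, "html.parser").prettify()
        )
    except ImportError:
        pass
    try:
        import lxml.html
        candidates["lxml.html + tostring"] = (
            lambda: lxml.html.tostring(
                lxml.html.fromstring(HTML), pretty_print=True
            )
        )
    except ImportError:
        pass
    return candidates


def time_candidates(candidates, number=20):
    """Time each callable and return {label: average seconds per call}."""
    return {
        label: timeit.timeit(func, number=number) / number
        for label, func in candidates.items()
    }


if __name__ == "__main__":
    for label, seconds in time_candidates(build_candidates()).items():
        print(f"{label}: {seconds * 1000:.2f} ms per call")
```

Each callable parses and serializes in one step, so the figures cover the whole round trip rather than serialization alone.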