@gshmu
Last active May 25, 2017 05:50
python: bs4 html.parser self-close tag error
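from bs4 import BeautifulSoup

# Three <img> tags written without a closing slash, which HTML allows for void elements.
html_str = "<html><body><p>tag-error<img src='a'><img src='b'><img src='c'></p></body></html>"
# Parse with the stdlib-backed 'html.parser' tree builder and pretty-print the result.
print(BeautifulSoup(html_str, 'html.parser').prettify())
# Output printed by the line above, as recorded in this gist:
__out__ = """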
<html>
<body>
<p>
tag-error
<img src="a">
<img src="b">
<img src="c"/>
</img>
</img>
</p>
</body>
</html>
"""

gshmu commented May 25, 2017

Expected:
<img src='a' /><img src='b' /><img src='c' />

Actual:
<img src="a"><img src="b"><img src="c"/></img></img>
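One possible workaround is to rewrite void tags into the explicit self-closing form before parsing, so the markup already matches the expected shape. The sketch below is a standard-library preprocessing step only; `VOID_TAGS`, `_VOID_RE`, and `self_close_void_tags` are hypothetical names. Whether this avoids the nesting depends on the installed bs4 version treating explicit `/>` tags as complete, which is an assumption here and not something this gist confirms. The regex is also deliberately simple and will misfire on attribute values that contain `<` or `>`.

```python
import re

# Void elements as listed in the HTML standard.
VOID_TAGS = (
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
)

# Match an opening void tag, with or without a trailing slash.
# The lookahead keeps "col" from matching "colgroup".
_VOID_RE = re.compile(
    r"<(%s)(?=[\s/>])([^<>]*?)\s*/?>" % "|".join(VOID_TAGS),
    re.IGNORECASE,
)


def self_close_void_tags(markup):
    """Return markup with every void tag written as <tag ... />."""
    return _VOID_RE.sub(lambda m: "<%s%s />" % (m.group(1), m.group(2)), markup)


html_str = "<html><body><p>tag-error<img src='a'><img src='b'><img src='c'></p></body></html>"
print(self_close_void_tags(html_str))
```

The result can then be passed to `BeautifulSoup(..., 'html.parser')` in place of the original string. Switching to a different tree builder such as `lxml` or `html5lib`, if installed, is another option worth trying.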
