@bradmontgomery
Created January 16, 2011 02:26
Playing with bleach, and not sure if I'm doing it wrong.
>>> from bleach import Bleach
>>> from janitor import html_tags # my stuff
>>>
>>> html_tags.basic_document_tags
['a', 'abbr', 'acronym', 'blockquote', 'cite', 'code', 'dd', 'del', 'dfn', 'dl', 'dt', 'em', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'img', 'ins', 'kbd', 'li', 'ol', 'p', 'pre', 'q', 'samp', 'strong', 'ul', 'body', 'DOCTYPE', 'head', 'html', 'link', 'meta', 'style', 'title']
>>>
>>> s = '<!DOCTYPE html><html><head><title>t</title></head><body><p><em>Some</em> stuff</p></body></html>'
>>>
>>> bl = Bleach()
>>> bl.clean(s, tags=html_tags.basic_document_tags)
u'<title>t</title><p><em>Some</em> stuff</p>'
>>>
>>> bl.clean(s, tags=['html', 'head', 'title', 'body', 'p'])
u'<title>t</title><p>&lt;em&gt;Some&lt;/em&gt; stuff</p>'
>>> bl.clean(s, tags=['doctype', 'html', 'head', 'title', 'body', 'p'])
u'<title>t</title><p>&lt;em&gt;Some&lt;/em&gt; stuff</p>'
>>> bl.clean(s, tags=['p'])
u'&lt;html&gt;&lt;head&gt;&lt;title&gt;t&lt;/title&gt;&lt;/head&gt;&lt;body&gt;<p>&lt;em&gt;Some&lt;/em&gt; stuff</p>&lt;/body&gt;&lt;/html&gt;'
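
The last call above suggests a simple rule: tags on the allowlist pass through, and everything else is escaped into visible text. Below is a minimal standard-library sketch of that rule only. It is not bleach's implementation: `AllowlistEscaper` and `clean_sketch` are hypothetical names invented for illustration, attributes are dropped, and the sketch does not model HTML5 fragment parsing, which is probably why `html`, `head`, and `body` disappear in the earlier calls even when they are allowlisted.

```python
from html import escape
from html.parser import HTMLParser


class AllowlistEscaper(HTMLParser):
    """Hypothetical helper: keep allowlisted tags, escape everything else."""

    def __init__(self, tags):
        super().__init__(convert_charrefs=True)
        self.allowed = {t.lower() for t in tags}
        self.out = []

    def _emit(self, markup, tag):
        # Allowed tags pass through; any other tag becomes visible text.
        if tag in self.allowed:
            self.out.append(markup)
        else:
            self.out.append(escape(markup, quote=False))

    def handle_starttag(self, tag, attrs):
        # Assumption: attributes are dropped to keep the sketch small.
        self._emit("<%s>" % tag, tag)

    def handle_endtag(self, tag):
        self._emit("</%s>" % tag, tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    # handle_decl is not overridden, so <!DOCTYPE ...> is silently dropped.


def clean_sketch(text, tags):
    """Return text with only the allowlisted tags left as markup."""
    parser = AllowlistEscaper(tags)
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

With the same `s` as above, `clean_sketch(s, ['p'])` produces the same string as the final `bl.clean(s, tags=['p'])` call. Unlike the bleach output shown, though, this sketch keeps `<html>`, `<head>`, and `<body>` whenever they are allowlisted, so it only accounts for the escaping half of the behaviour.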