@stupidbodo
Last active August 29, 2015 14:03
Formatting HTML Strings in Python
# Two options for formatting HTML strings in Python:
# 1) You don't care about excess whitespace: use triple quotes
# 2) You want to get rid of excess whitespace: use the standard escape quotes method
##############################
# Triple quotes method
##############################
url = "http://example.com"
width = "100"
height = "100"
sample = """
<a href="{0}"><img src="http://example.com/img/1.jpg" width="{1}" height="{2}"
style="border-style: none"/></a>
<img src="http://example/pixel/" height="1" width="1"/>
""".format(url, width, height)
# A triple-quoted string keeps whatever indentation you type inside it. If the snippet is indented
# to match the surrounding code, use the textwrap module from Python's stdlib to strip the common leading whitespace
import textwrap
sample = textwrap.dedent(sample)
# When you print the snippet, you will notice there is still whitespace (a line break) between the attributes "height" and "style".
# textwrap.dedent only removes common leading indentation; it does not remove that line break between
# "height" and "style". If you want to get rid of it, use the standard escape quotes method.
print(sample)
# Output
# <a href="http://example.com"><img src="http://example.com/img/1.jpg" width="100" height="100"
# style="border-style: none"/></a>
# <img src="http://example/pixel/" height="1" width="1"/>
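# A minimal sketch of the two whitespace problems described above, assuming the template is written
# inside an indented block (the function name build_snippet is hypothetical, not part of the original).
# textwrap.dedent handles the leading indentation; collapsing the remaining line breaks is a separate
# step, done here with str.split and " ".join. Note that this also collapses whitespace inside
# attribute values, so it is only safe when those values contain no meaningful runs of spaces.

```python
import textwrap


def build_snippet(url, width, height):
    # The triple-quoted string inherits the indentation of the function body.
    template = """
        <a href="{0}"><img src="http://example.com/img/1.jpg" width="{1}" height="{2}"
        style="border-style: none"/></a>
        <img src="http://example/pixel/" height="1" width="1"/>
    """
    return template.format(url, width, height)


indented = build_snippet("http://example.com", "100", "100")

# dedent removes the leading spaces common to every line, but keeps the line breaks.
dedented = textwrap.dedent(indented)

# split() with no arguments splits on any run of whitespace (spaces, tabs, newlines),
# so joining with a single space leaves exactly one space between tokens.
collapsed = " ".join(dedented.split())

print(dedented)
print(collapsed)
```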
# If you need the snippet as actual HTML, you can stop here.
# If you need it as an escaped, double-quoted string, proceed to the next step.
# Use json.dumps to escape the double quotes so that you can use the snippet without relying on triple quotes
import json
sample = json.dumps(sample)
print(sample)
# "\n<a href=\"http://example.com\"><img src=\"http://example.com/img/1.jpg\" width=\"100\" height=\"100\"\nstyle=\"border-style: none\"/></a>\n<img src=\"http://example/pixel/\" height=\"1\" width=\"1\"/>\n"
##############################
# Standard Escape Quotes Method
##############################
# When formatting HTML strings, you normally want to get rid of all unwanted whitespace,
# so the standard escape quotes method is usually the preferred one
url = "http://example.com"
width = "100"
height = "100"
# Each piece that is continued on the next line must end with a space wherever two attributes meet,
# otherwise the attributes run together (height="100"style=...)
sample = "<a href=\"{0}\"><img src=\"http://example.com/img/1.jpg\" width=\"{1}\" height=\"{2}\" "\
"style=\"border-style: none\"/></a><img src=\"http://example/pixel/\" height=\"1\" "\
"width=\"1\"/>".format(url, width, height)
print(sample)
# Notice that everything is on one line without any excess whitespace
# <a href="http://example.com"><img src="http://example.com/img/1.jpg" width="100" height="100" style="border-style: none"/></a><img src="http://example/pixel/" height="1" width="1"/>
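# One possible alternative, offered as a sketch and not as one of the original methods: when the HTML
# only contains double quotes, single-quoted Python literals avoid the backslash escapes entirely,
# and parentheses let adjacent literals be joined without line-continuation backslashes.

```python
url = "http://example.com"
width = "100"
height = "100"

# Single-quoted literals may contain double quotes unescaped. Adjacent literals inside
# parentheses are joined at compile time, so no newline or indentation is introduced.
sample = (
    '<a href="{0}"><img src="http://example.com/img/1.jpg" width="{1}" height="{2}" '
    'style="border-style: none"/></a>'
    '<img src="http://example/pixel/" height="1" width="1"/>'
).format(url, width, height)

print(sample)
```

# The trailing space at the end of the first literal is still needed; it is what separates
# the "height" and "style" attributes.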
# The problem with the standard escape quotes method: how do you escape all the quotes in a raw string?
# Solution 1 - Do it manually
# Solution 2 - Use an online tool like http://bernhardhaeussner.de/odd/json-escape/
# Solution 3 - Use json.dumps
# Using json.dumps to escape double quotes
# Triple quotes are used here only to hold the unescaped string in a variable
# Keep the string on one line so you do not introduce any unwanted whitespace
import json
raw = """<a href="{0}"><img src="http://example.com/img/1.jpg" width="{1}" height="{2}" style="border-style: none"/></a><img src="http://example/pixel/" height="1" width="1"/>"""
raw = raw.format(url, width, height)
sample = json.dumps(raw)
print(sample)
# "<a href=\"http://example.com\"><img src=\"http://example.com/img/1.jpg\" width=\"100\" height=\"100\" style=\"border-style: none\"/></a><img src=\"http://example/pixel/\" height=\"1\" width=\"1\"/>"
# If you find yourself doing this often, consider using a templating engine like Jinja2
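# As a lighter stand-in for a full templating engine (this is the stdlib string.Template class,
# not Jinja2's API), named placeholders keep the template readable and avoid positional indexes.
# The variable name snippet is an assumption made for this sketch.

```python
from string import Template

# $name placeholders are filled by keyword, so the template reads like the final HTML.
snippet = Template(
    '<a href="$url"><img src="http://example.com/img/1.jpg" '
    'width="$width" height="$height" style="border-style: none"/></a>'
)

sample = snippet.substitute(url="http://example.com", width="100", height="100")
print(sample)
```

# substitute() raises KeyError if a placeholder is left unfilled, which catches typos in names early.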
##############################
# Other formatting tips
##############################
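# To test the escaped HTML string snippet, you can use document.write
# Add this to some.html page (output from json.dumps already includes the surrounding double quotes,
# so drop the extra pair when pasting it)
# <script>
# document.write("PUT_SAMPLE_SNIPPET_HERE")
# </script>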
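# Unescape a string
# The r prefix keeps the backslashes literal, so there is actually something to unescape
# On Python 2 only, string_to_unescape.decode("string-escape") does the same job
import codecs
string_to_unescape = r"\n<a href=\"http://example.com\">"
print(codecs.decode(string_to_unescape, "unicode_escape"))
# (blank line produced by the \n escape)
# <a href="http://example.com">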
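# If the escaped string came from json.dumps, json.loads is its exact inverse. A small round-trip
# sketch, using a shortened sample string chosen for illustration:

```python
import json

raw = '<a href="http://example.com"><img src="http://example.com/img/1.jpg"/></a>'

# dumps adds the surrounding double quotes and backslash-escapes the inner ones.
escaped = json.dumps(raw)

# loads parses that JSON string literal back into the original text.
restored = json.loads(escaped)

print(escaped)
print(restored)
```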