@tonyseek
Last active December 17, 2015 23:09
Declare the charset of XHTML files by appending a Content-Type <meta> tag to <head>.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function

from sys import argv, exit, stderr

from lxml import etree

xhtml_namespace = "http://www.w3.org/1999/xhtml"
content_type_attrs = {
    "http-equiv": "Content-Type",
    "content": "application/xhtml+xml; charset=utf-8",
}


def xhtml_declare_charset(filename):
    print("* %s" % filename)
    # Drop ignorable whitespace so pretty_print can re-indent the output.
    parser = etree.XMLParser(remove_blank_text=True)
    html_tag = etree.parse(filename, parser)
    head_tag = html_tag.find("{%s}head" % xhtml_namespace)
    if head_tag is None:
        # Not an XHTML document, or it has no <head>: leave the file alone.
        print("  skipped: no XHTML <head> element", file=stderr)
        return
    meta_tag = etree.Element("{%s}meta" % xhtml_namespace, content_type_attrs)
    head_tag.append(meta_tag)
    # Overwrite the input file in place.
    html_tag.write(filename, pretty_print=True, encoding="utf-8")


if __name__ == "__main__":
    if len(argv) < 2:
        print("usage: %s [filename_1] [filename_2]\n" % argv[0], file=stderr)
        exit(1)
    for filename in argv[1:]:
        xhtml_declare_charset(filename)
#!/usr/bin/env sh
find . -type f -name "*.xhtml" -exec python charsets.py {} +
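
Note that the script appends a new <meta> tag on every run, so running the find command twice leaves two declarations in each file. The following is a minimal sketch of an idempotent variant, not part of the original gist. The helper name ensure_charset_meta and the in-memory SAMPLE document are hypothetical, and the fallback to the standard library's xml.etree.ElementTree is an assumption made only so the sketch runs without lxml installed; it uses just the calls both libraries share.

```python
try:
    from lxml import etree  # the library the script above uses
except ImportError:
    # Assumption: fall back to the stdlib, which shares the calls used here.
    import xml.etree.ElementTree as etree

XHTML_NS = "http://www.w3.org/1999/xhtml"


def ensure_charset_meta(root):
    """Append the Content-Type <meta> to <head> unless one already exists.

    Hypothetical helper: returns True if a tag was appended, else False.
    """
    head = root.find("{%s}head" % XHTML_NS)
    if head is None:
        return False
    for meta in head.findall("{%s}meta" % XHTML_NS):
        if meta.get("http-equiv", "").lower() == "content-type":
            return False  # already declared; leave the document alone
    etree.SubElement(head, "{%s}meta" % XHTML_NS, {
        "http-equiv": "Content-Type",
        "content": "application/xhtml+xml; charset=utf-8",
    })
    return True


# Made-up sample document, parsed in memory instead of from a file.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head><body/></html>"
)
root = etree.fromstring(SAMPLE)
first = ensure_charset_meta(root)   # appends the tag
second = ensure_charset_meta(root)  # finds the existing tag, does nothing
meta_count = len(root.findall("{%s}head/{%s}meta" % (XHTML_NS, XHTML_NS)))
print(first, second, meta_count)  # True False 1
```

To use this in the script, one would presumably call the helper on html_tag.getroot() and write the file back only when it returns True.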