@yoander
Last active January 9, 2020 03:19
Python script to pretty print XML files
#!/usr/bin/python
# Python 2 script: pretty print the XML file named on the command line.
import os
import re
import HTMLParser as parser
import xml.dom.minidom as minidom
import sys

try:
    # Read the file name from the first command-line argument
    filename = sys.argv[1]
    if os.path.isfile(filename) and os.access(filename, os.R_OK):
        # Open the file in read-only mode; "with" closes it afterwards
        with open(filename, 'r') as f:
            # Read the file and decode HTML entities
            xml = parser.HTMLParser().unescape(f.read())
        # Prettify the XML
        xml = minidom.parseString(xml).toprettyxml()
        # minidom adds extra whitespace before/after CDATA sections;
        # strip it so the CDATA stays attached to its enclosing tags
        xml = re.sub(r'>\s+<!', '><!', xml)
        xml = re.sub(r']>\s+<', ']><', xml)
        # Remove empty lines
        # Thanks to http://stackoverflow.com/questions/1140958/whats-a-quick-one-liner-to-remove-empty-lines-from-a-python-string
        print "".join([s for s in xml.strip().splitlines(True) if s.strip()])
    else:
        print "File is missing or is not readable!"
except IndexError:
    print "You must specify a file name!"
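
The script above is written for Python 2 (print statements, the `HTMLParser` module). Below is a minimal sketch of the same steps for Python 3, not the author's code. It assumes `html.unescape` as the stand-in for `HTMLParser().unescape`, and the helper name `prettify_xml` is hypothetical. As in the original, entities are decoded before parsing, so input containing `&amp;` or `&lt;` may stop being well-formed XML.

```python
#!/usr/bin/env python3
import html
import re
import sys
import xml.dom.minidom as minidom


def prettify_xml(text):
    """Pretty print an XML string (hypothetical helper, not in the original)."""
    # Decode HTML entities first, mirroring the original script
    text = html.unescape(text)
    pretty = minidom.parseString(text).toprettyxml()
    # Drop any whitespace minidom leaves around CDATA sections
    pretty = re.sub(r'>\s+<!', '><!', pretty)
    pretty = re.sub(r']>\s+<', ']><', pretty)
    # Remove empty lines
    return "".join(s for s in pretty.strip().splitlines(True) if s.strip())


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("You must specify a file name!")
    else:
        try:
            with open(sys.argv[1], 'r') as f:
                print(prettify_xml(f.read()))
        except OSError:
            print("File is missing or is not readable!")
```

Catching `OSError` around `open` replaces the `os.path.isfile` / `os.access` check; that is a design choice of this sketch, which avoids a gap between checking the file and reading it. Exact indentation of the output can differ between Python versions.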