@edison12a
Forked from yoander/xmlpp.py
Created April 26, 2018 08:36
Python script to pretty print XML files
#!/usr/bin/python
# Note: this is a Python 2 script (HTMLParser module, print statements).
import os
import re
import sys
import HTMLParser as parser
import xml.dom.minidom as minidom

try:
    # Read the file name from the first command-line argument
    filename = sys.argv[1]
    if os.path.isfile(filename) and os.access(filename, os.R_OK):
        # Open the file in read-only mode and decode HTML entities
        with open(filename, 'r') as f:
            xml = parser.HTMLParser().unescape(f.read())
        # Prettify the XML
        xml = minidom.parseString(xml).toprettyxml()
        # minidom adds extra whitespace before/after CDATA sections;
        # remove it
        xml = re.sub(r'>\s+<!', '><!', xml)
        xml = re.sub(r']>\s+<', ']><', xml)
        # Remove empty lines
        # Thanks to http://stackoverflow.com/questions/1140958/whats-a-quick-one-liner-to-remove-empty-lines-from-a-python-string
        print "".join([s for s in xml.strip().splitlines(True) if s.strip()])
    else:
        print "File is missing or is not readable!"
except IndexError:
    print "You must specify a file name!"
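
The script above targets Python 2. For readers on Python 3, here is a minimal sketch of the same steps as a reusable function. It assumes `html.unescape` is an acceptable stand-in for `HTMLParser().unescape`, and the function name `pretty_print_xml` is hypothetical, not part of the original gist.

```python
import html
import re
import xml.dom.minidom as minidom


def pretty_print_xml(text):
    """Return `text` (XML, possibly HTML-escaped) as indented XML.

    Hypothetical Python 3 port of the steps in the script above.
    """
    # Decode HTML entities (assumed stand-in for HTMLParser().unescape)
    xml = html.unescape(text)
    # Parse and re-serialize with minidom's default indentation
    xml = minidom.parseString(xml).toprettyxml()
    # Remove any whitespace minidom places before/after CDATA sections
    xml = re.sub(r'>\s+<!', '><!', xml)
    xml = re.sub(r']>\s+<', ']><', xml)
    # Drop empty lines while keeping the remaining line endings
    return "".join(s for s in xml.strip().splitlines(True) if s.strip())
```

A call such as `print(pretty_print_xml(open("input.xml").read()))` would mirror what the script prints, where `input.xml` is a placeholder file name. Note that, as in the original, entities are decoded before parsing, so input that relies on escaped `&lt;` or `&amp;` inside text content may no longer parse.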