function formatXml(xml) {
    var formatted = '';
    var reg = /(>)(<)(\/*)/g;
    xml = xml.replace(reg, '$1\r\n$2$3');
    var pad = 0;
    jQuery.each(xml.split('\r\n'), function(index, node) {
        var indent = 0;
        if (node.match( /.+<\/\w[^>]*>$/ )) {
            indent = 0;
        } else if (node.match( /^<\/\w/ )) {
            if (pad != 0) {
                pad -= 1;
            }
        } else if (node.match( /^<\w[^>]*[^\/]>.*$/ )) {
            indent = 1;
        } else {
            indent = 0;
        }
        var padding = '';
        for (var i = 0; i < pad; i++) {
            padding += ' ';
        }
        formatted += padding + node + '\r\n';
        pad += indent;
    });
    return formatted;
}
var xml_raw = '<foo><bar><baz>blahblah</baz><baz>tralala</baz></bar></foo>';
var xml_formatted = formatXml(xml_raw);
// Escape the markup so the browser shows it as text instead of parsing it
var xml_escaped = xml_formatted.replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/ /g, '&nbsp;').replace(/\n/g,'<br />');
var mydiv = document.createElement('div');
mydiv.innerHTML = xml_escaped;
document.body.appendChild(mydiv);
same here :) thanks, sente!
xml.replace -> should probably be xml.toString().replace(...)
Otherwise you get an error if you accidentally pass an object rather than a string.
@TotallyInformation - hmm, that's a good point!
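A minimal sketch of that guard, assuming you would rather fail loudly on null/undefined than format the literal text "undefined". The name ensureXmlString is made up for illustration and is not part of the gist. Note that a plain object still turns into "[object Object]", so this only helps for String objects and other values with a meaningful toString().

```javascript
// Hypothetical helper (not from the gist): coerce the input to a string
// before calling .replace() on it.
function ensureXmlString(xml) {
  if (typeof xml === 'string') {
    return xml;
  }
  if (xml === null || xml === undefined) {
    throw new TypeError('formatXml expects a string, got ' + xml);
  }
  // String() calls the value's own toString(), like xml.toString() would.
  return String(xml);
}
```

Inside formatXml you would then start with xml = ensureXmlString(xml); before the first replace.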
xml pretty print javascript <- google search rank 3 today.
Thanks a lot for sharing. Very helpful
xml pretty print javascript --- first position rank on google today. ;)
Thank you for a very useful piece of code. Here is my fork without the jQuery dependency (better for Node.js): https://gist.github.com/kurtsson/3f1c8efc0ccd549c9e31
Just a note: XML formatting fails if there is whitespace between the end and start tags of the input XML string. As a fix, replace var reg = /(>)(<)(\/*)/g; with var reg = /(>)\s*(<)(\/*)/g; in the script.
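To see what the \s* variant does, here is a small sketch of just the splitting step. The function name splitTags is an assumption for this example; the gist does the replace inline.

```javascript
// Hypothetical helper: put adjacent tags on separate lines, swallowing any
// whitespace (spaces, tabs, newlines) that sits between them.
function splitTags(xml) {
  return xml.replace(/(>)\s*(<)(\/*)/g, '$1\r\n$2$3');
}

console.log(JSON.stringify(splitTags('<a> <b>x</b>\n</a>')));
// prints "<a>\r\n<b>x</b>\r\n</a>"
```

Whitespace inside text content (between > and the text, or inside the text) is left alone, since the pattern only matches when > and < are separated by nothing but whitespace.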
I've just come across a small issue with CDATA sections. If these contain tags, the function puts them on new lines, introducing white space that can break the content of the CDATA section, where white space may already be deliberately encoded.
e.g. here the CDATA section is protecting carriage returns contained in the string:
<![CDATA[line 1
line 2
line 3
]]>
But after "pretty printing" it ends up as:
line 2 line 3 ]]>
which has changed the meaning of the text the CDATA section was enclosing.
Cool function in all other cases though! :-)
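One possible workaround, sketched under the assumption that CDATA sections should pass through byte for byte: swap each section for a placeholder before formatting and put it back afterwards. The names protectCdata and restore are hypothetical, and the NUL character is used as a marker only because it cannot occur in a well-formed XML document.

```javascript
// Hypothetical helper (not part of the gist): hide CDATA sections from the
// formatter, then restore them untouched.
function protectCdata(xml) {
  const sections = [];
  const text = xml.replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, (match) => {
    sections.push(match);
    return '\u0000CDATA' + (sections.length - 1) + '\u0000';
  });
  // A replacer function is used so "$" inside CDATA is not treated specially.
  const restore = (formatted) =>
    formatted.replace(/\u0000CDATA(\d+)\u0000/g, (_, i) => sections[Number(i)]);
  return { text, restore };
}

// Usage with the gist's splitting step:
const cdataInput = '<a><b><![CDATA[line 1\nline 2 <i>x</i><i>y</i>]]></b></a>';
const guarded = protectCdata(cdataInput);
const cdataSplit = guarded.text.replace(/(>)(<)(\/*)/g, '$1\r\n$2$3');
console.log(guarded.restore(cdataSplit));
```

In practice you would wrap the whole formatXml call the same way: formatXml(guarded.text), then restore() on the result.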
Thank you. Works perfectly!
node.match(/^<\w[^>]*[^\/]>.*$/) seems not to match single-letter elements.
Thank u so much .. It helps a lot
@Risord Indeed, that regular expression should be /^<\w([^>]*[^\/])?>.*$/
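A quick check of the two patterns side by side, as a sketch. The sample tags are made up for illustration. The original pattern needs at least two characters between < and >, which is why a one-letter element name slips through.

```javascript
// Opening-tag patterns: the gist's original and the corrected one from above.
const originalOpenTag = /^<\w[^>]*[^\/]>.*$/;
const fixedOpenTag = /^<\w([^>]*[^\/])?>.*$/;

const sampleTags = ['<a>', '<ab>', '<foo id="1">', '<br/>'];
for (const tag of sampleTags) {
  console.log(tag, originalOpenTag.test(tag), fixedOpenTag.test(tag));
}
// '<a>' is the only sample where the two patterns disagree:
// the original rejects it, the fixed pattern accepts it.
```

Self-closing tags such as <br/> are rejected by both, which is what the formatter wants, since they should not increase the indent.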
Thanks for the code.
Here's an adapted ES6 variant:
formatXml(xml) {
  const PADDING = ' '.repeat(2); // set desired indent size here
  const reg = /(>)(<)(\/*)/g;
  let pad = 0;
  xml = xml.replace(reg, '$1\r\n$2$3');
  return xml.split('\r\n').map((node, index) => {
    let indent = 0;
    if (node.match(/.+<\/\w[^>]*>$/)) {
      indent = 0;
    } else if (node.match(/^<\/\w/) && pad > 0) {
      pad -= 1;
    } else if (node.match(/^<\w[^>]*[^\/]>.*$/)) {
      indent = 1;
    } else {
      indent = 0;
    }
    pad += indent;
    return PADDING.repeat(pad - indent) + node;
  }).join('\r\n');
}
ES6 variant works for me. Thanks!
@grubersjoe
The linter fails with this message:
Unnecessary escape character: \/ no-useless-escape
so I used [^/] in the last node.match call.
Many Thanks,, works perfect!!
Did you try the script on badly formatted XML like
<feed xmlns="http://www.w3.org/2005/Atom"
need="urgent"> <title type="general" sub="sample">Example Feed</title>
<subtitle><![CDATA[minor subtitle <<< ]]></subtitle>
<author nn="1" order="2" case="now"><name>John Doe</name><email>johndoe@example.com</email>
</author> <entry><title>Atom-Powered Robots Run Amok</title>
<link href="http://example.org/2003/12/13/atom03" />
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2003-12-13T18:30:02Z</updated><summary>Some text.</summary>
</entry> </feed>
? It does not work for me.
@grubersjoe
Many thanks!
@ sente
Wow, so cool, thanks!
<simple-value><![CDATA[部门
Dept]]>
After formatting the XML, the result has many space characters before <![CDATA.
I tried modifying the regex:
var reg = /(>)(<)([^!])(\/*)/g;
thanks \o
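One caveat with the extra ([^!]) group is that it shifts the capture numbers, so '$1\r\n$2$3' no longer refers to the same pieces. A possible alternative, sketched here as an assumption rather than a tested fix for every document, is to use lookarounds, which do not capture. It requires a JavaScript engine with lookbehind support. The name splitOutsideCdata is hypothetical.

```javascript
// Do not break the line right before "<![CDATA[" or right after "]]>".
// Lookarounds leave the $1/$2/$3 numbering of the original regex intact.
const cdataAwareReg = /(?<!\]\])(>)(<(?!!\[CDATA\[))(\/*)/g;

function splitOutsideCdata(xml) {
  return xml.replace(cdataAwareReg, '$1\r\n$2$3');
}

console.log(JSON.stringify(splitOutsideCdata('<v><![CDATA[x]]></v><w>y</w>')));
// prints "<v><![CDATA[x]]></v>\r\n<w>y</w>"
```

This keeps the CDATA section on the same line as its enclosing tags, but tags that appear inside the CDATA content would still be split.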
Good work - although I must point out that it doesn't touch existing whitespace indentation, with hilarious consequences.
Thanks for the ES6 version, @grubersjoe. A small addition, though, for handling newlines and existing whitespace between tags (as pointed out by @brennanyoung):
formatXml(xml) {
  // Remove all the newlines and then remove all the spaces between tags
  xml = xml.replace(/(\r\n|\n|\r)/gm, " ").replace(/>\s+</g,'><');
  const PADDING = ' '.repeat(4); // set desired indent size here
  const reg = /(>)(<)(\/*)/g;
  let pad = 0;
  xml = xml.replace(reg, '$1\r\n$2$3');
  return xml.split('\r\n').map((node, index) => {
    let indent = 0;
    if (node.match(/.+<\/\w[^>]*>$/)) {
      indent = 0;
    } else if (node.match(/^<\/\w/) && pad > 0) {
      pad -= 1;
    } else if (node.match(/^<\w[^>]*[^\/]>.*$/)) {
      indent = 1;
    } else {
      indent = 0;
    }
    pad += indent;
    return PADDING.repeat(pad - indent) + node;
  }).join('\r\n');
}
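The normalization line on its own, pulled out as a sketch so its effect is easier to see. The name normalizeWhitespace is an assumption for this example. Note that the first replace touches every newline, including ones inside text content, so it would also flatten the CDATA case mentioned earlier in the thread.

```javascript
// Hypothetical helper: flatten the input so the formatter starts from a
// single line with no whitespace between tags.
function normalizeWhitespace(xml) {
  return xml
    .replace(/(\r\n|\n|\r)/g, ' ') // every line break becomes one space
    .replace(/>\s+</g, '><');      // drop whitespace-only gaps between tags
}

console.log(normalizeWhitespace('<a>\n  <b>x</b>\n</a>'));
// prints <a><b>x</b></a>
console.log(normalizeWhitespace('<a>line 1\nline 2</a>'));
// prints <a>line 1 line 2</a>
```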
I'm not sure if the latest versions of FF added more constraints, but I'm getting "too much recursion" from formatXml().
One of the regular expressions has a slight mismatch; it needs to be: node.match(/^<[\w^>]*[^\/]>.*$/). Merging everyone's updates, here's what I've been using:
/**
 * Prettifies the tab indentation of given XML string
 *
 * @param xml string - The XML to clean
 * @return string - Prettified XML string
 */
static prettify(xml: String) {
    let pad = 0;
    const padding = '\u0020'.repeat(4); // set desired indent size here
    xml = xml.replace(/(\r\n|\n|\r)/gm, '\u0020').replace(/>\s+</g,'><');
    xml = xml.replace(/(>)(<)(\/*)/g, '$1\r\n$2$3');
    return xml.split('\r\n').map((node, index) => { // XML elements now split into lines
        let indent = 0;
        if (node.match(/.+<\/\w[^>]*>$/)) {
            indent = 0;
        } else if (node.match(/^<\/\w/) && pad > 0) {
            pad -= 1;
        } else if (node.match(/^<[\w^>]*[^\/]>.*$/)) {
            indent = 1;
        } else {
            indent = 0;
        }
        pad += indent;
        return padding.repeat(pad - indent) + node;
    }).join('\r\n');
}
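A caveat on the opening-tag pattern used above: [\w^>] is a character class (word characters, "^" and ">"), so as far as I can tell it does not match opening tags that carry attributes, e.g. <foo a="1">, and their children stay unindented. Below is a standalone sketch that uses the /^<\w([^>]*[^\/])?>.*$/ pattern suggested earlier in the thread instead. The function name prettifyXml and the indentSize parameter are assumptions for this example; it is plain JavaScript with no jQuery.

```javascript
// A sketch merging the thread's suggestions; not the gist author's code.
function prettifyXml(xml, indentSize = 4) {
  const unit = ' '.repeat(indentSize);
  let pad = 0;
  const lines = String(xml)
    .replace(/(\r\n|\n|\r)/g, ' ')         // flatten existing line breaks
    .replace(/>\s+</g, '><')               // drop whitespace between tags
    .replace(/(>)(<)(\/*)/g, '$1\r\n$2$3') // one tag boundary per line
    .split('\r\n');
  return lines.map((node) => {
    let indent = 0;
    if (/.+<\/\w[^>]*>$/.test(node)) {
      indent = 0;                          // <tag>text</tag> on one line
    } else if (/^<\/\w/.test(node)) {
      if (pad > 0) pad -= 1;               // closing tag: step back out
    } else if (/^<\w([^>]*[^/])?>.*$/.test(node)) {
      indent = 1;                          // opening tag: indent what follows
    }
    const line = unit.repeat(pad) + node;
    pad += indent;
    return line;
  }).join('\r\n');
}

console.log(prettifyXml('<foo a="1"><b>x</b><c/></foo>', 2));
```

For that input the two children end up indented by two spaces under <foo a="1">, and </foo> returns to column zero. CDATA sections and comments are not treated specially here.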
jlyk: the check if (node.match(/.+<\/\w[^>]*>$/)) { is extremely slow on big rows.
Not for production use for sure ;) See DannyDainton/newman-reporter-htmlextra#428 (comment)
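The slowdown is presumably because the pattern is unanchored at the front, so on a long line that does not match, the engine retries .+ from every start position. A possible replacement for that one check using plain string operations is sketched below. The name endsWithClosingTag is hypothetical, and the function is only roughly equivalent to the regex: it should agree on well-formed lines but has not been checked against every malformed input.

```javascript
// Hypothetical stand-in for node.match(/.+<\/\w[^>]*>$/): true when the line
// has some content followed by a closing tag at the very end.
function endsWithClosingTag(node) {
  if (!node.endsWith('>')) {
    return false;
  }
  const start = node.lastIndexOf('</');
  if (start < 1) {
    return false; // no "</", or nothing before it (the ".+" part)
  }
  const name = node.slice(start + 2, -1); // text between "</" and final ">"
  return /^\w[^>]*$/.test(name);
}

console.log(endsWithClosingTag('<b>x</b>')); // true
console.log(endsWithClosingTag('</b>'));     // false
console.log(endsWithClosingTag('<b>'));      // false
```

Each step scans the line at most once, so the cost should grow linearly with the line length.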
Google Search: javascript xml pretty print
I have it on last position of first page :)