DOMParser HTML extension - Now a polyfill since HTML parsing was added to the DOMParser specification
/*
* DOMParser HTML extension
* 2012-02-02
*
* By Eli Grey, http://eligrey.com
* Public domain.
* NO WARRANTY EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK.
*/
/*global document, DOMParser*/
(function(DOMParser) {
    "use strict";
    var
          DOMParser_proto = DOMParser.prototype
        , real_parseFromString = DOMParser_proto.parseFromString
    ;
    // Firefox/Opera/IE throw errors on unsupported types
    try {
        // WebKit returns null on unsupported types
        if ((new DOMParser).parseFromString("", "text/html")) {
            // text/html parsing is natively supported
            return;
        }
    } catch (ex) {}
    DOMParser_proto.parseFromString = function(markup, type) {
        if (/^\s*text\/html\s*(?:;|$)/i.test(type)) {
            var
                  doc = document.implementation.createHTMLDocument("")
                , doc_elt = doc.documentElement
                , first_elt
            ;
            doc_elt.innerHTML = markup;
            first_elt = doc_elt.firstElementChild;
            if ( // are we dealing with an entire document or a fragment?
                   doc_elt.childElementCount === 1
                && first_elt.localName.toLowerCase() === "html"
            ) {
                doc.replaceChild(first_elt, doc_elt);
            }
            return doc;
        } else {
            return real_parseFromString.apply(this, arguments);
        }
    };
}(DOMParser));
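
The polyfill only takes its HTML path when the `type` argument matches a `text/html` pattern, and forwards everything else to the original `parseFromString`. As a rough illustration of which type strings that pattern accepts, here is a minimal sketch; the helper name `isHTMLType` is hypothetical and not part of the gist.

```javascript
// Same pattern the polyfill uses to decide whether to take the HTML path.
var HTML_TYPE = /^\s*text\/html\s*(?:;|$)/i;

// Hypothetical helper name, introduced only for this illustration.
function isHTMLType(type) {
    return HTML_TYPE.test(type);
}

console.log(isHTMLType("text/html"));                  // true
console.log(isHTMLType(" Text/HTML ; charset=utf-8")); // true
console.log(isHTMLType("text/htmlx"));                 // false
console.log(isHTMLType("application/xml"));            // false
```

Leading whitespace, letter case, and trailing parameters such as a charset are tolerated; any other type falls through to the native implementation.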

hsivonen Nov 1, 2011

This doesn't work correctly if the markup contains external scripts.

eligrey Nov 1, 2011

I cannot reproduce your problem. I used (new DOMParser).parseFromString("<script src='http://foo/bar.js'></script>", "text/html").querySelector("script").src === "http://foo/bar.js". Do you mean that the script isn't executed? DOMParser is for parsing HTML, not executing it. Create an iframe and manipulate its content document after appending into the current document if you wish to create an active document.

marijn Feb 3, 2012

@eligrey, how would this stack up to the html parser by @jeresig?

eligrey Feb 3, 2012

It will always be faster than @jeresig's parser as it uses the browser's native HTML5 parser.

marijn Feb 4, 2012

eligrey Feb 4, 2012

Every browser that supports document.implementation.createHTMLDocument should work. I think IE <8 might not support that. A workaround for IE <8 could be to use an iframe, but that creates an active document context, which is dangerous and should only be used for parsing trusted HTML.

In short, all current browsers support it.
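
Since the polyfill depends on `document.implementation.createHTMLDocument`, one way to check for it before patching is sketched below. The helper name `canCreateHTMLDocument` is hypothetical, and the function takes the document as a parameter so it can be exercised outside a browser.

```javascript
// Hypothetical helper: reports whether a document object exposes
// document.implementation.createHTMLDocument, which the polyfill relies on.
function canCreateHTMLDocument(doc) {
    return !!(
        doc
        && doc.implementation
        && typeof doc.implementation.createHTMLDocument === "function"
    );
}

// In a browser you would pass the global `document`:
//   if (!canCreateHTMLDocument(document)) { /* skip the polyfill */ }
```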

RobertXGreen Aug 29, 2012

This doesn't work in IE 9, it fails at:

 `doc_elt.innerHTML = markup;` 

with the message "Error: Invalid target element for this operation."

RobertXGreen Aug 29, 2012

Oh, forgot to mention that according to MSDN, the innerHTML property for the following elements is read-only: col, colGroup, frameSet, html, head, style, table, tBody, tFoot, tHead, title, and tr.

See http://msdn.microsoft.com/en-us/library/ie/ms533897%28v=vs.85%29.aspx

eligrey Sep 4, 2012

Thanks for the heads-up, @RobertXGreen; fixed.

igstan Sep 5, 2012

What's the reason you pass DOMParser as an argument to the anonymous function instead of just accessing it where you need it? Micro-optimization?

karger Oct 3, 2012

doesn't work in IE9 because innerHTML is a read-only property (line 36 fails).

karger Oct 5, 2012

Need to clarify my previous comment. You may be able to set doc.body.innerHTML, but that doesn't work if passed-in markup is an entire document ('....'). You might hope to fix this by setting innerHTML on doc.documentElement instead of doc.body, but IE (at least 9) doesn't let you do that.
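
To illustrate the fragment-versus-document distinction described here, the following is a minimal sketch that picks an insertion target from the markup itself. Both helper names are hypothetical, the detection is only a heuristic (it ignores leading comments, for example), and it does not solve the IE 9 limitation, where assigning `innerHTML` on the document element still fails.

```javascript
// Hypothetical helper: guesses whether markup is a whole document
// (starts with a doctype or an <html> tag) rather than a fragment.
function looksLikeFullDocument(markup) {
    return /^\s*(?:<!doctype\b|<html[\s>])/i.test(markup);
}

// Picks the element whose innerHTML should receive the markup.
// `doc` is assumed to come from document.implementation.createHTMLDocument("").
function pickTarget(doc, markup) {
    return looksLikeFullDocument(markup) ? doc.documentElement : doc.body;
}
```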

kethinov Feb 12, 2013

This won't work correctly on document strings that contain a full document with a doctype, a head tag, a title tag, etc. Here's a gist based loosely on the suggestion from @karger to try out doc.documentElement instead: https://gist.github.com/kethinov/4760460

I also made another gist that takes a more targeted approach to just making sure the title tag's content makes it into the new document irrespective of whatever else may be in the head. It's pretty hacky and I don't see how it would be all that useful to anyone, but here it is just in case anyone wants to look it over: https://gist.github.com/kethinov/4760431

@eligrey, can you merge my changes from my first gist (https://gist.github.com/kethinov/4760460) into your version?

Or if you don't think my changes are a good idea, let me know. Comments/feedback are totally welcome. Anywho, thanks for this polyfill. I wish more browsers had full support for DOMParser.

eligrey Apr 18, 2013

@kethinov I merged in your changes.

joseeight Oct 24, 2013

This is nice, but attributes on the documentElement will not be available in the DOMParser since this uses 'doc.documentElement.innerHTML' and the documentElement is read only in a DOM Implementation. So the attributes would need to be added manually. It's an edge case, but just adding a note for anyone that might run into that.
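
One possible way to add those attributes manually is sketched below: read them from the `<html>` opening tag in the markup string and copy them onto the new document element. Both helper names are hypothetical, and the regex-based parsing is only an approximation (it breaks, for instance, if an attribute value contains `>`).

```javascript
// Hypothetical helper: pulls attribute name/value pairs out of the first
// <html ...> opening tag in a markup string. Regex-based, so it is only a
// rough approximation of real HTML attribute parsing.
function extractHtmlAttributes(markup) {
    var attrs = {};
    var tag = /<html\b([^>]*)>/i.exec(markup);
    if (!tag) {
        return attrs;
    }
    var attr_re = /([^\s"'=<>\/]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'=<>`]+)))?/g;
    var m;
    while ((m = attr_re.exec(tag[1])) !== null) {
        // Double-quoted, single-quoted, unquoted, or valueless attribute.
        var value = m[2] !== undefined ? m[2]
                  : m[3] !== undefined ? m[3]
                  : m[4] !== undefined ? m[4]
                  : "";
        attrs[m[1].toLowerCase()] = value;
    }
    return attrs;
}

// Copies the extracted attributes onto an element such as doc.documentElement.
function applyAttributes(element, attrs) {
    Object.keys(attrs).forEach(function (name) {
        element.setAttribute(name, attrs[name]);
    });
}
```

In a browser this could be called as `applyAttributes(doc.documentElement, extractHtmlAttributes(markup))` after the markup has been parsed.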

jslegers Dec 9, 2013

Since this polyfill assumes DOMParser is defined, you should wrap your code as follows:

if (window.DOMParser !== undefined){
    [   your code  ]
}

That way, your polyfill will just be ignored in browsers that don't support DOMParser (which happens to include IE8) instead of generating an error.

DrewML Mar 14, 2014

@jslegers: The code excerpt below is solving for that. You can't just do a check against the window for "DOMParser" because Safari supports DOMParser for XML, just not HTML. This can't be determined any other way besides using a try/catch.

try {
    // WebKit returns null on unsupported types
    if ((new DOMParser).parseFromString("", "text/html")) {
        // text/html parsing is natively supported
        return;
    }
} catch (ex) {}
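
The two concerns raised in this part of the thread (whether `DOMParser` exists at all, and whether it handles `text/html`) can be told apart in one place. The following is a minimal sketch under that assumption; the helper name `detectHTMLParsing` and its return strings are hypothetical, and the constructor is passed in as a parameter so the logic can be tried with stand-ins.

```javascript
// Hypothetical helper. Returns:
//   "missing"     - no DOMParser constructor at all
//   "native"      - text/html parsing already works
//   "needs-patch" - DOMParser exists but rejects or ignores text/html
function detectHTMLParsing(DOMParserCtor) {
    if (DOMParserCtor === undefined || DOMParserCtor === null) {
        return "missing";
    }
    try {
        // Some engines return null for unsupported types...
        if (new DOMParserCtor().parseFromString("", "text/html")) {
            return "native";
        }
    } catch (ex) {
        // ...others throw instead; both mean the patch is needed.
    }
    return "needs-patch";
}

// In a browser, where referencing an undeclared DOMParser would throw:
//   detectHTMLParsing(typeof DOMParser === "undefined" ? undefined : DOMParser);
```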

thynctank Jul 24, 2015

@DrewML not trying to troll, but jslegers isn't talking about that at all, but the presence of DOMParser in the first place.

Only bothering to leave this as I had to read your comment and his a couple times before realizing what was going on.

(never even had to deal with DOMParser before this ridiculous bug that should be dealt with on the server side but isn't... but that's just me whinging)

ryan-allen Oct 29, 2015

Awesome, works for me on iOS pulling whole pages and parsing them with XMLHttpRequest :)

alirezahosseini1368 Jan 20, 2016

I added this code to my script files and it works fine, but it runs several times. How can I prevent this? Maybe with something similar to the code below:

DOMParserFlag = false;
if (DOMParserFlag == false) {

function DOMParser() {
    "use strict";

..........

}
DOMParserFlag = true;

}
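
One common way to make a patch run at most once is to leave a marker on the object being patched and check it first. The sketch below assumes that approach; the helper name `patchOnce` and the marker property name are hypothetical. Separately, the feature test at the top of the gist would likely already short-circuit a second run, because the patched `parseFromString` returns a document for `text/html`.

```javascript
// Hypothetical helper: applies `patch` to `proto` at most once by leaving a
// marker property behind. The marker name is an assumption for this example.
function patchOnce(proto, patch) {
    var MARKER = "__html_parsing_patched__";
    if (proto[MARKER]) {
        return false; // already patched; do nothing
    }
    patch(proto);
    proto[MARKER] = true;
    return true;
}

// In a browser:
//   patchOnce(DOMParser.prototype, function (proto) { /* replace parseFromString */ });
```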

kube Nov 3, 2017

Setting .innerHTML will load image resources (even if the element is not appended to the DOM), which is not the case with DOMParser.

gustavom Nov 8, 2017

I removed this

if (markup.toLowerCase().indexOf('<!doctype') > -1) {
    console.log(markup);
    console.log('inserindo o elemento doc \n\n');
    //doc.documentElement.innerHTML = markup;
    doc.body.innerHTML = markup;
} else {
    doc.body.innerHTML = markup;
}

and changed it to this

doc.body.innerHTML = markup;

and I got the result I needed.
