@smallnewer
Last active December 23, 2015 04:08
parseHTML: parses an HTML string into DOM nodes. See the comment below for details.
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Document</title>
</head>
<body>
<div id="out">
</div>
<script type="text/mvvm" id="test">
<td>
<input type="text" value="123" />
<link rel="stylesheet" href="index.css">
<img src="a.jpg" alt="">
<hr>
<hr></hr>
<span id='aaa'>123</span>
</td>
</script>
<textarea id="test2" cols="30" rows="10"></textarea>
<script>
var tagHooks = {
area: [1, "<map>"],
param: [1, "<object>"],
col: [2, "<table><tbody></tbody><colgroup>", "</table>"],
legend: [1, "<fieldset>"],
option: [1, "<select multiple='multiple'>"],
thead: [1, "<table>", "</table>"],
tr: [2, "<table><tbody>"],
td: [3, "<table><tbody><tr>"],
_default : [0,""]
};
// MIME types of scripts that may be executed after parsing
var scriptTypes = {
"text/javascript" : 1,
"text/ecmascript" : 1,
"application/ecmascript" : 1,
"application/javascript" : 1,
"text/vbscript" : 1
}
var rtagName = /<([\w:]+)/;
function parseHTML (html) {
var domParser = document.createElement("div");
var tag = (rtagName.exec(html) || ["",""])[1].toLowerCase();
var wrap = tagHooks[tag] || tagHooks._default;
// To satisfy the nesting rules, add the outer wrapper elements manually.
// Background: https://gist.github.com/smallnewer/6576720
domParser.innerHTML = wrap[1] + html + (wrap[2] || "");
// script nodes created through innerHTML neither request their src nor execute their text
var els = domParser.getElementsByTagName("script");
if (els.length) {
var script = document.createElement("script"),
neo;
for (var i = 0, el; el = els[i++]; ) {
neo = script.cloneNode(false); // Firefox requires the argument
for (var j = 0, attr; attr = el.attributes[j++]; ) {
if (attr.specified) {
neo[attr.name] = attr.value; // copy the attribute onto the new node
}
}
neo.text = el.text; // must be set explicitly; it cannot be reached by iterating attributes
el.parentNode.replaceChild(neo, el); // swap the inert node for the fresh one
}
}
// Step past the wrapper tags we added to satisfy the nesting rules
for (i = wrap[0]; i--; domParser = domParser.lastChild) {
}
var firstChild;
var fragment = document.createDocumentFragment();
while(firstChild = domParser.firstChild){
fragment.appendChild(firstChild);
}
return fragment;
}
/**
* test example
*/
var test = document.getElementById('test');
var out = document.getElementById('out');
var html = test.innerHTML;
html = parseHTML(html);
out.appendChild(html); // the td is output correctly
</script>
</body>
</html>
@smallnewer
Author

Parses an HTML string into DOM nodes.

The approach relies on innerHTML.

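Inserting tags through innerHTML has two obvious bugs:

  1. innerHTML cannot correctly insert tags that require a specific parent, such as td and legend. See https://gist.github.com/smallnewer/6576720.
  2. In IE, NoScope elements such as style sheets cannot be inserted correctly. This example does not address that yet.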
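
The workaround for the first bug can be sketched without a DOM. This is a minimal illustration, not the gist's or avalon's actual code: `wrapHooks`, `wrapForInnerHTML` and `unwrap` are hypothetical names, and the table is a reduced copy of `tagHooks` above. It shows how the first tag name selects a wrapper prefix, and how the recorded depth is later used to step past the wrappers.

```javascript
// Hypothetical, reduced copy of the gist's tagHooks table:
// [number of wrapper levels, opening markup, optional closing markup]
var wrapHooks = {
  thead: [1, "<table>", "</table>"],
  tr: [2, "<table><tbody>"],
  td: [3, "<table><tbody><tr>"],
  _default: [0, ""]
};
var rtagName = /<([\w:]+)/;

// Build the string that would be assigned to innerHTML, plus the number
// of wrapper levels to skip afterwards to reach the parsed nodes.
function wrapForInnerHTML(html) {
  var tag = (rtagName.exec(html) || ["", ""])[1].toLowerCase();
  // An own-property check keeps inherited names such as "constructor"
  // from being mistaken for a hook entry.
  var wrap = Object.prototype.hasOwnProperty.call(wrapHooks, tag)
    ? wrapHooks[tag]
    : wrapHooks._default;
  return { depth: wrap[0], markup: wrap[1] + html + (wrap[2] || "") };
}

// Descend `depth` levels through lastChild; accepts any node-like object.
function unwrap(node, depth) {
  while (depth-- > 0) {
    node = node.lastChild;
  }
  return node;
}
```

In a browser, the `markup` value would be assigned to a scratch div's innerHTML, and `unwrap(div, depth)` would return the element whose children are the nodes originally asked for.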

The core approach is taken from the source of https://github.com/RubyLouvre/avalon/blob/master/avalon.js and is not complete.
