DarkSky darkskygit
darkskygit / cleanHtml.js
Last active September 5, 2025 14:59
Clean up HTML for ReaderLM
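The `allowedAttrs` list in the function below contains the pattern `aria-*`, which a plain `Array.prototype.includes` check would only match against the literal string `aria-*`. As a minimal sketch of one way such a pattern could be matched by prefix, here is a standalone helper. The name `isAllowedAttr` and its signature are hypothetical and not part of the gist, and the gist itself may handle the wildcard differently.

```javascript
// Hypothetical helper (not from the original gist): returns true when an
// attribute name matches one of the allowed patterns. A pattern ending in
// '*' is treated as a prefix match; any other pattern must match exactly.
function isAllowedAttr(name, patterns) {
  const attr = name.toLowerCase()
  return patterns.some((pattern) =>
    pattern.endsWith('*')
      ? attr.startsWith(pattern.slice(0, -1))
      : attr === pattern
  )
}

// Same list as in cleanHtml, repeated here so the sketch is self-contained.
const allowedAttrs = ['href', 'src', 'alt', 'title', 'aria-*']

console.log(isAllowedAttr('aria-label', allowedAttrs)) // true
console.log(isAllowedAttr('onclick', allowedAttrs)) // false
```

Keeping the check in one small function means the wildcard rule lives in a single place, so adding a pattern such as `data-*` would only require changing the list.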
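function cleanHtml(html) {
  // Parse the given HTML string, falling back to the current page's markup.
  const parser = new DOMParser()
  const doc = parser.parseFromString(html || document.querySelector('html').innerHTML, 'text/html')
  // Attributes to keep; 'aria-*' is a wildcard pattern, not a literal attribute name.
  const allowedAttrs = ['href', 'src', 'alt', 'title', 'aria-*']
  // Tags considered content-bearing.
  const allowedTags = ['div', 'span', 'p', 'a', 'img', 'ul', 'ol', 'li', 'strong', 'em', 'u', 'b', 'i']
  // Tags that carry no readable content.
  const disallowedTags = ['script', 'style', 'meta', 'link', 'iframe', 'svg', 'noscript']
  // Visit every element in the parsed document.
  doc.querySelectorAll('*').forEach((elm) => {
    const tagName = elm.tagName.toLowerCase()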