@rohan-molloy
Created December 22, 2019 10:12

Stripping HTML with Sed

Escape HTML characters

sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g; s/"/\&quot;/g; s/'"'"'/\&#39;/g'

Unescape HTML characters

sed 's/&lt;/</g; s/&gt;/>/g; s/&quot;/"/g; s/&#39;/'"'"'/g; s/&amp;/\&/g'
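
A quick round-trip check, as a minimal sketch. The function names `html_escape` and `html_unescape` are hypothetical wrappers and not part of the original commands. The unescape wrapper assumes two details: a literal `&` in a sed replacement must be written `\&` (a bare `&` stands for the matched text), and `&amp;` is unescaped last so that input such as `&amp;lt;` decodes to `&lt;` rather than `<`.

```shell
#!/bin/sh
# Hypothetical wrapper names, chosen for illustration only.

html_escape() {
  # "&" is escaped first so the entities produced afterwards are not re-escaped.
  sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g; s/"/\&quot;/g; s/'"'"'/\&#39;/g'
}

html_unescape() {
  # "&amp;" is unescaped last; "\&" in the replacement is a literal ampersand.
  sed 's/&lt;/</g; s/&gt;/>/g; s/&quot;/"/g; s/&#39;/'"'"'/g; s/&amp;/\&/g'
}

input='<a href="x">Tom & Jerry'"'"'s</a>'
escaped=$(printf '%s\n' "$input" | html_escape)
restored=$(printf '%s\n' "$escaped" | html_unescape)

printf '%s\n' "$escaped"    # the escaped form
printf '%s\n' "$restored"   # should match the original input
```

If `restored` differs from `input`, the usual suspects are the ordering of the `&amp;` rule or an unescaped `&` in a replacement.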

Strip HTML tags

sed 's/<[^>]*>//g'
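
A short usage sketch for the tag-stripping command. Because sed works line by line, a tag that spans two lines is not removed by the one-liner. The second function below is one possible workaround that first reads the whole input into the pattern space; it assumes a sed (such as GNU or BSD sed) in which `[^>]` also matches an embedded newline. Both function names are hypothetical. Neither version parses HTML, so comments, scripts, and a `>` inside an attribute value can still confuse it.

```shell
#!/bin/sh
# Hypothetical wrapper names, chosen for illustration only.

strip_tags() {
  # Removes every "<...>" run that opens and closes on the same line.
  sed 's/<[^>]*>//g'
}

strip_tags_multiline() {
  # Append every following line to the pattern space, then strip tags once,
  # so a tag broken across lines is matched as a single unit.
  sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/<[^>]*>//g'
}

printf '%s\n' '<p>Hello, <b>world</b>!</p>' | strip_tags
# prints: Hello, world!

printf '<a\nhref="x">link</a>\n' | strip_tags_multiline
# prints: link
```

The multi-line variant holds the entire input in memory, which is fine for small documents but worth keeping in mind for large files.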