@dusekdan
Last active November 22, 2021 12:52
Extracts URLs from HTML attributes whose value type is URI.
-- When evaluated as an XPath expression, selects all URLs in an HTML document
-- from the attributes designated to hold a URL
-- One-line version
"//a/@href | //applet/@codebase | //area/@href | //base/@href | //blockquote/@cite | //body/@background | //del/@cite | //form/@action | //frame/@src | //frame/@longdesc | //head/@profile | //iframe/@longdesc | //iframe/@src | //img/@longdesc | //img/@usemap | //input/@src | //input/@usemap | //ins/@cite | //object/@classid | //object/@codebase | //object/@data | //object/@usemap | //q/@cite | //img/@src | //link/@href | //source/@src | //embed/@src | //script/@src | //audio/@src | //button/@formaction | //command/@icon | //html/@manifest | //input/@formaction | //video/@poster | //video/@src"
-- Multi-line version (readable)
"//a/@href
| //applet/@codebase
| //area/@href
| //base/@href
| //blockquote/@cite
| //body/@background
| //del/@cite
| //form/@action
| //frame/@src
| //frame/@longdesc
| //head/@profile
| //iframe/@longdesc
| //iframe/@src
| //img/@longdesc
| //img/@usemap
| //input/@src
| //input/@usemap
| //ins/@cite
| //object/@classid
| //object/@codebase
| //object/@data
| //object/@usemap
| //q/@cite
| //img/@src
| //link/@href
| //source/@src
| //embed/@src
| //script/@src
| //audio/@src
| //button/@formaction
| //command/@icon
| //html/@manifest
| //input/@formaction
| //video/@poster
| //video/@src"
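
A minimal sketch of what the expression selects, using only the Python standard library. It does not run a real XPath engine: it assumes every step has the plain `//tag/@attr` form used above, splits the union on `|`, and collects matching attribute values with `html.parser`. The names `XPATH_EXPR`, `parse_pairs`, `UrlCollector`, and `extract_urls` are hypothetical, and `XPATH_EXPR` is a shortened copy of the expression for brevity.

```python
from html.parser import HTMLParser

# Shortened copy of the expression above (assumption: same '//tag/@attr' shape).
XPATH_EXPR = "//a/@href | //img/@src | //form/@action | //script/@src | //video/@poster"


def parse_pairs(expr):
    """Turn '//tag/@attr | //tag/@attr ...' into a set of (tag, attr) pairs."""
    pairs = set()
    for step in expr.split("|"):
        # '//a/@href' -> 'a/@href' -> ('a', 'href')
        tag, attr = step.strip().lstrip("/").split("/@")
        pairs.add((tag.lower(), attr.lower()))
    return pairs


class UrlCollector(HTMLParser):
    """Collects values of the requested (tag, attr) pairs in document order."""

    def __init__(self, pairs):
        super().__init__()
        self.pairs = pairs
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        for name, value in attrs:
            if value is not None and (tag, name) in self.pairs:
                self.urls.append(value)


def extract_urls(html, expr=XPATH_EXPR):
    """Return the attribute values that the union expression would select."""
    collector = UrlCollector(parse_pairs(expr))
    collector.feed(html)
    collector.close()
    return collector.urls
```

With an actual XPath engine the expression can be passed through unchanged; for example, lxml's `xpath()` method on a parsed document returns the attribute values as strings.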