@9214
Last active April 28, 2024 16:16
The world's smallest web scraper.
Red [
Title: "A Parse-based port of Michael Gilliland's web scraper"
Author: @9214
Date: 16-Oct-2020
Link: https://youtu.be/HDMa4gcgEgI
]
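; Fetch the page (through the local cache) as binary; the address comes from the script argument.
page: read-thru/binary to url! system/script/args
; A link is any url! token that starts with https://; the matched bytes are copied into `match`.
link: [copy match [ahead https:// url!]]
; Collect every link as a url! value; otherwise advance by one byte and try again.
rule: [collect any [link keep (to url! match) | skip]]
; With `collect` at the top level, `parse` returns the block of collected links.
parse page rule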

9214 commented Oct 16, 2020

Example:

>> probe new-line/all do/args %scrape.red https://duckduckgo.com on
[
    https://duckduckgo.com/ 
    https://duckduckgo.com/ 
    https://duckduckgo.com/assets/logo_social-media.png
]
