@zkkmin
Created Mar 28, 2019
curl -s https://www.docdoc.com.sg/medicaltourism_sitemap_profile_1.xml.gz |
  zcat |
  xq -r '.urlset.url | map(.loc) | .[]' |
  sed -e 's/\.com/\.com\.sg/' |
  xargs curl \
    --user-agent "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
    -s -I -L \
    --write-out "============== END ==============\n" |
  tee medicaltourism_sitemap_profile_1_crawl.txt
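
The extraction and host-rewrite stages of the pipeline above can be tried offline before pointing curl at the live site. This is only a minimal sketch: the function names `extract_locs` and `rewrite_host` are hypothetical, the sample sitemap uses made-up `example.com` URLs, and `extract_locs` swaps in plain grep/sed for `xq`, assuming each `<loc>` sits on one line with no CDATA wrapper.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the original command.

# Print the text of every <loc> element found in sitemap XML on stdin.
# Assumption: grep/sed stand in for xq here, which only works for simple,
# un-nested <loc>...</loc> elements.
extract_locs() {
    grep -o '<loc>[^<]*</loc>' | sed -e 's/<loc>//' -e 's/<\/loc>//'
}

# Rewrite the first ".com" in each line to ".com.sg", as the sed stage does.
rewrite_host() {
    sed -e 's/\.com/.com.sg/'
}

# Made-up sample input; the real sitemap content is not shown in the source.
sample='<urlset><url><loc>https://www.example.com/profile/a</loc></url><url><loc>https://www.example.com/profile/b</loc></url></urlset>'

printf '%s\n' "$sample" | extract_locs | rewrite_host
```

Note that the substitution is not anchored to the host name, so a URL that already ends in `.com.sg` would come out as `.com.sg.sg`; it may be worth checking a few rewritten URLs by eye before running the full crawl.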