@s3rgeym
Last active May 18, 2024 22:39
#!/bin/bash
UA='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'
# Collect the URL of every individual stock page (".../stocks/<ticker>/") from the sitemap.
# mapfile reads one URL per line, which avoids word splitting and globbing on the curl output.
mapfile -t links < <(curl -A "$UA" -s 'https://stockanalysis.com/sitemap.xml' | grep -oP '(?<=<loc>)[^<>]+' | grep -P '/stocks/[^/]+/$')
echo "total stock links: ${#links[@]}" >&2
for link in "${links[@]}"; do
  sleep .3 # can be set lower
  endpoint="${link}company/"
  echo "parse $endpoint" >&2
  # The company website is taken from the external link(s) that carry exactly these attributes.
  website=$(curl -A "$UA" -s "$endpoint" | grep -oP '[^"]+(?=" target="_blank" rel="noopener noreferrer nofollow">)')
  if [ -n "$website" ]; then
    echo "Found: $website" >&2
    echo "$website"
  fi
done