@sergiolopes
Created June 6, 2016 18:58
Extracts the text of NerdNews news articles
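#!/bin/bash
# Scrape the first two NerdNews listing pages for article links,
# then download each article and print its title and body text.
for PAGINA in {1..2}; do
  # -s: silent, -L: follow redirects; sed prints only the captured article URL
  curl -s -L "https://jovemnerd.com.br/categoria/nerdnews/page/$PAGINA/" |
    sed -n 's/.*href="\(https:\/\/jovemnerd\.com\.br\/nerdnews\/[^"]*\).*/\1/p'
done |
while read -r link; do
  # Quote the URL so unusual characters are not split or globbed by the shell
  curl -s "$link" > /tmp/nerdtech.html
  # Article title
  xmllint --html --xpath '//div[@class="title"]/h2/text()' /tmp/nerdtech.html 2> /dev/null
  echo
  # Article body: every text node inside the paragraphs of the entry
  xmllint --html --xpath '//*[@class="user-entry"]/p//text()' /tmp/nerdtech.html 2> /dev/null
  echo
  echo
done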
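
The link-extraction step can be tried offline before hitting the site. The sketch below wraps the same sed expression in a helper and feeds it a few lines of sample markup. The helper name `extract_links` and the sample HTML are hypothetical, made up for illustration and not taken from the real pages.

```shell
#!/bin/bash
# extract_links (hypothetical helper): reads HTML on stdin and prints the
# NerdNews article URL found on each matching line.
extract_links() {
  sed -n 's/.*href="\(https:\/\/jovemnerd\.com\.br\/nerdnews\/[^"]*\).*/\1/p'
}

# Made-up sample markup; the real listing pages may be structured differently.
sample='<a href="https://jovemnerd.com.br/nerdnews/exemplo-um/">Um</a>
<a href="https://jovemnerd.com.br/sobre/">Sobre</a>
<a href="https://jovemnerd.com.br/nerdnews/exemplo-dois/">Dois</a>'

# Only the two /nerdnews/ links should be printed; the /sobre/ link is skipped.
printf '%s\n' "$sample" | extract_links
```

Because the leading `.*` is greedy, a line holding several matching links yields only the last one, so this approach assumes each article link sits on its own line of the HTML.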