@hallvors
Forked from lmandel/getAlexaCountryData.sh
Last active August 29, 2015 13:58
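#!/bin/bash
# Usage: getAlexaCountryData.sh COUNTRY_CODE   (for example: HU)
# Downloads the Alexa top-sites pages for one country and saves the site names.
DATE=$(date +%Y%m%d)
ALEXA_URL="http://www.alexa.com/topsites/countries%3B"
COUNTRY_CODE=$1
# Stop early if no country code was given; the URL would be malformed without it.
if [ -z "$COUNTRY_CODE" ]; then
  echo "Usage: $0 COUNTRY_CODE" >&2
  exit 1
fi
OUTPUT_FILE="ALEXA_${COUNTRY_CODE}-${DATE}.txt"
echo "Downloading Alexa top site data for $COUNTRY_CODE"
touch "$OUTPUT_FILE"
# Result pages are numbered 0 through 5; the for loop itself advances i.
for i in {0..5}
do
  # Keep only lines with a /siteinfo/ link, then strip the markup around the link text.
  curl -sS "${ALEXA_URL}${i}/${COUNTRY_CODE}" | grep "<a href=\"/siteinfo/.*\"" | sed -E 's/.*>(.*)<.*/\1/' >> "$OUTPUT_FILE"
# curl "http://www.alexa.com/topsites/countries%3B0/HU" | grep "<a href=\"/siteinfo/.*\"" | sed 's/>(.*)</\"\0\"/' >> $OUTPUT_FILE
done
# Report how many site names were collected.
wc -l "$OUTPUT_FILE"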
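
The grep and sed stage of the script can be tried without any network access. The sketch below wraps that stage in a hypothetical function, `extract_sites`, and feeds it a few made-up lines of markup. Both the function name and the sample HTML are assumptions for illustration, not taken from Alexa's actual pages.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): reads HTML on stdin
# and prints the link text of every line that contains a /siteinfo/ link.
extract_sites() {
  grep '<a href="/siteinfo/.*"' | sed -E 's/.*>(.*)<.*/\1/'
}

# Made-up sample input: two /siteinfo/ links, each alone on its line,
# plus one unrelated link that grep should filter out.
sample='<a href="/siteinfo/example.com">example.com</a>
<a href="/about">About</a>
  <a href="/siteinfo/example.org">example.org</a>'

printf '%s\n' "$sample" | extract_sites
```

The sed pattern is greedy, so it only isolates the link text when the link is the last tag on its line. If more markup follows the closing `</a>` on the same line, the capture group picks up the wrong text, often nothing at all.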