@JaimeObregon
Created October 12, 2010 10:14
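#!/bin/bash
# 1. Download the member list: 12 pages of 50 entries (offsets 0..550).
for i in $(seq 0 50 550); do
  wget "http://comerciocantabria.com/index.php?option=com_comprofiler&task=usersList&listid=4&Itemid=174&limitstart=$i" -O "comercios.$((i / 50 + 1)).txt"
done

# 2. Pull the field spans out of the cbUserTable table and keep the lines with shop URLs.
# Note: [^comercio] is a bracket expression. It rejects any URL whose first character
# after "http://" is one of c, o, m, e, r, i, not only URLs starting with "comercio".
for i in comercios*txt; do
  grep -A1000 "<table id=\"cbUserTable\"" "$i" | grep -B1000 "</table" | grep span \
    | sed 's/\t*<div class=\"cbUserListFieldLine\"><span class=\"cbListFieldCont cbUserListFC_//' \
    | sed 's/<\/span><\/div>//' | cut -d ">" -f 2,3,4
done | grep -i "http://[^comercio]" | sort | uniq > urls.txt

# 3. Fetch each shop's full VirtueMart product listing; the output file is the URL minus "http://".
for i in $(cat urls.txt); do
  wget "$i/index.php?option=com_virtuemart&Itemid=60&category_id=0&page=shop.browse&limitstart=0&limit=10000" -q -O "$(echo "$i" | sed 's|http://||')"
done

# 4. For each downloaded page (files named www*), take the fourth double-quote-delimited
# field of the h3 lines in the product list (presumably the product title) and sort it.
for i in www*; do
  grep -A10000 "id=\"product_list\"" "$i" | grep -B10000 "r class=\"clr" | grep -i h3 | cut -d "\"" -f 4 | grep -v "h3" | sort > "productos.$i"
done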
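
The `[^comercio]` in the URL filter above is a bracket expression, so it drops every URL whose first character after `http://` is c, o, m, e, r or i. If the intent was only to skip links pointing back at the directory itself, a stricter filter might look like the sketch below. This is an assumption about the intent, not the author's method; the function name `filter_external_urls` and the sample URLs are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: keep lines that contain
# an http:// URL, then drop those whose host begins with "comercio".
filter_external_urls() {
  grep -i 'http://' | grep -vi 'http://comercio'
}

# Sample input, made up for illustration.
printf '%s\n' \
  'http://www.example.com' \
  'http://comerciocantabria.com/foo' \
  'http://example.org' \
  'no url here' | filter_external_urls
# prints:
# http://www.example.com
# http://example.org
```

With the original pattern, `http://example.org` would also be discarded because its host starts with "e". The later `for i in www*` loop would still ignore that shop's file, so the practical difference is limited to which sites get fetched.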