@dwillis
Created September 2, 2011 17:46
Who Needs to Write Scrapers?
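cd my_download_dir
# curl expands [1-8101:20] to start=1, 21, 41, ... 8101 and saves each page as lobby_<start>.html
# (quote the URL so the shell doesn't try to interpret ? and [ ] itself)
curl "http://apps.sd.gov/applications/ST12ODRS/LobbyistViewlist.asp?start=[1-8101:20]" -o "lobby_#1.html"
# get a cup of coffee
# combine the pages into one file (files are joined in alphabetical, not numeric, order)
cat lobby*.html > master.html # on Mac/Linux; on Windows, try: copy lobby*.html master.html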
-----
Then open master.html in a text editor, do some bulk find-and-replaces to strip out the page markup that repeats in every downloaded file, and you'll be in business. After opening the cleaned-up file in Excel, you'll need to un-merge the cells and do a little more cleanup, but it works.
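If you'd rather not do the find-and-replace pass by hand, something like the following might get you most of the way there. It's only a sketch, and it swaps the text editor for awk. It assumes the data lives in ordinary `<tr>...</tr>` table rows and that a row's closing tag never shares a line with the next row's opening tag; I haven't checked that against the actual pages. The `extract_rows` function and the `cleaned.html` filename are made up for this example.

```shell
# Keep only the lines that fall inside <tr>...</tr> rows (tags matched case-insensitively).
# Assumption: the data we want is in plain table rows; everything else is page chrome.
extract_rows() {
  awk '
    { line = tolower($0) }             # lowercase copy used only for matching
    line ~ /<tr[ >]/ { in_row = 1 }    # a row starts on this line
    in_row { print }                   # print the original line while inside a row
    line ~ /<\/tr>/ { in_row = 0 }     # the row ends on this line
  ' "$1"
}

# Wrap the surviving rows in a single table so a spreadsheet can open the result.
if [ -f master.html ]; then
  {
    echo '<table>'
    extract_rows master.html
    echo '</table>'
  } > cleaned.html
fi
```

Open cleaned.html in Excel the same way you would the hand-edited file. You'll probably still need to un-merge cells and tidy up afterwards.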