#!/usr/bin/env ruby
# Parse a data table from pollster.com
# Get data via copy-and-paste from http://www.pollster.com/polls/us/08-us-pres-ge-mvo.php
# yields a messy tab-separated thingamajigger (i'm using firefox 3 on mac)
# This script normalizes, in an R-friendly way
require 'date'
def numclean(x)
  x =~ /^-$/ ? "NA" : x.to_i

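The preview above cuts off inside `numclean`. As a minimal sketch, assuming the helper simply closes with `end` after the line shown (the rest of the original script is not visible here), this is how it would turn a row of poll cells into R-friendly values. The sample row is made up for illustration.

```ruby
# Assumed completion of the truncated helper: a lone "-" marks a missing
# value and becomes the string "NA"; anything else is converted with to_i.
def numclean(x)
  x =~ /^-$/ ? "NA" : x.to_i
end

# Hypothetical row of cells as they might appear after splitting on tabs.
row = ["48", "-", "44"]
cleaned = row.map { |cell| numclean(cell) }

# Emit the row tab-separated so R's read.delim can parse "NA" as missing.
puts cleaned.join("\t")
```

Note that `to_i` silently maps non-numeric text to 0, so this sketch is only safe for cells that are either digits or a lone dash.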
<?
/**
 * *** THIS NEEDS TO BE EDITED FOR A NEW INSTALLATION ***
 *
 * this is supposed to be called as e.g.
 * http://anyall.org/blog/blogger/http://socialscienceplusplus.blogspot.com/2008/10/mydebatesorg-and-potentially-coolest.html
 * and then redirect to e.g.
 * http://anyall.org/blog/2008/10/mydebatesorg-online-polling-and-potentially-the-coolest-question-corpus-ever/

load '~/.irbrc' # dotfiles.org/~brendano/.irbrc
require 'hpricot'
sites=[]
for url in [
  "http://www.alexa.com/site/ds/top_sites?ts_mode=lang&lang=en"]
  h = Hpricot open(url).read
  sites += (h/'h3'/'a').map{|x| x['href']}
end

name,score_skewz,score_svd,url,v1,v2,v3,v4,v5
The Politico,-0.133333333333333,-0.069840595513546,politico.com,-0.0579919888228,-0.0156533209161,-0.0118276408031,-0.000672353189093,0.00899951990495
Right Wing Nut House,0.666666666666667,0.016997861495122,rightwingnuthouse.com,-0.0114438419789,0.00923210186058,-0.000332659887795,-0.00357075698976,0.0194133595538
Chicago Tribune,0.0,0.011507686305562,chicagotribune.com,-0.00487815404818,0.0062502057793,0.00472616298604,-0.00370269426842,-0.00354255787188
City Journal,0.566666666666667,0.002719928640919,city-journal.org,-0.000318806368726,0.00147728337907,0.000218460777,-0.000500262448403,-0.00112420748062
Time,-0.1,-0.01921486123282,time.com,-0.0206799675285,-0.00430661260867,-0.00335205354211,-0.00167995286891,-0.0152016073966
National Enquirer,0.533333333333333,-0.008120760725041,nationalenquirer.com,-0.00279469690892,-0.0018201000833,-0.00761346294708,0.00713945342214,-0.00165965873961
AlterNet,-0.633333333333333,-0.029834727529704,alternet.org,-0.0066

# Pipe-oriented I/O in Python. This is harder than it should be.
import os, sys
# (1) Kill stdout buffering. Makes redirects and tee easier to use.
if "<fdopen>" not in str(sys.stdout): sys.stdout = os.fdopen(1,'w',0)
# (2) Encoding madness. Note codecs.open() isn't available to us since we're using pipes.
import codecs
sys.stdout = codecs.EncodedFile(sys.stdout,'utf-8','utf-8','ignore')
# or this too .. sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
# I'm interested in safely handling potentially garbled input data, so I want to protect stdin.
# You'd think this would work:

"""ajaxgoogle.py - Simple bindings to the AJAX Google Search API
(Just the JSON-over-HTTP bit of it, nothing to do with AJAX per se)
http://code.google.com/apis/ajaxsearch/documentation/reference.html#_intro_fonje
brendan o'connor - gist.github.com/28405 - anyall.org"""
try:
    import json
except ImportError:
    import simplejson as json
import urllib, urllib2

#!/bin/bash
export TAB=$(echo -e "\t")
exec sort "-t$TAB" "$@"

CSV from PostgreSQL, at least as far as I can tell. I'm sure it messes up embedded quotes and maybe embedded commas.
psql.csv() { psql -qAF , "$@" | egrep -v '^\([0-9]+ rows?\)$'; }

#!/usr/bin/env ruby
puts "Content-Type: text/plain"
puts
subj = ENV['PATH_INFO'] || ""
subj.gsub!("'", '"')
msg = STDIN.read || ""
# system "env"

#!/usr/bin/env ruby
# like map(), except on shell pipelines
# one arg: the mapper
# transform input lines, via the mapper, into output lines
# mapper is eval'd within the input line string
#
# extract 2nd column
#   cat file | map 'split[1]'
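The header comments say the mapper is eval'd within the input line string. One way to get that behavior in Ruby is `instance_eval`, which evaluates a code string with `self` set to the line, so a bare `split[1]` calls `split` on the line itself. The sketch below is an illustration under that assumption, not the original script: `map_lines` is a hypothetical helper name, and stripping the trailing newline with `chomp` is a guess about the intended behavior.

```ruby
# Hypothetical helper: apply a mapper expression to each input line.
# The mapper is a string of Ruby code evaluated with self bound to the line.
def map_lines(mapper, lines)
  lines.map { |line| line.chomp.instance_eval(mapper) }
end

# Made-up sample input standing in for lines read from a pipe.
sample = ["alpha beta gamma\n", "one two three\n"]

# Extract the 2nd whitespace-separated column of each line.
puts map_lines('split[1]', sample)
```

In a pipeline, the same helper could be fed `STDIN.each_line` and `ARGV[0]`. Since the mapper is arbitrary code, this is only suitable for commands you type yourself.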