Rufus Pollock rufuspollock

@rufuspollock
rufuspollock / openspending-local-load-step-by-step.rst
Created November 11, 2011 14:36
OpenSpending local dataset load from command line

Test model (dimensions and mapping):

ostool cfg.ini csvimport --model=model.json --dry-run --raise-on-error --max-lines=1 data.csv

Dry run:

ostool cfg.ini csvimport --model=model.json --dry-run data.csv
rufuspollock / ckan-es-webstore-test-full.py
Created February 28, 2012 08:20
CKAN ES Webstore Test Full (Read/Write)
'''This is a test using the real setup with elasticsearch.
It requires you to run nginx on port 8088 with config as per
https://github.com/okfn/elastic-proxy/blob/master/elasticproxy plus,
obviously, elasticsearch on port 9200.
'''
import json
import paste.fixture
import paste.proxy
rufuspollock / ckan-datastore.py
Created March 1, 2012 15:39
CKAN DataStore client
#!/usr/bin/env python
import urlparse
import mimetypes
import os
import ConfigParser
import urllib2
import json
import csv
import time
import csv
import json
import geojson
fp = 'data/US_Rendition_FOIA.csv'
fpout = 'data/US_Rendition_FOIA.geojson.csv'
jsonout = 'data/US_Rendition_FOIA.geojson.json'
jsondata = []
def convert():
rufuspollock / world-bank-pop-sample-xml.xml
Created May 24, 2012 09:35
World Bank Population - Sample XML data
<?xml version="1.0" encoding="utf-8"?>
<Root xmlns:wb="http://www.worldbank.org">
  <data>
    <record>
      <field name="Country or Area" key="ARB">Arab World</field>
      <field name="Item" key="SP.POP.TOTL">Population, total</field>
      <field name="Year">1960</field>
      <field name="Value">96388069</field>
    </record>
    <record>
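A minimal sketch of how records in this shape could be read in Python, assuming every `<record>` holds flat `<field>` elements with a `name` attribute and an optional `key` attribute, as in the preview above. The function name `parse_records` and the `" Code"` suffix used for the `key` values are illustrative choices, not part of the gist.

```python
import xml.etree.ElementTree as ET

# One complete record, copied from the preview above.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<Root xmlns:wb="http://www.worldbank.org">
  <data>
    <record>
      <field name="Country or Area" key="ARB">Arab World</field>
      <field name="Item" key="SP.POP.TOTL">Population, total</field>
      <field name="Year">1960</field>
      <field name="Value">96388069</field>
    </record>
  </data>
</Root>"""


def parse_records(xml_bytes):
    """Turn each <record> into a dict keyed by the field's name attribute."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for record in root.iter("record"):
        row = {}
        for field in record.findall("field"):
            name = field.get("name")
            row[name] = field.text
            # Keep the machine-readable code (e.g. "ARB") next to the label.
            key = field.get("key")
            if key is not None:
                row[name + " Code"] = key
        rows.append(row)
    return rows


rows = parse_records(SAMPLE.encode("utf-8"))
print(rows[0]["Country or Area"], rows[0]["Year"], int(rows[0]["Value"]))
```

Values come back as strings, so numeric fields such as `Year` and `Value` need an explicit conversion before any arithmetic.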
rufuspollock / note-load.js
Created June 2, 2012 11:30
Time/Geo notes and script to parse notes and save to file or load to ElasticSearch
// Parse a summary to extract title, tags, location and start and end
parseNoteSummary = function(text) {
  var result = {
    title: '',
    tags: []
  };
  var ourtext = text;
  var regex = / #([\w-\.]+)/;
  while(ourtext.search(regex)!=-1) {
    var out = ourtext.match(regex)[1];
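For comparison, the tag-extraction step can be sketched in Python. This covers only the hashtag part of the comment above (title and tags); the location and start/end parsing is not visible in the preview and is not attempted here. The function name `parse_tags`, the sample note text, and the choice to treat whatever remains after removing tags as the title are assumptions for illustration.

```python
import re

# Same pattern as the JavaScript preview: a space, then "#", then a run of
# word characters, dots or hyphens.
TAG_RE = re.compile(r" #([\w.-]+)")


def parse_tags(text):
    """Return the tags found in text and the text with those tags removed."""
    tags = TAG_RE.findall(text)
    # Assumption: the title is what is left once the tags are stripped out.
    title = TAG_RE.sub("", text).strip()
    return {"title": title, "tags": tags}


# Hypothetical note summary, not taken from the gist.
result = parse_tags("Visited the British Museum #london #history")
print(result)
```

Because the pattern requires a leading space, a tag at the very start of the text is not matched, which mirrors the behaviour of the original regular expression.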
rufuspollock / tfl_passengers.csv
Created September 29, 2012 12:54
Laundromat - TFL Example CSV
,Millions LU journeys adjusted for odd days,Millions bus journeys adjusted for odd days - new measure,Millions bus plus underground journeys adjusted for odd days - new measure,LU Average,Bus Average,LU plus bus average,LU growth,Bus growth,LU plus bus growth,LU moving average annual growth,Bus moving average annual growth,LU plus busmoving average annual growth,,,,,,,,
2006/2007 - 1,72.3,151.3,223.5,73.9,138.8,212.7,-5.4%,-1.5%,-2.9%,-1.6%,0.6%,-0.2%,,,,,,,,
2006/2007 - 2,75.6,158.9,234.5,73.8,139.1,212.9,-2.5%,3.4%,1.3%,-2.1%,0.8%,-0.2%,,,,,,,,
2006/2007 - 3,74.3,158.4,232.7,73.6,139.6,213.2,-2.7%,4.4%,1.9%,-2.6%,1.1%,-0.2%,,,,,,,,
2006/2007 - 4,77.4,161.5,238.9,74.1,140.1,214.2,8.2%,4.9%,6.0%,-1.7%,1.4%,0.3%,,,,,,,,
2006/2007 - 5,73.7,153.5,227.2,74.8,141.1,215.9,15.0%,9.9%,11.6%,0.2%,2.3%,1.6%,,,,,,,,
2006/2007 - 6,74.1,153.2,227.3,75.2,141.7,216.9,7.9%,5.8%,6.5%,1.2%,3.0%,2.3%,,,,,,,,
2006/2007 - 7,81.1,165.1,246.2,75.6,142.1,217.7,6.1%,4.0%,4.8%,1.8%,3.5%,2.9%,,,,,,,,
2006/2007 - 8,83.4,166.5,250.0,76.0
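The header row above has 21 columns while the last row shown has only 5, so a loader would probably want to detect ragged rows before importing. A minimal sketch, assuming the first row is the header and that no quoted field spans several lines; the function name `find_ragged_rows` is hypothetical, and the sample uses abbreviated column labels with a subset of the values above.

```python
import csv
import io

# Abbreviated stand-in for the file above: shortened header labels and only
# the first few values of two rows. The last row is deliberately short.
SAMPLE = "\n".join([
    ",LU,Bus,LU plus bus",
    "2006/2007 - 7,81.1,165.1,246.2",
    "2006/2007 - 8,83.4",
])


def find_ragged_rows(text):
    """Return (line_number, actual_columns, expected_columns) for bad rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    expected = len(header)
    problems = []
    # Line numbers are 1-based and the header is line 1.
    for line_no, row in enumerate(reader, start=2):
        if len(row) != expected:
            problems.append((line_no, len(row), expected))
    return problems


problems = find_ragged_rows(SAMPLE)
for line_no, actual, expected in problems:
    print(f"line {line_no}: {actual} columns, expected {expected}")
```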
rufuspollock / data.json
Created October 1, 2012 18:32
CKAN - Load Demo Data
{
  "datasets": {
    "adur_district_spending": {
      "author": "Lucy Chambers",
      "author_email": "",
      "extras": {
        "spatial-text": "Adur, West Sussex, South East England, England, United Kingdom",
        "spatial": "{ \"type\": \"Polygon\", \"coordinates\": [ [ [-0.3715, 50.8168],[-0.3715, 50.8747], [-0.2155, 50.8747], [-0.2155, 50.8168], [-0.3715, 50.8168] ] ] }"
      },
      "license": "License Not Specified",
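Note that the `spatial` extra is a JSON string that itself contains GeoJSON, so it has to be decoded a second time before the geometry can be used. A minimal sketch in Python, assuming each dataset has an `extras.spatial` value holding a GeoJSON Polygon as in the preview; the function name `bounding_boxes` is hypothetical and the sample keeps only the fields needed here.

```python
import json

# Trimmed copy of the preview above, reduced to the "spatial" extra.
# A raw string keeps the backslash-escaped quotes intact for the JSON parser.
RAW = r'''{
  "datasets": {
    "adur_district_spending": {
      "extras": {
        "spatial": "{ \"type\": \"Polygon\", \"coordinates\": [ [ [-0.3715, 50.8168],[-0.3715, 50.8747], [-0.2155, 50.8747], [-0.2155, 50.8168], [-0.3715, 50.8168] ] ] }"
      }
    }
  }
}'''


def bounding_boxes(data):
    """Map dataset name -> (min_lon, min_lat, max_lon, max_lat)."""
    boxes = {}
    for name, dataset in data["datasets"].items():
        # Second decode: the extra is stored as a string, not as an object.
        geometry = json.loads(dataset["extras"]["spatial"])
        ring = geometry["coordinates"][0]  # outer ring of the polygon
        lons = [point[0] for point in ring]
        lats = [point[1] for point in ring]
        boxes[name] = (min(lons), min(lats), max(lons), max(lats))
    return boxes


boxes = bounding_boxes(json.loads(RAW))
print(boxes["adur_district_spending"])
```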
rufuspollock / scrape.js
Created October 20, 2012 12:06
Police.uk data scraping
var jsdom = require('jsdom');
var fs = require('fs');
// var jquery = fs.readFileSync("./jquery-1.7.1.min.js").toString();
var linklist = 'http://police.uk/data';
jsdom.env({
  html: linklist,
  scripts: [
    'http://code.jquery.com/jquery.js'
rufuspollock / upload.py
Created October 21, 2012 21:33
Upload data wrangling handbook to wordpress
''' Upload the data wrangling handbook to a WordPress site.
Copy this file to the same directory as your Sphinx build directory and then run
python upload.py -h
NB: you need to enable XML-RPC access on the WordPress site (via Settings -> Writing)
NB: this requires pywordpress (pip install pywordpress) and its associated config
file - see https://github.com/rgrp/pywordpress