Rufus Pollock rufuspollock

rufuspollock / convert_data_package_to_ckan_package.py
Created April 30, 2020 19:05
Convert Data Package to CKAN Package
# python 3+
def convert_data_package_to_ckan_package(data_package):
    '''
    Documentation of CKAN metadata structure ...
    https://docs.ckan.org/en/2.8/api/index.html#ckan.logic.action.create.package_create
    https://docs.ckan.org/en/2.8/api/index.html#ckan.logic.action.create.resource_create
    '''
    out = dict(data_package)
    out['extras'] = []
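
The preview above stops after the `extras` list is created. As a rough illustration of how such a conversion might continue, here is a minimal sketch. The function name `sketch_convert`, the `CKAN_KEYS` set, and the field mappings (`description` to `notes`, `homepage` to `url`, resource `path` to `url`) are assumptions made for this example, not the gist's actual code.

```python
import json

# Keys assumed to be passed straight through to CKAN's package_create.
# This set is an assumption for the sketch; check the CKAN docs linked above.
CKAN_KEYS = {"name", "title", "notes", "url", "version", "resources", "extras"}


def sketch_convert(data_package):
    """Hypothetical continuation: rename a few fields, move the rest to extras."""
    out = dict(data_package)
    out["extras"] = []
    # Assumed mapping: Data Package 'description' -> CKAN 'notes'.
    if "description" in out:
        out["notes"] = out.pop("description")
    # Assumed mapping: Data Package 'homepage' -> CKAN 'url'.
    if "homepage" in out:
        out["url"] = out.pop("homepage")
    # Assumed mapping: resource 'path' -> CKAN resource 'url'.
    out["resources"] = [
        {("url" if key == "path" else key): value for key, value in res.items()}
        for res in out.get("resources", [])
    ]
    # Everything else is kept as a key/value extra; non-strings are JSON-encoded.
    for key in sorted(set(out) - CKAN_KEYS):
        value = out.pop(key)
        if not isinstance(value, str):
            value = json.dumps(value)
        out["extras"].append({"key": key, "value": value})
    return out
```

Storing unknown fields as extras keeps the conversion lossless, at the cost of JSON-encoding any structured values.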
rufuspollock / datapackage.yml
Created October 19, 2016 15:45
Data Package in YAML from Open Power System Data project.
name: opsd-time-series
title: Time series
description: Load, wind and solar, prices in hourly resolution
long_description: This data package contains different kinds of time series data relevant for power system modelling, namely electricity consumption (load) for 36 European countries as well as...
homepage: http://data.open-power-system-data.org/time_series/2016-07-14/
rufuspollock / unesco.sh
Created March 9, 2016 09:06
UNESCO stats database hacking
curl "http://data.uis.unesco.org/sdmx-json/data/SCN_DS/20600+EXPPPP_CUR+EXPPPP_CONST+21641+21642+21643+21644+45941+21645+21646+45942+21647+EXPPPP_CUR_FS_NATSCI+EXPPPP_CUR_FS_ENGTECH+EXPPPP_CUR_FS_MEDSCI+EXPPPP_CUR_FS_AGSCI+EXPPPP_CUR_FS_NSCIE+EXPPPP_CUR_FS_SOSCI+EXPPPP_CUR_FS_HUM+EXPPPP_CUR_FS_SSCIH+EXPPPP_CUR_FS_FONS+EXPPPP_CONST_FS_NATSCI+EXPPPP_CONST_FS_ENGTECH+EXPPPP_CONST_FS_MEDSCI+EXPPPP_CONST_FS_AGSCI+EXPPPP_CONST_FS_NSCIE+EXPPPP_CONST_FS_SOSCI+EXPPPP_CONST_FS_HUM+EXPPPP_CONST_FS_SSCIH+EXPPPP_CONST_FS_FONS+EXPP_FS_NATSCI+EXPP_FS_ENGTECH+EXPP_FS_MEDSCI+EXPP_FS_AGSCI+EXPP_FS_NSCIE+EXPP_FS_SOSCI+EXPP_FS_HUM+EXPP_FS_SSCIH+EXPP_FS_FONS.ALB+DZA+ASM+AGO+ARG+ARM+AUS+AUT+AZE+BHR+BGD+BLR+BEL+BEN+BMU+BOL+BIH+BWA+BRA+BRN+BGR+BFA+BDI+KHM+CMR+CAN+CPV+CAF+CHL+CHN+HKG+MAC+COL+COG+CRI+CIV+HRV+CUB+CYP+CZE+COD+DNK+ECU+EGY+SLV+EST+ETH+FRO+FIN+FRA+GAB+GMB+GEO+DEU+GHA+GRC+GRL+GUM+GTM+GIN+HND+HUN+ISL+IND+IDN+IRN+IRQ+IRL+ISR+ITA+JAM+JPN+JOR+KAZ+KEN+KWT+KGZ+LAO+LVA+LSO+LBY+LTU+LUX+MDG+MWI+MYS+MLI+MLT+MUS+MEX+MCO+MNG+MNE+MAR+MO
rufuspollock / data-package-name-and-id-proposal.md
Created December 31, 2015 18:36
Data Package identification and naming

Currently Data Packages must have a name attribute but do not have an id attribute.

There has been debate about both the semantics (e.g. uniqueness) of the name field and its usability for certain cases (e.g. importing datasets into a new catalog) - see #220 for extensive discussions.

Proposal

Two identifier fields:

  • name: SHOULD be present (and is certainly required for installation etc.). The name is human-meaningful and is designed to support both resolution (protocol to be determined) and easy use by humans, e.g. in data dependencies
    • (?) Have this as a MUST?
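
To make the SHOULD versus MUST question concrete, here is a minimal sketch of a descriptor check covering only the name field described above. The function `check_name` and its `name_is_must` flag are hypothetical and not part of any Data Package library.

```python
def check_name(descriptor, name_is_must=False):
    """Return a list of (level, message) findings about the 'name' field."""
    findings = []
    name = descriptor.get("name")
    if not name:
        # Under SHOULD a missing name is only a warning; under MUST it is an error.
        level = "error" if name_is_must else "warning"
        findings.append((level, "descriptor has no 'name'"))
    elif not isinstance(name, str):
        findings.append(("error", "'name' must be a string"))
    return findings


print(check_name({"title": "Time series"}))
print(check_name({"title": "Time series"}, name_is_must=True))
print(check_name({"name": "opsd-time-series"}))
```

The only difference between the two readings is the severity reported for a missing name, which is what an importing catalog would need to decide on.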
rufuspollock / Data-Wrangling-Challenges.md
Last active February 21, 2021 12:32
Data Wrangling Exercise - Natural Gas Prices

Challenge 1

Your task: write a script to get a nice CSV file of natural gas prices.

Please publish your results in a git repo or a gist. Please include both script and your resulting data -- so the CSV files should be stored in the repo too!

More detail:

year,admin1,admin2,admin3,admin4,admin5,admin6,func1,func2,econ1,econ2,fin_source,exp_type,transfer,approved,adjusted,executed
2009,Central,101 Parliament,0101 Parliament,,,010 Central apparatus (office) of ministries and other administrative authorities,01 General purpose state services,01.01 Legislative authorities,111 Remuneration of work,111.00 Remuneration of work,Base component,Personnel,Excluding transfers,30269300,30269300,27849186
2009,Central,101 Parliament,0101 Parliament,,,010 Central apparatus (office) of ministries and other administrative authorities,01 General purpose state services,01.01 Legislative authorities,112 Mandatory state social insurance premiums,112.00 Mandatory state social insurance premiums,Base component,Personnel,Excluding transfers,5564000,5564000,5401021
2009,Central,101 Parliament,0101 Parliament,,,010 Central apparatus (office) of ministries and other administrative authorities,01 General purpose state services,01.01 Legislative authorities,113 Payment for goods and servic
rufuspollock / gist:ca4ac7d2511ee41237b9
Created November 9, 2014 21:37
CKAN DataStore SQL API from Javascript
// replace this with your CKAN website
var ckanSite = 'http://datahub.io';
var sql = 'Your SQL goes here';
// =================
// Using jQuery only
// =================
var data = encodeURIComponent(JSON.stringify({sql: sql}));
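
For comparison, the same kind of query can be prepared from Python. This is a minimal sketch that only builds the request URL for CKAN's `datastore_search_sql` action and does not send it; the helper name `datastore_sql_url` and the resource id `my-resource-id` are placeholders invented for this example.

```python
from urllib.parse import urlencode


def datastore_sql_url(ckan_site, sql):
    """Build a GET URL for the CKAN DataStore SQL action (no request is sent)."""
    query = urlencode({"sql": sql})  # percent-encodes quotes, spaces, etc.
    return f"{ckan_site.rstrip('/')}/api/3/action/datastore_search_sql?{query}"


# 'my-resource-id' is a placeholder; use the id of a real DataStore resource.
sql = 'SELECT * FROM "my-resource-id" LIMIT 5'
print(datastore_sql_url("http://datahub.io", sql))
```

The resulting URL can be passed to any HTTP client; the SQL action has to be enabled on the CKAN site for the request to succeed.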
rufuspollock / humanitarian-datastore-data-api-examples.md
Last active August 29, 2015 14:03
Humanitarian dataset example queries

We have loaded the HDX Common Humanitarian Dataset data into a CKAN instance (we used datahub.io for convenience).

http://datahub.io/dataset/hdx-common-humanitarian-dataset

We've loaded the (indicator) value table and the indicator table separately into the CKAN DataStore (we have not loaded the dataset table for now). We've also created a Python script to automate this, which can also serve as an example of how to work with the CKAN API.

Setting this up was pretty fast (most of the work was actually tidying up the data and then making some scripts to make this repeatable and testable).
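
As an illustration of what the loading step involves, here is a minimal sketch that turns CSV text into a payload for CKAN's `datastore_create` action. It is not the script mentioned above: the function name `build_datastore_payload`, the sample column names, and the sample row are made up, and using the dataset name as `package_id` is an assumption.

```python
import csv
import io
import json


def build_datastore_payload(package_id, csv_text):
    """Build a datastore_create payload from CSV text (nothing is sent)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    # One field per CSV column; types are left for the DataStore to guess.
    fields = [{"id": name} for name in (reader.fieldnames or [])]
    return {
        "resource": {"package_id": package_id},
        "fields": fields,
        "records": rows,
    }


# Made-up sample data, for illustration only.
sample = "indicator_id,region,value\nIND1,XX,1.5\n"
payload = build_datastore_payload("hdx-common-humanitarian-dataset", sample)
print(json.dumps(payload, indent=2))
```

Posting this payload to the `datastore_create` action with an API key would create a new resource in the dataset and fill its table. All values stay strings here, so explicit field types would be a natural next step.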

rufuspollock / home.md
Created May 21, 2014 16:50
meta.census.okfn.org - home page

[notitle] [fullwidth]

Get your local open data census here!
Local data is often the most relevant to citizens on a daily basis - from rubbish collection times to local tax rates.
At the moment it’s hard to know what local open data is available.

This is a short introduction to how to administer an Open Data Census.

Note: this assumes that a census instance has already been booted for you; it does not cover the technical side of deploying a census instance.

[toc]

Overview of How a Census is Structured

A Census is a survey built around 4 axes: