Skip to content

Instantly share code, notes, and snippets.

@rufuspollock
Last active August 29, 2015 14:03
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rufuspollock/0b24589bd3293234917b to your computer and use it in GitHub Desktop.
Save rufuspollock/0b24589bd3293234917b to your computer and use it in GitHub Desktop.
Humanitarian dataset example queries

HDX Common Humanitarian Dataset data into CKAN instance (we used datahub.io for convenience).

http://datahub.io/dataset/hdx-common-humanitarian-dataset

We've loaded (indicator) value table and indicator table separately in the CKAN DataStore (we have not bothered loading dataset table for the present) and we've also created a python script to automate this (which can also serve as an example of how to work with CKAN API).

Setting this up was pretty fast (most of the work was actually tidying up the data and then making some scripts to make this repeatable and testable).

More details on all of this can be found in this repo (which also includes the versions of the data we used):

https://github.com/rgrp/hdx-common-humanitarian-dataset

With the data in the CKAN DataStore you have "out-of-the-box" a pretty rich JSON API including full SQL-based search (including the ability to join between different resources (aka tables)

  • e.g. you can join between the value and the indicator table). Below are some live examples of the queries you can do.

I think this kind of API could be a pretty reasonable basis for building a proper JS frontend app. (In fact, modulo some messing around to get proper pivoting (e.g. via postgresql crosstab function) you've probably got exactly the data you need for a basic indicator visualization app)

API Examples

More on the search API can be found here:

http://docs.ckan.org/en/latest/maintaining/datastore.html#ckanext.datastore.logic.action.datastore_search http://docs.ckan.org/en/latest/maintaining/datastore.html#ckanext.datastore.logic.action.datastore_search_sql

There's also HtSQL support if you want it: http://docs.ckan.org/en/latest/maintaining/datastore.html#datastore-search-htsql

Here are some live examples to demonstrate what you can do:

# note: resource_id is just the id the CKAN resource for all the values
# note: you'll want to make sure your browser shows JSON nicely before viewing these

# query for KEN (kenya) - using full-text search
http://datahub.io/api/action/datastore_search?resource_id=b23bcbda-8d22-4d22-ae56-1a287941147a&q=KEN

# alternative method explicitly filtering on that column
http://datahub.io/api/action/datastore_search?resource_id=b23bcbda-8d22-4d22-ae56-1a287941147a&filters={%22region%22:%22ken%22}

Here's an example SQL query which selects the first 10 rows from the JOIN of the the value and the indicator table:

http://datahub.io/api/action/datastore_search_sql?sql=SELECT%20region,%20indicator_id,%20period,%20value,%20name,%20units%20FROM%20%22b23bcbda-8d22-4d22-ae56-1a287941147a%22%20JOIN%20%22ea9f7969-e290-47ad-9033-f54e3ea23ba3%22%20ON%20indicator_id%20=%20id%20LIMIT%2010;

There's also a rough and ready little HTML + JS app for exploring the DataStore API here:

http://dev.rufuspollock.org/ckan-explorer/?endpoint=http://datahub.io&resource=ea9f7969-e290-47ad-9033-f54e3ea23ba3

You can use this play around with SQL queries. For example, you can paste this query and run it:

SELECT region, indicator_id, period, value, name, units FROM "b23bcbda-8d22-4d22-ae56-1a287941147a" JOIN "ea9f7969-e290-47ad-9033-f54e3ea23ba3" ON indicator_id = id WHERE indicator_id = 'CHD.B.OTH.04.T6' AND region = 'KEN' ORDER BY period;

This is internet users in Kenya sorted by period (you can then select the graph and graph this).

Note: it looks like there are some data issues here since value for 1984 is over 5 billion (have indicator ids got shared??)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment