Rufus Pollock rufuspollock

// the location we want to GeoCode
var location = 'London';
// we are using MapQuest's Nominatim service
// (the query is URL-encoded so names with spaces or punctuation also work)
var geocode = 'http://open.mapquestapi.com/search?format=json&q=' + encodeURIComponent(location);
// use jQuery to call the API and get the JSON results
$.getJSON(geocode, function(data) {
  // get lat + lon from first match
  var latlng = [data[0].lat, data[0].lon];
@rufuspollock
rufuspollock / Data-Wrangling-Challenges.md
Last active February 21, 2021 12:32
Data Wrangling Exercise - Natural Gas Prices

Challenge 1

Your task: write a script to get a nice CSV file of natural gas prices.

Please publish your results in a git repo or a gist. Please include both your script and the resulting data, so the CSV files should be stored in the repo too!

More detail:

@rufuspollock
rufuspollock / annotator-openshakespeare-example.js
Created June 10, 2011 11:36
Example of using Annotator in OpenShakespeare.org
jQuery(function ($) {
  var elem = $('#text-to-annotate');
  var account_id = '39fc339cf058bd22176771b3e3036609';
  var annotator_store = '/annostore' + '/api';
  var userid = '';
  var options = {};
  options.permissions = {};
  options.permissions.user = {
    'name': '194.104.70.73'
  };
@rufuspollock
rufuspollock / convert-screencast-to-gif.py
Created May 18, 2014 16:38
Convert screencast (recordmydesktop) to gif
import os
import shutil
def convert():
    tmp = '/tmp/togif'
    if os.path.exists(tmp):
        print('Temp path %s already exists' % tmp)
        return
    os.makedirs(tmp)
@rufuspollock
rufuspollock / convert_data_package_to_ckan_package.py
Created April 30, 2020 19:05
Convert Data Package to CKAN Package
# python 3+
def convert_data_package_to_ckan_package(data_package):
    '''
    Documentation of CKAN metadata structure ...
    https://docs.ckan.org/en/2.8/api/index.html#ckan.logic.action.create.package_create
    https://docs.ckan.org/en/2.8/api/index.html#ckan.logic.action.create.resource_create
    '''
    out = dict(data_package)
    out['extras'] = []
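The preview above stops after the first two statements. As a rough illustration of what such a conversion can involve, here is a minimal sketch that maps three Data Package fields onto their CKAN counterparts. The function name `datapackage_to_ckan` and the choice of fields are assumptions made for illustration, not the author's implementation; the only facts relied on are that CKAN packages use `notes` for the description, `tags` as a list of `{"name": ...}` dicts, and `url` on resources, while Data Packages use `description`, `keywords`, and `path`.

```python
def datapackage_to_ckan(data_package):
    """Map a few Data Package fields onto CKAN package fields (illustrative)."""
    # Shallow copy so the caller's dict is left untouched.
    out = dict(data_package)
    out["extras"] = []

    # CKAN keeps the long-form description under "notes".
    if "description" in out:
        out["notes"] = out.pop("description")

    # CKAN tags are a list of {"name": ...} dicts; Data Package keywords
    # are a plain list of strings.
    if "keywords" in out:
        out["tags"] = [{"name": keyword} for keyword in out.pop("keywords")]

    # CKAN resources use "url" where Data Package resources use "path".
    if "resources" in out:
        resources = []
        for resource in out["resources"]:
            resource = dict(resource)  # copy before renaming the key
            if "path" in resource:
                resource["url"] = resource.pop("path")
            resources.append(resource)
        out["resources"] = resources

    return out
```

Fields the sketch does not know about are passed through unchanged, so a real converter would still need to decide what to do with them (for example, moving them into `extras`).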
@rufuspollock
rufuspollock / pdf2json-tryout.js
Created July 7, 2013 17:37
Trying out pdf2json
var nodeUtil = require("util"),
    PFParser = require("pdf2json");
var pdfParser = new PFParser();
pdfParser.on("pdfParser_dataReady", function(data) {
  console.log('here');
  console.log(data);
  console.log(data.data.Pages[0]);
@rufuspollock
rufuspollock / resource-view-demo-data.py
Created February 28, 2012 08:24
Script to Create Demo Resources for Testing Resource Viewer
'''Simple script for creating demo data in CKAN

Requires existence of a tester user. You can create this by doing::

    paster create-test-data user
'''
import ckanclient
base_location = 'http://localhost:5000/api'
api_key = 'tester'
@rufuspollock
rufuspollock / pdf2xxx.md
Last active November 15, 2016 15:58
PDF 2 XXX. Tools, libraries and tutorials for converting PDFs to something more machine usable

Additions wanted - please just fork and add.

Tutorials

  • Parsing PDFs by Thomas Levine
  • Get Started With Scraping – Extracting Simple Tables from PDF Documents

Generic (PDF -> text)

@rufuspollock
rufuspollock / datapackage.yml
Created October 19, 2016 15:45
Data Package in YAML from Open Power System Data project.
name: opsd-time-series
title: Time series
description: Load, wind and solar, prices in hourly resolution
long_description: This data package contains different kinds of time series data relevant for power system modelling, namely electricity consumption (load) for 36 European countries as well as...
homepage: http://data.open-power-system-data.org/time_series/2016-07-14/
@rufuspollock
rufuspollock / csv2sqlite.py
Last active April 14, 2016 14:22
UPDATED VERSION NOW AT https://github.com/rgrp/csv2sqlite [Script to load CSV to SQLite]
#!/usr/bin/env python
# A simple Python script to convert csv files to sqlite (with type guessing)
#
# @author: Rufus Pollock
# Placed in the Public Domain
import csv
import sqlite3
def convert(filepath_or_fileobj, dbpath, table='data'):
    if isinstance(filepath_or_fileobj, basestring):
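
The preview is cut off at this point, and the full script now lives at the repository linked in the title. To illustrate the general idea of loading a CSV into SQLite with type guessing, here is a minimal Python 3 sketch. It is not the author's code: the helper names `guess_type` and `load_csv` are hypothetical, and it assumes a header row, rectangular data, and a file small enough to read into memory.

```python
import csv
import io
import sqlite3


def guess_type(values):
    """Return 'INTEGER', 'REAL' or 'TEXT' for a column of strings."""
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for value in values:
                if value != "":  # empty cells do not influence the guess
                    cast(value)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def load_csv(fileobj, conn, table="data"):
    """Create `table` on `conn` from the CSV in `fileobj`; return column types."""
    reader = csv.reader(fileobj)
    headers = next(reader)
    rows = list(reader)
    # Guess each column's type from all of its values.
    types = [guess_type([row[i] for row in rows]) for i in range(len(headers))]
    columns = ", ".join('"%s" %s' % (h, t) for h, t in zip(headers, types))
    conn.execute('CREATE TABLE "%s" (%s)' % (table, columns))
    placeholders = ", ".join("?" for _ in headers)
    # SQLite's column affinity converts the string values on insert.
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), rows
    )
    conn.commit()
    return types


if __name__ == "__main__":
    demo = io.StringIO("id,price,name\n1,2.5,a\n2,3,b\n")
    connection = sqlite3.connect(":memory:")
    print(load_csv(demo, connection))
    print(connection.execute('SELECT * FROM "data"').fetchall())
```

The sketch leans on SQLite's type affinity: values are inserted as strings, and the declared column type decides how they are stored. Table and column names are interpolated directly into the SQL, so it is only suitable for trusted input.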