🌎
Nothing, nowhere and all of it

Rufus Pollock rufuspollock

@ryangray
ryangray / buttondown.css
Created February 22, 2012 06:45
A clean, minimal CSS stylesheet for Markdown, Pandoc and MultiMarkdown HTML output.
/*
Buttondown
A Markdown/MultiMarkdown/Pandoc HTML output CSS stylesheet
Author: Ryan Gray
Date: 15 Feb 2011
Revised: 21 Feb 2012
General style is clean, with minimal re-definition of the defaults or
overrides of user font settings. The body text and header styles are
left alone except title, author and date classes are centered. A Pandoc TOC
@rufuspollock
rufuspollock / csv2sqlite.py
Last active April 14, 2016 14:22
UPDATED VERSION NOW AT https://github.com/rgrp/csv2sqlite [Script to load CSV to SQLite]
#!/usr/bin/env python
# A simple Python script to convert csv files to sqlite (with type guessing)
#
# @author: Rufus Pollock
# Placed in the Public Domain
import csv
import sqlite3
def convert(filepath_or_fileobj, dbpath, table='data'):
    if isinstance(filepath_or_fileobj, basestring):
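The preview above is cut off, so the author's actual implementation is not shown here. As a rough illustration of the idea in the description (loading a CSV into SQLite with type guessing), here is a minimal Python 3 sketch. The helper names `guess_type` and `csv_to_sqlite` are hypothetical, and the sketch assumes a header row and rectangular data; it is not the code from the gist or from the csv2sqlite repository.

```python
import csv
import io
import sqlite3


def guess_type(values):
    """Guess a SQLite column type from string values (hypothetical helper)."""
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for value in values:
                if value != "":  # empty cells do not influence the guess
                    cast(value)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def csv_to_sqlite(fileobj, conn, table="data"):
    """Load a CSV file object (first row = header) into `table` on `conn`."""
    rows = list(csv.reader(fileobj))
    header, body = rows[0], rows[1:]
    # Assumption: every data row has as many cells as the header.
    types = [guess_type([row[i] for row in body]) for i in range(len(header))]
    columns = ", ".join('"%s" %s' % (name, t) for name, t in zip(header, types))
    conn.execute('CREATE TABLE "%s" (%s)' % (table, columns))
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), body
    )
    conn.commit()
    return types


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    sample = io.StringIO("name,amount,count\nacme,1.5,3\nbeta,2,4\n")
    print(csv_to_sqlite(sample, connection))
```

The values are inserted as strings and rely on SQLite's column type affinity to store them as numbers where the guessed type is INTEGER or REAL.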
@max-mapper
max-mapper / readme.md
Last active October 20, 2020 03:21
node modules for converting PDFs into other formats
@rufuspollock
rufuspollock / pdf2xxx.md
Last active November 15, 2016 15:58
PDF 2 XXX. Tools, libraries and tutorials for converting PDFs to something more machine usable

Additions wanted - please just fork and add.

Tutorials

  • Parsing PDFs by Thomas Levine
  • Get Started With Scraping – Extracting Simple Tables from PDF Documents

Generic (PDF -> text)

@rufuspollock
rufuspollock / london-spend-csvs-grepping-for-headings.txt
Last active December 19, 2015 10:29
Analysis of where the "header" row actually appears in GLA spend data CSVs. Result of running this script: https://github.com/rgrp/dataset-gla/blob/master/scripts/headings.sh. For details of the files, see https://github.com/rgrp/dataset-gla/blob/master/scrape.json
2010-11-P01.csv:4:Vendor,Expense Description,Amount,Doc No,,,^M
2010-11-P02.csv:6:Vendor,Expense Description,Amount,Doc No,,,^M
2010-11-P03.csv:6:Document No","Amount
2010-11-P04-500.csv:1:Vendor ID,Vendor Name,Cost Element,Expenditure Account Code Description,SAP Document No,Amount £,Clearing Date^M
2010-11-P05-500.csv:1:Vendor ID,Vendor Name,Cost Element,Expenditure Account Code Description,SAP Document No,Amount £,Clearing Date^M
2010-11-P06-500.csv:1:Vendor ID,Vendor Name,Cost Element,Expenditure Account Code Description,SAP Document No,Amount £,Clearing Date^M
2010-11-P07-500.csv:1:Vendor ID,Vendor Name,Cost Element,Expenditure Account Code Description,SAP Document No,Amount £,Clearing Date^M
2010-11-P08-500.csv:1:Vendor ID,Vendor Name,Cost Element,Expenditure Account Code Description,SAP Document No,Amount £,Clearing Date^M
2010-11-P09-500.csv:1:Vendor ID,Vendor Name,Cost Element,Expenditure Account Code Description,SAP Document No,Amount £,Clearing Date^M
2010-11-P10-500.csv:1:Vendor ID,Vendor Name,Cos
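The listing above (truncated in this preview) has the shape `file:line:content`. The exact pattern used by headings.sh is not shown here, so the following Python sketch is only an assumption-laden illustration of the same idea: it reports the first line containing a marker string. The function name `find_header_line` and the default marker "Amount" (which happens to occur in every line shown above) are hypothetical choices.

```python
def find_header_line(lines, marker="Amount"):
    """Return (1-based line number, line text) of the first line containing
    `marker`, or None if no line matches. `marker` is an assumed pattern."""
    for number, line in enumerate(lines, start=1):
        if marker in line:
            # Strip trailing CR/LF (shown as ^M in the grep output).
            return number, line.rstrip("\r\n")
    return None


if __name__ == "__main__":
    # Made-up sample input, not taken from the real GLA files.
    sample = [
        "Some report title\r\n",
        "\r\n",
        "Vendor,Expense Description,Amount,Doc No,,,\r\n",
        "X,Y,1,2,,,\r\n",
    ]
    print(find_header_line(sample))
```

Knowing the header line number per file makes it possible to skip the preamble rows before parsing each CSV.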
@johnkoht
johnkoht / grid.css.sass
Last active January 1, 2022 23:21
Bootstrap 3 Style Grid built on Bourbon Neat
// Main containers
.container
  @include outer-container
// Rows
.row
  @include row()
// A basic column without a defined width or height
@rufuspollock
rufuspollock / README.md
Created March 6, 2014 19:54
Hackney Spending Cleanup - README is empty

README is empty

@rufuspollock
rufuspollock / humanitarian-datastore-data-api-examples.md
Last active August 29, 2015 14:03
Humanitarian dataset example queries

HDX Common Humanitarian Dataset data loaded into a CKAN instance (we used datahub.io for convenience).

http://datahub.io/dataset/hdx-common-humanitarian-dataset

We've loaded the (indicator) value table and the indicator table separately into the CKAN DataStore (we have not yet loaded the dataset table). We've also created a Python script to automate this, which can also serve as an example of how to work with the CKAN API.

Setting this up was pretty fast (most of the work was actually tidying up the data and then making some scripts to make this repeatable and testable).
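The loading script itself is not shown in this preview. As a hedged sketch of what pushing a table into the CKAN DataStore can look like, the code below builds (but does not send) a request to CKAN's `datastore_create` action. The function name, the package id, the column names and the API key are all placeholders invented for illustration, not values from the original script.

```python
import json
import urllib.request


def build_datastore_create_request(site, api_key, package_id, fields, records):
    """Build a POST request for CKAN's datastore_create action.

    All argument values are supplied by the caller; nothing here is taken
    from the original loading script.
    """
    payload = {
        "resource": {"package_id": package_id},  # create a new resource
        "fields": fields,                        # e.g. [{"id": "value", "type": "float"}]
        "records": records,                      # list of row dicts
    }
    return urllib.request.Request(
        site.rstrip("/") + "/api/3/action/datastore_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


if __name__ == "__main__":
    request = build_datastore_create_request(
        "http://datahub.io",
        "YOUR-API-KEY",                      # placeholder
        "hdx-common-humanitarian-dataset",
        [{"id": "indicator", "type": "text"}, {"id": "value", "type": "float"}],
        [{"indicator": "EXAMPLE", "value": 1.5}],  # made-up row
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the step easy to test without network access, which fits the goal of a repeatable and testable load.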

@rufuspollock
rufuspollock / gist:ca4ac7d2511ee41237b9
Created November 9, 2014 21:37
CKAN DataStore SQL API from Javascript
// replace this with your CKAN website
var ckanSite = 'http://datahub.io';
var sql = 'Your SQL goes here';
// =================
// Using jQuery only
// =================
var data = encodeURIComponent(JSON.stringify({sql: sql}));
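The JavaScript preview above is truncated. For comparison, here is a minimal Python sketch of calling the same CKAN DataStore SQL endpoint (`datastore_search_sql`) with a GET request. The function names are hypothetical, and the sketch assumes the CKAN site allows anonymous SQL queries; it is not a translation of the rest of the gist.

```python
import json
import urllib.parse
import urllib.request


def datastore_sql_url(ckan_site, sql):
    """Build the datastore_search_sql URL for a SQL string."""
    query = urllib.parse.urlencode({"sql": sql})
    return ckan_site.rstrip("/") + "/api/3/action/datastore_search_sql?" + query


def run_sql(ckan_site, sql):
    """Run the query and return the list of record dicts (needs network access)."""
    with urllib.request.urlopen(datastore_sql_url(ckan_site, sql)) as response:
        return json.load(response)["result"]["records"]


if __name__ == "__main__":
    # Replace the site and SQL with your own, as in the JavaScript version.
    print(datastore_sql_url("http://datahub.io", "Your SQL goes here"))
```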
@pudo
pudo / schema_proposal.yaml
Created May 5, 2015 18:32
A proposed metadata structure for OpenSpending raw data.
# This is an alternate proposal for a metadata structure for OpenSpending
# data models. The most significant change is that data is modelled in a
# way that highlights logical connections between fields, rather than being
# based on columns. This also means that column naming conventions are not needed.
#
# This proposal uses YAML to represent the model, but implementations
# would probably use JSON instead.
# The proposed format is currently supported by spendb and cubepress.
#
# The following is a data model for a fictitious budget/spending dataset.