Skip to content

Instantly share code, notes, and snippets.

View drkane's full-sized avatar

David Kane drkane

View GitHub Profile
@drkane
drkane / to_excel_table.py
Last active January 9, 2023 15:18
Pandas to_excel with a table
# from https://xlsxwriter.readthedocs.io/example_pandas_table.html
df.index.name = "org_id"
criteria = df["Include"]=="y"
sheets = {
"Included - active": (criteria & df["active"]),
"Included - inactive": (criteria & ~df["active"]),
"Excluded": ~criteria,
}
column_widths = df.apply(lambda x: x.astype(str).str.len().max(), axis=0).apply(lambda x: min(x, 50)).to_dict()
@drkane
drkane / companies_house_sic_codes.csv
Created July 13, 2015 15:08
A lookup table from the SIC codes that Companies House publish in their open data project to SIC2007 codes. In some cases older SIC2003 codes have been "forced" into SIC2007 categories, you may not agree with these decisions. http://download.companieshouse.gov.uk/en_output.html
SICcode SICtext Number of Companies SIC2007
111 Grow cereals & other crops 1506
112 Grow vegetables & nursery products 637
113 Grow fruit, nuts, beverage & spice crops 101
121 Farming of cattle, dairy farming 666
122 Farm sheep, goats, horses, etc. 513
123 Farming of swine 136 01.46
124 Farming of poultry 283 01.47
125 Other farming of animals 288
130 Crops combined with animals, mixed farms 801 01.50
@drkane
drkane / 01_basic_queries.sql
Last active June 9, 2021 09:39
Queries that could be run on the 360Giving data store
-- required fields from 360Giving standard
-- http://standard.threesixtygiving.org/en/latest/reference/#grants-sheet
with g as (select * from view_latest_grant)
select g.data->>'id' as "id",
g.data->>'title' as "title",
g.data->>'description' as "description",
g.data->>'currency' as "currency",
(g.data->>'amountAwarded')::float as "amountAwarded",
to_date(g.data->>'awardDate', 'YYYY-MM-DD') as "awardDate",
We can't make this file beautiful and searchable because it's too large.
"reg_number","name","constitution","geographical_spread","address","postcode","phone","active","date_registered","date_removed","web","email","company_number","objects","source","last_updated","income","spending","fye","dual_registered"
SC004401,University Of Glasgow Court,Statutory corporation (Royal Charter etc),UK and overseas,"Directorate, Finance Office, Gilbert Scott Building, University Avenue, Glasgow",G12 8QQ,,true,1992-01-01,,www.gla.ac.uk,,,"The advancement of education. The advancement of health. The advancement of, arts, heritage, culture, or science.",Office of Scottish Charity Regulator,2021-01-25 18:48:11,690648000,758804000,2019-07-31,false
SC015263,University Of Strathclyde,Statutory corporation (Royal Charter etc),UK and overseas,"Finance, McCance Building, 16 Richmond Street, Glasgow",G1 1XQ,,true,1965-04-02,,www.strath.ac.uk,,,"The advancement of education. The advancement of arts, heritage, culture, or science.",Office of Scottish Charity Regulator,2021-01-25 18:48:20,354611000,404702000
@drkane
drkane / example_additional_data.json
Created February 16, 2021 16:36
Examples of the additional_data field from the 360Giving datastore
{
"id_and_name": "[\"Exeter City Council\", \"GB-LAE-EXE\"]",
"locationLookup": [
{
"ewcd": "K04000001",
"ewnm": "England and Wales",
"gbcd": "K03000001",
"gbnm": "Great Britain",
"ukcd": "K02000001",
"uknm": "United Kingdom",
ladcd ladnm
E09000002 Barking and Dagenham
E09000003 Barnet
E09000004 Bexley
E09000005 Brent
E09000006 Bromley
E09000008 Croydon
E09000009 Ealing
E09000010 Enfield
E09000013 Hammersmith and Fulham
@drkane
drkane / banding_bins.py
Created November 23, 2020 12:24
Bins used for age, income and award amount for charity/grants data
# Bins used for numeric fields
AMOUNT_BINS = [0, 500, 1000, 2000, 5000, 10000, 100000, 1000000, float("inf")]
AMOUNT_BIN_LABELS = ["Under £500", "£500 - £1k", "£1k - £2k", "£2k - £5k", "£5k - £10k",
"£10k - £100k", "£100k - £1m", "Over £1m"]
INCOME_BINS = [-1, 10000, 100000, 250000, 500000, 1000000, 10000000, float("inf")]
INCOME_BIN_LABELS = ["Under £10k", "£10k - £100k", "£100k - £250k",
"£250k - £500k", "£500k - £1m", "£1m - £10m", "Over £10m"]
AGE_BINS = pd.to_timedelta(
[x * 365 for x in [-1, 1, 2, 5, 10, 25, 200]], unit="D")
AGE_BIN_LABELS = ["Under 1 year", "1-2 years", "2-5 years",
@drkane
drkane / la_ecodes.csv
Created July 7, 2015 12:18
CSV list of official UK ONS geography administrative codes mapped to DCLG "ecodes" for England
LAcode geog_LA Ecode
E06000001 Hartlepool E0701
E06000002 Middlesbrough E0702
E06000003 Redcar and Cleveland E0703
E06000004 Stockton-on-Tees E0704
E06000005 Darlington E1301
E06000006 Halton E0601
E06000007 Warrington E0602
E06000008 Blackburn with Darwen E2301
E06000009 Blackpool E2302
@drkane
drkane / part-b-hierarchy.md
Last active May 17, 2020 19:21
Charity Commission Part B fields

extract_partb

Annual Return Detail table. Row for each AR for each registered main charity. Contains a row for each year for each charity, with more detailed financial information. Charities only have to fill in this information if their income is greater than £500,000 in that year.

Field name Data type Description Note
@drkane
drkane / lsoa_full.csv
Created May 6, 2020 13:29
LSOAs matched to MSOA, TTWA, Local Authorities (2011), Local Authorities (2020), upper tier local authorities (2020) and countries
We can't make this file beautiful and searchable because it's too large.
LSOA11CD,LSOA11NM,MSOA11CD,MSOA11NM,TTWA11CD,TTWA11NM,LAD11CD,LAD11NM,LAD11NMW,LAD20CD,LAD20NM,LAD20NMW,CTY20CD,CTY20NM,CTRYCD,CTRYNM
95DD04W1,Ballee,95DD04W1,Ballee,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland
95DD05W1,Ballykeel,95DD05W1,Ballykeel,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland
95DD06W1,Ballyloughan,95DD06W1,Ballyloughan,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland
95DD07S1,Broughshane 1,95DD07S1,Broughshane 1,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland
95DD07S2,Broughshane 2,95DD07S2,Broughshane 2,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northe