Joe Germuska JoeGermuska

JoeGermuska /
Last active August 18, 2022 21:00
Chicago Community Areas with region and key neighborhoods, and 2020 population by race

I had reason to want a list of Chicago Community Areas annotated to assign each to a region of the city. I couldn't find it in structured form, so I built it.

I used this research guide from Harold Washington College Libraries, which assigned each community area to a region and listed key neighborhoods. I simplified "Central, Near North, and Near South Side" to just "Central". I also simplified "West and Near West Side" to just "West."

Since part of my project included aligning population numbers, the CSV in this gist also includes a simplified version of the 2020 Decennial Census redistricting table P2, "Hispanic or Latino, and not Hispanic or Latino by Race," as downloaded from Census Reporter. The simplifications were to rename columns, and to omit all of the detailed columns for "two or more races." I also dropped the 2010 and percentage change columns, and computed the per

JoeGermuska / encoding_fixup.sql
Last active September 8, 2022 19:41
SQL to fix importing Latin-1 text as if it were UTF-8
-- Sometimes one accidentally loads data that is in ISO8859-1 (aka "Latin-1") encoding, having assumed that it was actually UTF-8.
-- So far it seems like Ã is a good flag, although if your data might also contain that character legitimately, this is less simple...
update tiger2020.census_name_lookup
   set simple_name = replace(simple_name, 'Ã±', 'ñ'),
       display_name = replace(display_name, 'Ã±', 'ñ'),
       prefix_match_name = replace(prefix_match_name, 'Ã±', 'ñ');
update tiger2020.census_name_lookup
   set simple_name = replace(simple_name, 'Ã¼', 'ü'),
       display_name = replace(display_name, 'Ã¼', 'ü'),
       prefix_match_name = replace(prefix_match_name, 'Ã¼', 'ü');
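The SQL above patches one character pair at a time. If the affected values can be pulled out of the database, the same repair can be sketched generically in Python by reversing the bad decode. This is a minimal sketch, not the approach used in the gist: the function name `fix_mojibake` is hypothetical, and it assumes the damage is strictly UTF-8 bytes that were read as Latin-1 (Windows-1252 variants are not handled).

```python
def fix_mojibake(text: str) -> str:
    """Reverse UTF-8 bytes that were mistakenly decoded as Latin-1.

    Hypothetical helper: assumes pure Latin-1 mojibake. Strings that do
    not fit that pattern are returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1, or the bytes
        # are not valid UTF-8, so this was not the expected mojibake.
        return text


if __name__ == "__main__":
    # "\u00c3\u00b1" is the two-character sequence that a mis-decoded
    # n-with-tilde turns into.
    print(fix_mojibake("Pe\u00c3\u00b1asco"))
```

Because a correctly encoded "ñ" or "ü" is not valid UTF-8 once re-encoded as Latin-1, already-clean values fall through the `except` branch and are left alone, which sidesteps the "what if the flag character is legitimate" concern for most names.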
JoeGermuska /
Last active December 14, 2023 19:55
Read Census 2020 PL94-171 ("redistricting") files into Pandas DataFrames

Some quick work to facilitate reading data for the Census 2020 PL94-171 data release into Pandas dataframes.

Sample data for Providence County, RI can be downloaded from, as can auxiliary materials.

The file was created by parsing the SAS import scripts from the link above.

It seems as though the Census Bureau removed the sample data for Providence County, RI, against which this code was tested. You can get a copy of it from

Note: The full data release is now available at

JoeGermuska /
Created September 10, 2020 22:22
Get last commit dates for remote git branches

I had a repo with dozens of abandoned or merged remote branches. Before deleting them wholesale, I wanted to know just how old they were.

I adapted this Stack Overflow answer, but dressed it up in a for loop since I wanted all of them. Also, since I wanted to sort them, I switched the date format to %cs (short ISO), which didn't work in git 2.18.0 but does work in 2.28.0.
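For comparison, here is a minimal Python sketch of the same idea. It does not reproduce the for loop from the gist; it swaps in `git for-each-ref` with the `%(committerdate:short)` field, which yields the same sortable YYYY-MM-DD dates. The function names `parse_branch_dates` and `remote_branch_dates` are hypothetical.

```python
from __future__ import annotations

import subprocess


def parse_branch_dates(output: str) -> list[tuple[str, str]]:
    """Parse 'YYYY-MM-DD<TAB>branch' lines and return them oldest first."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        date, _, branch = line.partition("\t")
        rows.append((date, branch))
    # ISO dates sort correctly as plain strings.
    return sorted(rows)


def remote_branch_dates(repo_dir: str = ".") -> list[tuple[str, str]]:
    """List (last commit date, branch) for every remote-tracking ref."""
    result = subprocess.run(
        [
            "git",
            "for-each-ref",
            "--format=%(committerdate:short)\t%(refname:short)",
            "refs/remotes",
        ],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_branch_dates(result.stdout)
```

Calling `remote_branch_dates()` inside a clone returns the branches sorted from stalest to freshest, which makes it easy to eyeball candidates for deletion.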

JoeGermuska / fipsToState.json
Last active August 13, 2020 20:34 — forked from wavded/fipsToState.json
"01": "Alabama",
"02": "Alaska",
"04": "Arizona",
"05": "Arkansas",
"06": "California",
"08": "Colorado",
"09": "Connecticut",
"10": "Delaware",
"11": "District of Columbia",
JoeGermuska /
Last active June 17, 2020 03:22
A cross-reference of ZCTAs by state, and how it was made

A question came up on the US Census slack, leading to the recognition that the US Census Bureau API doesn't support queries for data for "all ZCTAs in a state". Nothing about the Census Bureau's definition of ZCTA requires that they be contained within a single state, which is probably why the API rejects the query with a message, error: unknown/unsupported geography heirarchy.

I've been looking for a general method to answer these kinds of questions for a long time. This Gist demonstrates a workable approach. It's based on data published by the Census LEHD LODES program, which provides, for every Census block in the US, a crosswalk indicating which geographies that block is in. (The set of geographies is limited but still very useful. See the technical doc PDF for more details.)

For any two geography types, one can simply select those two columns from the crosswalk and eliminate dupli
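The "select two columns and drop duplicates" step can be sketched with the Python standard library. This is an illustration under assumptions, not the code from the gist: the function name `unique_pairs` is hypothetical, the default column names `zcta` and `stusps` are assumed to match the crosswalk header, and the sample rows are made up.

```python
import csv
import io


def unique_pairs(csv_file, col_a="zcta", col_b="stusps"):
    """Return the sorted, de-duplicated (col_a, col_b) pairs in a crosswalk.

    csv_file is any open text file with a header row. The column names
    are assumptions; check them against the crosswalk's technical doc.
    """
    reader = csv.DictReader(csv_file)
    # A set collapses the many block-level rows that share the same pair.
    pairs = {(row[col_a], row[col_b]) for row in reader}
    return sorted(pairs)


if __name__ == "__main__":
    # Made-up rows for illustration: one row per block, and a ZCTA that
    # appears under two states.
    sample = io.StringIO(
        "tabblk2010,stusps,zcta\n"
        "000000000000001,IL,11111\n"
        "000000000000002,IL,11111\n"
        "000000000000003,WI,11111\n"
        "000000000000004,WI,22222\n"
    )
    for zcta, state in unique_pairs(sample):
        print(zcta, state)
```

With pandas, the equivalent would be selecting the two columns from the DataFrame and calling `drop_duplicates()`. A ZCTA that crosses a state line simply shows up once per state in the result.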

const IdyllComponent = require('idyll-component');
const http = require('https');
const jsonPromise = ((url) => {
// via
// return new pending promise
return new Promise((resolve, reject) => {
const request = http.get(url, (response) => {
JoeGermuska / timeline-unminified.css
Created May 20, 2016 15:38
Unminified Timeline CSS for easier reading
TimelineJS - ver. 3.3.14 - 2016-03-22
Copyright (c) 2012-2015 Northwestern University
a project of the Northwestern University Knight Lab, originally created by Zach Wise
This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0.
If a copy of the MPL was not distributed with this file, You can obtain one at
Timeline JS 3
JoeGermuska /
Last active March 12, 2016 22:33
Fixing Keybase/GPG 'Legacy key' errors

I had used keybase successfully on my Mac. After some time, I went to track someone and got this error:

error: `gpg` exited with code 2
warn: gpg: error reading key: Legacy key

I had no luck googling for answers, but @mtigas suggested that I try

gpg2 --list-secret-keys

That revealed a very old dsa1024 key I had installed (expired in 2005). I tried

JoeGermuska / b01003_140_10_moe.csv
Created November 18, 2015 04:01
ACS 2013 5-year table B01003 for all tracts in US