Joe Germuska JoeGermuska

@JoeGermuska
JoeGermuska / README.md
Last active August 18, 2022 21:00
Chicago Community Areas with region and key neighborhoods, and 2020 population by race

I had reason to want a list of Chicago Community Areas annotated to assign each to a region of the city. I couldn't find it in structured form, so I built it.

I used this research guide from Harold Washington College Libraries, which assigned each community area to a region and listed key neighborhoods. I simplified "Central, Near North, and Near South Side" to just "Central". I also simplified "West and Near West Side" to just "West."
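
The two region simplifications described above amount to a small lookup. A minimal sketch in Python, assuming the region labels arrive as plain strings; the names `REGION_SIMPLIFICATIONS` and `simplify_region` are hypothetical and are not taken from the gist:

```python
# Hypothetical helper: collapse the two compound region labels named above
# and leave every other label unchanged.
REGION_SIMPLIFICATIONS = {
    "Central, Near North, and Near South Side": "Central",
    "West and Near West Side": "West",
}


def simplify_region(name):
    """Return the simplified region label, or the input if no rule applies."""
    return REGION_SIMPLIFICATIONS.get(name, name)


print(simplify_region("West and Near West Side"))
```

Any label without an entry in the mapping passes through untouched, so the lookup can be applied to every row without special cases.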

Since part of my project included aligning population numbers, the CSV in this gist also includes a simplified version of the 2020 Decennial Census redistricting table P2, "Hispanic or Latino, and not Hispanic or Latino by Race," as downloaded from Census Reporter. The simplifications were to rename columns, and to omit all of the detailed columns for "two or more races." I also dropped the 2010 and percentage change columns, and computed the per

JoeGermuska / encoding_fixup.sql
Last active September 8, 2022 19:41
SQL to fix importing Latin-1 text as if it were UTF-8
-- Sometimes one accidentally loads data that is in ISO8859-1 (aka "Latin-1") encoding, having assumed that it was actually UTF-8.
-- So far it seems like 'Ã' is a good flag, although if your data might also legitimately contain that character, this is less simple...
-- Each replace() swaps a two-character garbled sequence for the single character it stands for.
update tiger2020.census_name_lookup
set simple_name = replace(simple_name, 'Ã±', 'ñ'),
    display_name = replace(display_name, 'Ã±', 'ñ'),
    prefix_match_name = replace(prefix_match_name, 'Ã±', 'ñ');
update tiger2020.census_name_lookup
set simple_name = replace(simple_name, 'Ã¼', 'ü'),
    display_name = replace(display_name, 'Ã¼', 'ü'),
    prefix_match_name = replace(prefix_match_name, 'Ã¼', 'ü');
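
For intuition about where garbled sequences such as 'Ã±' come from: one common cause is that the two UTF-8 bytes of a character like ñ get decoded one byte at a time as Latin-1. The Python sketch below reproduces that and reverses it. The function names `garble` and `repair` are hypothetical, and this is an illustration of the mechanism, not a substitute for the SQL above:

```python
def garble(text):
    """Simulate the mix-up: encode as UTF-8, then decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text):
    """Reverse the mix-up. Raises UnicodeError if the text was not garbled this way."""
    return text.encode("latin-1").decode("utf-8")


for ch in ("ñ", "ü"):
    print(repr(ch), "->", repr(garble(ch)))
```

Because `repair` fails on text that is already correct and contains accented characters, it is best tried on a copy of the data first.
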
JoeGermuska / 00_README.md
Last active December 14, 2023 19:55
Read Census 2020 PL94-171 ("redistricting") files into Pandas DataFrames

Some quick work to facilitate reading data for the Census 2020 PL94-171 data release into Pandas dataframes.

Sample data for Providence County, RI can be downloaded from https://www.census.gov/programs-surveys/decennial-census/about/rdo/summary-files.html, as can auxiliary materials.

The file headers.py was created by parsing the SAS import scripts from the link above.

It seems as though the Census Bureau removed the sample data for Providence County, RI, against which this code was tested. You can get a copy of it from http://files.censusreporter.org/ri2018_2020Style.pl.zip

Note: The full data release is now available at https://www2.census.gov/programs-surveys/decennial/2020/data/01-Redistricting_File--PL_94-171/
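
As a rough sketch of the reading step, assuming the PL 94-171 files are pipe-delimited text with no header row (which is why column names have to be supplied separately, for example from headers.py). The column list, the function name `read_pl_segment`, and the sample rows below are made up for illustration and are not the real header list or real data:

```python
import io

import pandas as pd

# Hypothetical, shortened column list; the gist derives the real names in headers.py.
EXAMPLE_COLUMNS = ["FILEID", "STUSAB", "LOGRECNO", "P0010001"]


def read_pl_segment(source, columns):
    """Read a pipe-delimited, headerless file into a DataFrame of strings."""
    # dtype=str keeps leading zeros in identifier columns such as LOGRECNO.
    return pd.read_csv(source, sep="|", header=None, names=columns, dtype=str)


# Made-up rows standing in for a real segment file.
sample = io.StringIO("PLST|RI|0000001|100\nPLST|RI|0000002|250\n")
df = read_pl_segment(sample, EXAMPLE_COLUMNS)

# Convert count columns to integers only after the identifiers are safely strings.
df["P0010001"] = df["P0010001"].astype(int)
print(df)
```

Reading everything as strings first and converting the count columns afterwards avoids silently turning zero-padded identifiers into numbers.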

JoeGermuska / README.md
Created September 10, 2020 22:22
Get last commit dates for remote git branches

I had a repo with dozens of abandoned or merged remote branches. Before deleting them wholesale, I wanted to know just how old they were.

I adapted this Stack Overflow answer, but dressed it up in a for loop since I wanted all of them. Also, since I wanted to sort them, I switched the date format to %cs (short ISO), which didn't work in git 2.18.0 but does work in 2.28.0.
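
The same idea can be sketched in Python rather than a shell loop. This is not the script from the gist: the helper names `git`, `remote_branches`, and `last_commit_dates` are hypothetical, and the sketch assumes it is run inside a repository with a `git` new enough to support `%cs`:

```python
import subprocess


def git(*args):
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def remote_branches(run=git):
    """List remote-tracking branches, skipping symbolic lines like 'origin/HEAD -> ...'."""
    lines = run("branch", "-r").splitlines()
    return [line.strip() for line in lines if "->" not in line]


def last_commit_dates(branches, run=git):
    """Return (date, branch) pairs, oldest first.

    %cs prints the committer date as YYYY-MM-DD, so a plain sort is chronological.
    """
    return sorted(
        (run("log", "-1", "--format=%cs", branch).strip(), branch)
        for branch in branches
    )


if __name__ == "__main__":
    for date, branch in last_commit_dates(remote_branches()):
        print(date, branch)
```

The `run` parameter exists only so the functions can be exercised without a real repository; in normal use the default is enough.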

JoeGermuska / fipsToState.json
Last active August 13, 2020 20:34 — forked from wavded/fipsToState.json
State FIPS JSON
{
"01": "Alabama",
"02": "Alaska",
"04": "Arizona",
"05": "Arkansas",
"06": "California",
"08": "Colorado",
"09": "Connecticut",
"10": "Delaware",
"11": "District of Columbia",
JoeGermuska / 01_readme.md
Last active June 17, 2020 03:22
A cross-reference of ZCTAs by state, and how it was made

A question came up on the US Census slack, leading to the recognition that the US Census Bureau API doesn't support queries for data for "all ZCTAs in a state". Nothing about the Census Bureau's definition of ZCTA requires that they be contained within a single state, which is probably why the API rejects the query with a message, error: unknown/unsupported geography heirarchy.

I've been looking for a general method to answer these kinds of questions for a long time. This Gist demonstrates a workable approach. It's based on data published by the Census LEHD LODES program, which provides, for every Census block in the US, a crosswalk indicating which geographies that block is in. (The set of geographies is limited but still very useful. See the technical doc PDF for more details.)

For any two geography types, one can simply select those two columns from the crosswalk and eliminate dupli
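
The "pick two columns, drop repeats" step described above can be sketched with the Python standard library. The column names `st` and `zcta`, the function name `unique_pairs`, and every value in the sample are assumptions for illustration; the real crosswalk's column names and its marker for "no ZCTA" should be checked against the technical doc:

```python
import csv
import io


def unique_pairs(rows, col_a, col_b):
    """Return the sorted, de-duplicated (col_a, col_b) pairs found in rows.

    Rows with an empty value in either column are skipped.
    """
    return sorted(
        {(row[col_a], row[col_b]) for row in rows if row[col_a] and row[col_b]}
    )


# Made-up block-level rows; one row per block, as in a block crosswalk.
sample = io.StringIO(
    "block,st,zcta\n"
    "BLOCK_A,01,11111\n"
    "BLOCK_B,01,11111\n"
    "BLOCK_C,01,22222\n"
    "BLOCK_D,02,22222\n"
    "BLOCK_E,02,\n"
)
rows = list(csv.DictReader(sample))
pairs = unique_pairs(rows, "st", "zcta")
print(pairs)
```

In the made-up sample, ZCTA 22222 appears under two states, which is exactly the situation that keeps a ZCTA from nesting cleanly inside one state.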

View census-profile.js
const IdyllComponent = require('idyll-component');
const http = require('https');
const jsonPromise = ((url) => {
// via https://www.tomas-dvorak.cz/posts/nodejs-request-without-dependencies/
// return new pending promise
return new Promise((resolve, reject) => {
const request = http.get(url, (response) => {
JoeGermuska / timeline-unminified.css
Created May 20, 2016 15:38
Unminified Timeline CSS for easier reading
/*
TimelineJS - ver. 3.3.14 - 2016-03-22
Copyright (c) 2012-2015 Northwestern University
a project of the Northwestern University Knight Lab, originally created by Zach Wise
https://github.com/NUKnightLab/TimelineJS3
This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0.
If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
*/
/*!
Timeline JS 3
JoeGermuska / info.md
Last active March 12, 2016 22:33
Fixing Keybase/GPG 'Legacy key' errors

I had used keybase successfully on my Mac. After some time, I went to track someone and got this error:

error: `gpg` exited with code 2
warn: gpg: error reading key: Legacy key

I had no luck googling for answers, but @mtigas suggested that I try

gpg2 --list-secret-keys

That revealed a very old dsa1024 key I had installed (expired in 2005). I tried

JoeGermuska / b01003_140_10_moe.csv
Created November 18, 2015 04:01
ACS 2013 5-year table B01003 for all tracts in the US
geoid,b01003001,b01003001_moe
14000US02013000100,3141,0
14000US02016000100,1176,192
14000US02016000200,4362,192
14000US02020000101,6215,631
14000US02020000102,5169,760
14000US02020000201,4127,413
14000US02020000202,6136,547
14000US02020000203,11099,628
14000US02020000204,3956,353