uhjish / parse_mortality_data.py
Created August 29, 2017 08:13 — forked from SohierDane/parse_mortality_data.py
CDC Mortality Dataset Preparation 2005-2015
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
For each year, parse the pdf manual, then use that information to
unpack the fixed-width data file.
Source data files can be found here:
https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm#Mortality_Multiple
Passes basic tests for 2005-2015. Untested on earlier years.
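
The second step described above, unpacking a fixed-width record once the column positions are known, could be sketched roughly as follows. This is not the gist's own code: the field names and column positions in `EXAMPLE_LAYOUT` are made up for illustration and do not match the CDC record layout, and `parse_fixed_width_line` is a hypothetical helper.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical layout: field name -> (first column, last column), 1-indexed and
# inclusive, the way record layouts are usually printed in a manual.
# These positions are invented for illustration; they are NOT the CDC layout.
EXAMPLE_LAYOUT: Dict[str, Tuple[int, int]] = {
    "resident_status": (1, 1),
    "education": (2, 3),
    "month_of_death": (4, 5),
}


def parse_fixed_width_line(line: str, layout: Dict[str, Tuple[int, int]]) -> Dict[str, str]:
    """Slice one record into named fields and strip the padding spaces."""
    # Convert 1-indexed inclusive columns to Python's 0-indexed half-open slices.
    return {name: line[start - 1:end].strip() for name, (start, end) in layout.items()}


def parse_fixed_width_lines(lines: Iterable[str], layout: Dict[str, Tuple[int, int]]) -> List[Dict[str, str]]:
    """Parse every non-blank line of a fixed-width file."""
    return [parse_fixed_width_line(raw.rstrip("\n"), layout) for raw in lines if raw.strip()]


if __name__ == "__main__":
    sample = ["11207\n", "2 911\n"]
    for record in parse_fixed_width_lines(sample, EXAMPLE_LAYOUT):
        print(record)
```

In practice the layout dictionary would be built from the positions extracted from each year's manual, since those positions can differ between years.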
--
-- GEOIP IN POSTGRESQL
--
-- We use two approaches: first PostgreSQL's built-in inet and cidr types with indexing (PostgreSQL 9.4 and later),
-- and then ip4r (https://github.com/RhodiumToad/ip4r).
-- The performance of ip4r indexes is significantly better than PostgreSQL's own index.
-- An operation that took 42s using ip4r took 47 minutes using PostgreSQL's cidr index.
--
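
For readers unfamiliar with the lookup that these indexes speed up, here is a minimal Python sketch of the underlying question: which CIDR range contains a given address? It is not the gist's SQL and says nothing about index performance. The ranges and labels in `EXAMPLE_RANGES` are invented (they use the RFC 5737 documentation blocks), and `lookup_label` is a hypothetical helper that does a plain linear scan.

```python
import ipaddress
from typing import Optional, Sequence, Tuple

# Invented sample data using address blocks reserved for documentation (RFC 5737).
# A real GeoIP table would have many more rows, which is why an index matters.
EXAMPLE_RANGES: Sequence[Tuple[str, str]] = (
    ("192.0.2.0/24", "Exampleland"),
    ("198.51.100.0/24", "Testonia"),
)


def lookup_label(ip: str, ranges: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Return the label of the first range containing ip, or None if no range matches."""
    address = ipaddress.ip_address(ip)
    for cidr, label in ranges:
        # Containment test: the same question a cidr or ip4r index answers in the database.
        if address in ipaddress.ip_network(cidr):
            return label
    return None


if __name__ == "__main__":
    print(lookup_label("192.0.2.17", EXAMPLE_RANGES))
    print(lookup_label("203.0.113.5", EXAMPLE_RANGES))
```

A linear scan like this costs time proportional to the number of ranges for every lookup, which is the work a database index is meant to avoid.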
uhjish / install-Python-AmazonLinux-20171023.log
Created June 2, 2020 16:27 — forked from mrthomaskim/install-Python-AmazonLinux-20171023.log
Amazon Linux AMI, pyenv, virtualenv, Python, ... Hello, World!
### prerequisites
sudo yum groupinstall "Development Tools"
git --version
gcc --version
bash --version
python --version # (system)
sudo yum install -y openssl-devel readline-devel zlib-devel
sudo yum update
### install `pyenv`