
Scott Pham scottpham

@scottpham
scottpham / snippets.py
Created November 15, 2022 21:07
snippets for mining project
import pandas as pd

# helper: count distinct cases, companies, mines, and permits in a frame
def basic_summary(frame):
    return frame.pipe(lambda f: pd.Series({
        "cases": f["CASE NUMBER"].nunique(),
        "companies": f["COMPANY NAME"].nunique(),
        "mines": f["MINE_ID"].nunique(),
        "permits": f["PERMIT_NUMBER"].nunique()
    }))
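A minimal usage sketch for the helper above. The toy rows and the extra `STATE` column are hypothetical and exist only to show the call; the real mining data is not part of this snippet.

```python
import pandas as pd


def basic_summary(frame):
    # Same helper as above: distinct counts for the four key columns.
    return frame.pipe(lambda f: pd.Series({
        "cases": f["CASE NUMBER"].nunique(),
        "companies": f["COMPANY NAME"].nunique(),
        "mines": f["MINE_ID"].nunique(),
        "permits": f["PERMIT_NUMBER"].nunique(),
    }))


# Hypothetical toy data; STATE is an assumed grouping column for illustration.
toy = pd.DataFrame({
    "CASE NUMBER": ["A-1", "A-1", "A-2", "B-1"],
    "COMPANY NAME": ["Acme", "Acme", "Acme", "Bolt"],
    "MINE_ID": [10, 10, 11, 20],
    "PERMIT_NUMBER": ["P1", "P2", "P3", "P4"],
    "STATE": ["KY", "KY", "KY", "WV"],
})

# Summary over the whole frame.
overall = basic_summary(toy)

# The same helper applied to each group, collected into a dict of Series.
by_state = {state: basic_summary(grp) for state, grp in toy.groupby("STATE")}

print(overall)
print(by_state["KY"])
```

Because the helper only needs a frame with those four columns, it can be reused on any filtered or grouped subset.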
@scottpham
scottpham / instructions.md
Created August 5, 2022 21:41
How to install ImageMagick for pdfplumber

The Wand (MagickWand) library doesn't yet work with ImageMagick 7 (as of 08/05/2022). Install version 6 via Homebrew:

brew install imagemagick@6

Linking alone is not enough. Find the install directory under the path printed by brew --prefix; it will be in Cellar.

Take the top-level directory of the package (the versioned folder) and export it as MAGICK_HOME in your .zshrc:

export MAGICK_HOME=/opt/homebrew/Cellar/imagemagick@6/6.9.12-60
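After reloading the shell, a quick sanity check from Python can confirm the variable is visible. This is only a sketch: `check_magick_home` is a hypothetical helper, and it verifies nothing beyond the variable being set and pointing at an existing directory.

```python
import os
from pathlib import Path


def check_magick_home(env=None):
    """Return (ok, message) for the MAGICK_HOME setting in the given environment."""
    env = os.environ if env is None else env
    home = env.get("MAGICK_HOME")
    if not home:
        return False, "MAGICK_HOME is not set"
    if not Path(home).is_dir():
        return False, f"MAGICK_HOME points to a missing directory: {home}"
    return True, f"MAGICK_HOME looks usable: {home}"


if __name__ == "__main__":
    ok, message = check_magick_home()
    print(message)
```

If the check fails in a new terminal, the export line probably did not make it into the shell's startup file.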

@scottpham
scottpham / stats.py
Last active May 24, 2021 22:15
How to hypothesis test in Python
import scipy.stats as stats
import numpy as np
import scipy.stats.distributions as dist
import statsmodels.api as sm
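# calculate by hand (prop_sub, total_high, and total_mod are not defined in this excerpt)
va = prop_sub * (1 - prop_sub)  # variance term from the proportion
se = np.sqrt(va * (1 / total_high + 1 / total_mod))  # pooled standard error across both groups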
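The excerpt stops at the standard error. One way the by-hand calculation could continue is a pooled two-proportion z-test, sketched below with only the standard library. This assumes `prop_sub` is the pooled proportion and `total_high` / `total_mod` are the two group sizes; the function name and the counts are hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    variance = p_pool * (1 - p_pool)
    se = math.sqrt(variance * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 60 of 100 in one group versus 40 of 100 in the other.
z, p_value = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z:.4f}, p = {p_value:.6f}")
```

The same numbers can be cross-checked against a library routine such as the proportions z-test in statsmodels, which the gist already imports.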
@scottpham
scottpham / apply_groupby.py
Created April 20, 2021 22:41
How to use apply with groupby in pandas
# group rows by level_comp_region
grouped = with_region.groupby('level_comp_region')
# how many workers are in each group
res = (
    grouped
    .apply(lambda grp: pd.Series({
        "old_min": grp["old_min"].iloc[0],
        "old_max": grp["old_max"].iloc[0],
        "employees here": len(grp)
    }))
)
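A self-contained sketch of the same pattern on toy data. The `with_region` frame below is a hypothetical stand-in; the real one is not shown in the gist. Selecting the value columns before `apply` is a choice made here so the grouping column is not passed into the lambda.

```python
import pandas as pd

# Hypothetical stand-in for with_region: one row per employee.
with_region = pd.DataFrame({
    "level_comp_region": ["L1-West", "L1-West", "L2-East"],
    "old_min": [50000, 50000, 70000],
    "old_max": [60000, 60000, 90000],
})

res = (
    with_region
    .groupby("level_comp_region")[["old_min", "old_max"]]
    # Each group arrives as a small DataFrame; return one Series per group.
    .apply(lambda grp: pd.Series({
        "old_min": grp["old_min"].iloc[0],   # band minimum, taken from the first row
        "old_max": grp["old_max"].iloc[0],   # band maximum, taken from the first row
        "employees here": len(grp),          # number of rows in the group
    }))
)

print(res)
```

The result has one row per group and one column per key in the returned Series.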
@scottpham
scottpham / census.py
Last active March 8, 2021 18:37
How to download census data in Python
import censusdata
import pandas as pd
# https://towardsdatascience.com/accessing-census-data-with-python-3e2f2b56e20d
# https://jtleider.github.io/censusdata/
# search for the right table
sample = censusdata.search('acs1', 2019, 'concept', 'total population')
sample[0]
@scottpham
scottpham / sodapy.py
Last active June 17, 2020 17:54
[How to use the Socrata API and Python to download CMS data] #python
from sodapy import Socrata
import pandas as pd
# Endpoint: https://data.cms.gov/resource/s2uc-8wxp.json
set_id = "s2uc-8wxp"
client = Socrata("data.cms.gov/", None)
results = client.get(

Keybase proof

I hereby claim:

  • I am scottpham on github.
  • I am scottpham (https://keybase.io/scottpham) on keybase.
  • I have a public key ASDsdeZ6p9Iov0iBknijUhZN28xntLHOSL2ZEpy7aenyDQo

To claim this, I am signing this object:

@scottpham
scottpham / headless_scrape.py
Last active September 19, 2019 18:34
headless_scraping
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium
from time import sleep
# remember to pip install lxml
# boot up the headless chrome
def start_driver():
    # create an options object
    options = webdriver.ChromeOptions()
@scottpham
scottpham / altair.py
Last active November 16, 2017 21:16
altair configuration to make pretty charts
# Make sure to get all of the methods:
from altair import *
# set up some color vars:
rv_orange = "#f38c21"
rv_red = "#D24435"
rv_blue = "#0099FF"
rv_green = "#00B37C"
rv_orange_dark = "#AA5C09"
@scottpham
scottpham / .block
Created May 31, 2017 21:24
Standalone Line Chart
license: mit