Brian Austin austinbrian

@austinbrian
austinbrian / check_nulls.py
Last active December 31, 2018 20:29
Check df nulls
import pandas as pd

def check_nulls(df):
    # Summarize, per column, the non-null count, the total rows, and the share missing
    df_cols = df.columns
    col_counts = [df[col].count() for col in df_cols]  # count() skips NaN/None
    col_lens = [len(df[col]) for col in df_cols]
    cdf = pd.DataFrame(index=df_cols,
                       data={'Values': col_counts,
                             'Total': col_lens})
    cdf['% N/A'] = 1 - (cdf['Values'] / cdf['Total'])
    cdf['% N/A'] = cdf['% N/A'].map('{:.1%}'.format)  # formats as percentages
    return cdf
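A quick way to sanity-check the function is to run it on a small frame with known gaps. The sketch below repeats the function so it runs on its own; the sample DataFrame and its column names are made up for illustration.

```python
import pandas as pd

def check_nulls(df):
    # Per-column non-null count, total rows, and percent missing
    df_cols = df.columns
    col_counts = [df[col].count() for col in df_cols]
    col_lens = [len(df[col]) for col in df_cols]
    cdf = pd.DataFrame(index=df_cols,
                       data={'Values': col_counts,
                             'Total': col_lens})
    cdf['% N/A'] = 1 - (cdf['Values'] / cdf['Total'])
    cdf['% N/A'] = cdf['% N/A'].map('{:.1%}'.format)
    return cdf

# Hypothetical sample data: column 'a' has one gap, column 'b' has two
sample = pd.DataFrame({'a': [1, None, 3, 4],
                       'b': ['x', 'y', None, None]})
summary = check_nulls(sample)
print(summary)
# 'a' -> Values 3, Total 4, % N/A '25.0%'
# 'b' -> Values 2, Total 4, % N/A '50.0%'
```

Because the last step converts the percentages to strings, the `% N/A` column is meant for display; sort or filter on `Values` and `Total` instead.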
austinbrian / set_axis_ticks.py
Last active October 31, 2018 20:53
If you have a matplotlib axis named "ax", this formats the y-axis tick labels with commas or as percentages
# from https://stackoverflow.com/a/44444489/7471215
# use one of the two calls below, not both (the second overwrites the first)
# puts commas in integers
ax.set_yticklabels(['{:,}'.format(int(x)) for x in ax.get_yticks().tolist()])
# formats percentages
ax.set_yticklabels(['{:.0%}'.format(x) for x in ax.get_yticks().tolist()])
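The calls above write fixed label strings, so the labels no longer track the ticks if the axis limits change afterwards. A possible alternative, sketched here and not part of the original gist, is to attach a formatter with `matplotlib.ticker.FuncFormatter`. The helper names `comma_fmt` and `percent_fmt` and the plotted numbers are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

def comma_fmt(x, pos):
    # Hypothetical helper: tick value as an integer with thousands separators
    return '{:,}'.format(int(x))

def percent_fmt(x, pos):
    # Hypothetical helper: tick value (a fraction such as 0.25) as a whole percent
    return '{:.0%}'.format(x)

fig, (ax_counts, ax_share) = plt.subplots(1, 2)

ax_counts.plot([0, 1, 2], [1000, 250000, 1500000])
ax_counts.yaxis.set_major_formatter(FuncFormatter(comma_fmt))

ax_share.plot([0, 1, 2], [0.1, 0.45, 0.9])
ax_share.yaxis.set_major_formatter(FuncFormatter(percent_fmt))
```

The formatter is called for whatever ticks the axis currently has, so the labels stay correct after zooming or rescaling.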
austinbrian / list_flattener.py
Last active February 16, 2021 14:16
Flatten a nested list of lists
flat_list = [item for sublist in nested_list for item in sublist]
# if a list is irregularly nested, like [2,3,[2,3],[2,3]], you'll need a function
# this creates a generator that recursively yields the non-list items in order
def flatten(lis):
    for item in lis:
        if isinstance(item, list) and not isinstance(item, str):
            for x in flatten(item):
                yield x
        else:
            yield item
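As a usage sketch, here is the generator applied to the irregular list from the comment. The function is repeated so the block runs on its own, and the `else` branch is assumed to yield the item unchanged, which is what a flattening generator needs. The second, deeper example is made up.

```python
def flatten(lis):
    # Recursively walk nested lists and yield the non-list items in order
    for item in lis:
        if isinstance(item, list):
            for x in flatten(item):
                yield x
        else:
            yield item

mixed = [2, 3, [2, 3], [2, 3]]
flat = list(flatten(mixed))          # [2, 3, 2, 3, 2, 3]

# Hypothetical deeper example: strings are yielded whole, not split into characters
deep = list(flatten([1, [2, [3, ['ab']]]]))   # [1, 2, 3, 'ab']
```

The generator is lazy, so wrap it in `list(...)` when you need the flattened result more than once.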
austinbrian / vcxsrv.sh
Created September 25, 2018 16:26
On Windows, this launches VcXsrv with multi-window and clipboard support
# multiwindow
"C:\Program Files\VcXsrv\vcxsrv.exe" :0 -ac -terminate -lesspointer -multiwindow -clipboard -wgl -dpi auto
austinbrian / get_data.py
Created August 27, 2018 18:47
Get a pandas DataFrame from a public GitHub raw.githubusercontent.com URL
import pandas as pd
import requests
from io import BytesIO
# using a 538 dataset as an example
url = 'https://raw.githubusercontent.com/fivethirtyeight/data/master/bob-ross/elements-by-episode.csv'
response = requests.get(url)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
content = BytesIO(response.content)
df = pd.read_csv(content)
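The `BytesIO` step works because `pd.read_csv` accepts any file-like object, not only a path. The offline sketch below shows that step on its own; the hard-coded bytes are made up and stand in for `response.content`.

```python
from io import BytesIO

import pandas as pd

# Made-up CSV payload standing in for the bytes of an HTTP response body
raw_bytes = b"name,count\nalpha,1\nbeta,2\n"

buffer = BytesIO(raw_bytes)      # wrap the bytes in an in-memory file object
demo_df = pd.read_csv(buffer)    # parsed exactly as a file on disk would be

print(demo_df)
```

For a plain public URL, `pd.read_csv(url)` can also fetch the file directly. The `requests` route is useful when you need control over the request itself, such as headers or error handling.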
austinbrian / np_counts.py
Created August 15, 2018 14:49
Quick summary of value counts in a numpy array as a dict
import numpy

a = numpy.array([0, 3, 0, 1, 0, 1, 2, 1, 0, 0, 0, 0, 1, 3, 4])
unique, counts = numpy.unique(a, return_counts=True)
dict(zip(unique, counts))  # returns {0: 7, 1: 4, 2: 1, 3: 2, 4: 1} (keys and values are numpy integers)
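If the values are in a plain Python list, `collections.Counter` from the standard library gives the same summary without numpy. This is a different technique from the one in the gist, sketched here for comparison on the same values.

```python
from collections import Counter

values = [0, 3, 0, 1, 0, 1, 2, 1, 0, 0, 0, 0, 1, 3, 4]

counts = Counter(values)

# Counter keeps keys in first-seen order; sort them to match numpy.unique's ordering
sorted_counts = dict(sorted(counts.items()))
print(sorted_counts)  # {0: 7, 1: 4, 2: 1, 3: 2, 4: 1}
```

Here the keys and values are ordinary Python ints, which can be convenient when the dict is later serialized to JSON.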
austinbrian / pandas_settings.py
Last active April 27, 2020 16:15
A couple of useful settings for displaying pandas dataframes
# Pandas settings to include on import
import pandas as pd
import numpy as np
pd.set_option('display.max_rows',1000)
pd.set_option('display.max_columns',1000)
# Includes commas in outputs > 1,000, and formats as integers if integers
# If not integers, formats to two decimal places
pd.set_option('display.float_format',
              lambda x: "{:,.0f}".format(x) if x.is_integer()
              else "{:,.2f}".format(x))
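To see what the `display.float_format` lambda produces, the same expression can be tried on a few floats outside pandas. The name `float_fmt` and the sample values are made up for illustration.

```python
def float_fmt(x):
    # Same logic as the lambda: whole-number floats get no decimals, others get two
    return "{:,.0f}".format(x) if x.is_integer() else "{:,.2f}".format(x)

whole = float_fmt(1234567.0)      # '1,234,567'
fractional = float_fmt(1234.5678) # '1,234.57'
small = float_fmt(0.5)            # '0.50'
missing = float_fmt(float('nan')) # 'nan' (NaN is not an integer, so the second branch runs)
```

pandas applies this formatter to floating-point values only, so integer columns are displayed by pandas' default integer formatting, not by this function.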
austinbrian / 0_reuse_code.js
Created April 1, 2017 16:41
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console