Brian Austin austinbrian

austinbrian / extract_git_config.py
Created February 1, 2024 17:12
Turn the values of git config into a Python dict
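import subprocess
# this should work in any directory, whether a git repo or not
# split on the first '=' only, because values (e.g. URLs) may contain '='
git_dict = dict(
    x.split('=', 1)
    for x in
    subprocess.check_output(
        ['git', 'config', '--list']
    ).decode('utf-8').strip().split('\n')
    if '=' in x)  # skip empty output and keys without a value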
import contextlib
import os
import tempfile
half_lambda_memory = 10**6 * (
    int(os.getenv('AWS_LAMBDA_FUNCTION_MEMORY_SIZE', '0')) / 2)
@contextlib.contextmanager
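The preview stops at the decorator, so the author's actual context manager is not shown. As a purely illustrative sketch of how a half-of-memory budget could be used with these imports, the block below assumes a hypothetical helper named `spooled_file` that yields a `tempfile.SpooledTemporaryFile`; neither the name nor the behavior comes from the original gist.

```python
import contextlib
import os
import tempfile

# AWS_LAMBDA_FUNCTION_MEMORY_SIZE holds the function's configured memory in MB.
# Outside Lambda the variable is unset, so this falls back to 0.
half_memory_bytes = 10**6 * int(os.getenv("AWS_LAMBDA_FUNCTION_MEMORY_SIZE", "0")) // 2


@contextlib.contextmanager
def spooled_file(max_size=half_memory_bytes):
    """Hypothetical helper: yield a file kept in memory up to max_size bytes."""
    # SpooledTemporaryFile buffers in memory and rolls over to disk once more
    # than max_size bytes have been written. A max_size of 0 disables the
    # size-based rollover entirely.
    with tempfile.SpooledTemporaryFile(max_size=max_size) as f:
        yield f


# Usage: small writes stay in memory, larger ones spill to disk.
with spooled_file(max_size=10) as f:
    f.write(b"hello")
    f.seek(0)
    content = f.read()
```

Passing `max_size` explicitly, as in the usage lines, avoids relying on the environment variable when running outside Lambda.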
austinbrian / enable_docker.sh
Last active October 21, 2022 13:45
Restart Docker and enable it on EC2
# Check the status:
sudo service docker status
# If it isn't running:
sudo service docker start
# And then to auto-start after reboot:
sudo systemctl enable docker
# Don't forget to add your ec2-user to the docker group:
sudo usermod -aG docker ec2-user
# And then reboot, or just log off and log in again
sudo reboot
austinbrian / Duplicate DataFrame Rows.ipynb
Created February 16, 2021 15:14
If you have a set of rows you would like to repeat x number of times, here is a demonstration of how to use np.repeat to achieve it.
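The notebook itself is not rendered here, so the following is only a minimal sketch of the `np.repeat` approach the description mentions. The DataFrame contents and the names `df`, `x`, `positions`, and `repeated` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the original notebook's data is not shown.
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

x = 3  # number of times to repeat each row

# np.repeat repeats each positional index x times: [0, 0, 0, 1, 1, 1, 2, 2, 2]
positions = np.repeat(np.arange(len(df)), x)

# iloc selects rows by those positions; reset_index gives a clean 0..n-1 index
repeated = df.iloc[positions].reset_index(drop=True)
```

`np.repeat` also accepts one count per element, so passing a list such as `[1, 2, 3]` instead of `x` would repeat each row a different number of times.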
austinbrian / Makefile
Created January 19, 2020 17:12
Basic cookiecutter datascience makefile
.PHONY: clean data lint requirements sync_data_to_s3 sync_data_from_s3
#################################################################################
# GLOBALS #
#################################################################################
PROJECT_DIR := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
BUCKET = <s3 bucket name>
PROFILE = <aws profile name>
PROJECT_NAME = <project name>
austinbrian / get_file_over_boto3.py
Created September 12, 2019 18:22
Get a flat file into a pandas DataFrame
import pandas as pd
import boto3
s3 = boto3.resource('s3')
data_file = s3.Object(bucket_name='challenge-1-data-09122019', key='data/wiki_movie_plots_deduped.csv')
resp = data_file.get()
data = pd.read_csv(resp['Body'])
austinbrian / DataScience Example Code.ipynb
Last active October 15, 2019 19:49
Example Code for Data Science metrics and descriptive statistics
austinbrian / s3_bucket.py
Created May 31, 2019 19:56
Wrapper around botocore that allows for a bucket to be easily imported and iterated through
"""
Data access utilities
"""
from collections.abc import Mapping
import os
import boto3
import botocore.client
class Bucket(Mapping):
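The preview ends at the class line, so the original body is not available. The sketch below shows one way a `Mapping`-based bucket wrapper could be written; the class name `BucketSketch`, its constructor arguments, and the choice to return object bodies as bytes are all assumptions, not the author's implementation. It uses the boto3 S3 client calls `get_paginator("list_objects_v2")` and `get_object`.

```python
from collections.abc import Mapping


class BucketSketch(Mapping):
    """Hypothetical read-only mapping of S3 keys to object bodies (bytes)."""

    def __init__(self, name, client=None):
        self.name = name
        if client is None:
            # Imported lazily so the class can be exercised with a stub client.
            import boto3
            client = boto3.client("s3")
        self._client = client

    def _keys(self):
        # list_objects_v2 returns at most 1000 keys per call, so paginate.
        paginator = self._client.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=self.name):
            for obj in page.get("Contents", []):
                yield obj["Key"]

    def __iter__(self):
        return self._keys()

    def __len__(self):
        return sum(1 for _ in self._keys())

    def __getitem__(self, key):
        try:
            resp = self._client.get_object(Bucket=self.name, Key=key)
        except self._client.exceptions.NoSuchKey:
            # Translate the S3 error into the KeyError that Mapping expects.
            raise KeyError(key) from None
        return resp["Body"].read()
```

Because `Mapping` supplies `keys()`, `items()`, `get()`, and `in` on top of the three methods defined here, a bucket written this way can be iterated like a dict, for example `for key in BucketSketch("my-bucket"): ...` with a bucket name of your own.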
austinbrian / Five-year\ IPO.ipynb
Created May 1, 2019 21:38
An examination of a tweet about IPOs five years on
austinbrian / label_counts.cypher
Created April 17, 2019 21:24
Cypher -- Get label counts
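// one row per (node, label) pair, then count rows per label
match (n)
unwind labels(n) as s
// aggregation already groups by s, so distinct is not needed
return s, count(s)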
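To make the grouping behavior of the query concrete, here is a small Python sketch that mimics it on an in-memory list. The `nodes` data is hypothetical and stands in for graph nodes; no database or driver is involved.

```python
from collections import Counter

# Hypothetical in-memory stand-in for graph nodes; each node has a list of labels.
nodes = [
    {"labels": ["Person"]},
    {"labels": ["Person", "Employee"]},
    {"labels": ["Company"]},
]

# Equivalent of the unwind step: flatten to one row per (node, label) pair.
unwound = [label for node in nodes for label in node["labels"]]

# Equivalent of the count(s) aggregation: group by label and count the rows.
label_counts = Counter(unwound)
```

A node carrying several labels contributes once to each of them, so the counts can sum to more than the number of nodes.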