Eric Ma ericmjl

ericmjl / ds-project-organization.md
Created Jun 12, 2018
How to organize your Python data science project

How to organize your Python data science project

Having done a number of data projects over the years, and having seen a number of them up on GitHub, I've come to see that there's a wide range in terms of how "readable" a project is. I'd like to share some practices that I have come to adopt in my projects, which I hope will bring some organization to your projects.

Disclaimer: I'm hoping nobody takes this to be "the definitive guide" to organizing a data project; rather, I hope you, the reader, find useful tips that you can adapt to your own projects.

Disclaimer 2: What I’m writing below is primarily geared towards Python language users. Some ideas may be transferable to other languages; others may not be so. Please feel free to remix whatever you see here!

Disclaimer 3: I found the Cookiecutter Data Science page after finishing this blog post. Many ideas overlap here, though some directories are irrelevant in my work -- which is to

ericmjl / merger.py
Created Jun 5, 2015
A Python script for merging PDF files together.
"""
Author: Eric J. Ma
Purpose: To merge PDFs together in an automated fashion.
"""
import os
from PyPDF2 import PdfFileReader, PdfFileMerger
ericmjl / september-2020-newsletter.md
Last active Sep 7, 2020
Data Science Programming September 2020 Newsletter

Data Science Programming September 2020 Newsletter

Hello fellow datanistas!

Welcome to the September edition of the programming-oriented data science newsletter. I hope you've all been staying safe amid the COVID-19 outbreak.

There's no special theme this month, just a smattering of cool tools and articles that I think will improve your productivity!

Setting up VSCode for Python Development like RStudio

View issue-59-debugging.ipynb
ericmjl / install_anaconda.sh
Created Jul 9, 2020
A script to install Anaconda on a new system
# Taken from https://github.com/ericmjl/dotfiles/blob/master/install_functions.sh
function install_anaconda {
    bash anaconda.sh -b -p $HOME/anaconda
    rm anaconda.sh
    export PATH=$HOME/anaconda/bin:$PATH
    # Install basic data science stack into default environment
    conda install --yes pandas scipy numpy matplotlib seaborn jupyter ipykernel nodejs
View troubleshooting-batch-runner.ipynb
View gist:384d54262cfe7b81c86263ee28822a0c
$ docker run --gpus all -i -t jax:latest /bin/bash
(base) [docker@e697aef58065 ~]$ ls
anaconda cuda-repo-rhel8-10-2-local-10.2.89-440.33.01-1.0-1.x86_64.rpm
(base) [docker@e697aef58065 ~]$ which python
~/anaconda/bin/python
(base) [docker@e697aef58065 ~]$ conda activate mouse-hmm
(mouse-hmm) [docker@e697aef58065 ~]$ python
Python 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
ericmjl / test_d_separation.py
Last active May 30, 2020
Proposed change to d-separation tests based on pytest functions and fixtures.
import networkx as nx
import pytest

@pytest.fixture
def path_graph():
    """Return a path graph of length three."""
    G = nx.path_graph(3, create_using=nx.DiGraph)
    G.graph["name"] = "path"
    nx.freeze(G)
    return G
@pytest.fixture
ericmjl / app.py
Created May 20, 2020
A gist copy of my PyCon talk.
import streamlit as st
import pandas as pd
st.title("A Careful Walk Through Probability Distributions with Python")
st.markdown("""
_By Eric J. Ma, for PyCon 2020_
Hey there! Thanks for stopping by.
View inplace_transform-column.ipynb