Eric Ma ericmjl

@ericmjl
ericmjl / gp-test.ipynb
Created December 13, 2018 05:35
Extending GPs to 2 dimensions
@ericmjl
ericmjl / environment.yml
Created December 12, 2018 17:13
DL introductory hands-on workshop specfile
name: dl-workshop
channels:
- defaults
- conda-forge
- ericmjl
dependencies:
- python=3.7
- jupyter
- jupyterlab
- conda
@ericmjl
ericmjl / gp-test.ipynb
Created December 11, 2018 22:59
Doing GPs in numpy!
@ericmjl
ericmjl / holoviews_datashader.py
Created August 31, 2018 18:14
Holoviews dynamic map with datashader
import datashader as ds
import holoviews as hv
from holoviews.operation.datashader import datashade

hv.extension('bokeh')


def scatter(dim1, dim2):
    def _scatter(data):
        return hv.Scatter(data, kdims=[dim1], vdims=[dim2], extents=(-10, -10, 10, 10))
    return _scatter
@ericmjl
ericmjl / variance_explained.py
Last active August 24, 2018 22:22
Difference between my own implementation of explained variance and scikit-learn's
import numpy as np
from sklearn.metrics import explained_variance_score


def var_explained(preds, actual):
    """
    Implementation taken directly from the formula on this page:
    http://scikit-learn.org/stable/modules/model_evaluation.html#explained-variance-score
    """
    return 1 - ((preds - actual).var() / actual.var())


y_pred = np.array([3, -0.5, 2, 7])
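The preview above is cut off before any comparison is made. As a minimal, dependency-free sketch of the same formula, 1 - Var(actual - preds) / Var(actual), the snippet below uses only the standard library. The data values are hypothetical and chosen purely for illustration; they are not taken from the gist.

```python
from statistics import pvariance


def var_explained(preds, actual):
    """Explained variance: 1 - Var(actual - preds) / Var(actual)."""
    residuals = [a - p for a, p in zip(actual, preds)]
    return 1 - pvariance(residuals) / pvariance(actual)


# Hypothetical data, for illustration only.
actual = [3, -0.5, 2, 7]
preds = [2.5, 0.0, 2, 8]

score = var_explained(preds, actual)
print(round(score, 3))
```

For these values the score comes out at roughly 0.957. Population variance is used here to mirror the default behaviour of NumPy's `.var()`.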
@ericmjl
ericmjl / create_dir.py
Last active August 13, 2018 13:26
Python: create directory if it doesn't exist, using pathlib!
from pathlib import Path
import os

# We will use the example of creating a `.dir` directory under home.
home = Path.home()
dirname = home / '.dir'
if not dirname.exists():
    os.mkdir(dirname)
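The same check-then-create pattern can also be written with pathlib alone, since `Path.mkdir` accepts `parents` and `exist_ok` arguments. The sketch below is one possible variant, not the gist's own code: the helper name `ensure_dir` is hypothetical, and a temporary directory stands in for the home directory so that running it leaves nothing behind.

```python
from pathlib import Path
import tempfile


def ensure_dir(path: Path) -> Path:
    """Create `path` (and any missing parents) if it does not already exist."""
    path.mkdir(parents=True, exist_ok=True)
    return path


# A temporary directory stands in for Path.home() in this illustration.
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_dir(Path(tmp) / ".dir")
    created = target.is_dir()
    # Calling it again is harmless because exist_ok=True suppresses the error.
    ensure_dir(target)
```

Passing `exist_ok=True` also avoids the small race between the `exists()` check and the `mkdir` call.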
@ericmjl
ericmjl / random_scalar_graph.py
Created August 1, 2018 21:55
Generate lots of random graphs with scalar features on each node.
import networkx as nx
import numpy as np


def generate_graph():
    num_nodes = np.random.randint(low=3, high=20)
    G = nx.erdos_renyi_graph(n=num_nodes, p=0.3)
    for n in G.nodes():
        value = np.random.randint(low=1, high=20)
        # `G.node` was removed in NetworkX 2.4; `G.nodes` is the supported accessor.
        G.nodes[n]['value'] = value
    return G
@ericmjl
ericmjl / ds-project-organization.md
Last active April 21, 2024 16:48
How to organize your Python data science project

UPDATE: I have baked the ideas in this file into a Python CLI tool called pyds-cli. Please find it here: https://github.com/ericmjl/pyds-cli

How to organize your Python data science project

Having done a number of data projects over the years, and having seen a number of them up on GitHub, I've come to see that there's a wide range in terms of how "readable" a project is. I'd like to share some practices that I have come to adopt in my projects, which I hope will bring some organization to your projects.

Disclaimer: I'm hoping nobody takes this to be "the definitive guide" to organizing a data project; rather, I hope you, the reader, find useful tips that you can adapt to your own projects.

Disclaimer 2: What I’m writing below is primarily geared towards Python language users. Some ideas may be transferable to other languages; others may not. Please feel free to remix whatever you see here!

@ericmjl
ericmjl / install_tmux.sh
Created June 1, 2018 12:33
A script to install Tmux on systems where you don't have root access
#!/bin/bash
# Script for installing tmux on systems where you don't have root access.
# tmux will be installed in $HOME/local/bin.
# It's assumed that wget and a C/C++ compiler are installed.
# exit on error
set -e
TMUX_VERSION=2.6