@ychen404
ychen404 / ds-project-organization.md
Created May 14, 2024 18:51 — forked from ericmjl/ds-project-organization.md
How to organize your Python data science project

UPDATE: I have baked the ideas in this file into a Python CLI tool called pyds-cli. Please find it here: https://github.com/ericmjl/pyds-cli

How to organize your Python data science project

Having done a number of data projects over the years, and having seen a number of them up on GitHub, I've come to see that projects vary widely in how "readable" they are. I'd like to share some practices that I have adopted in my own projects, which I hope will bring some organization to yours.

Disclaimer: I'm hoping nobody takes this to be "the definitive guide" to organizing a data project; rather, I hope you, the reader, find useful tips that you can adapt to your own projects.

Disclaimer 2: What I’m writing below is primarily geared towards Python users. Some ideas may transfer to other languages; others may not. Please feel free to remix whatever you see here!

ychen404 / Select_CIFAR10_Classes.py
Created May 12, 2021 08:19 — forked from Miladiouss/Select_CIFAR10_Classes.py
Create PyTorch datasets and dataset loaders for a subset of CIFAR10 classes.
import torchvision
import torchvision.transforms as transforms
from torchvision.datasets import CIFAR10
from torch.utils.data import Dataset, DataLoader
import numpy as np
# Transformations
RC = transforms.RandomCrop(32, padding=4)
RHF = transforms.RandomHorizontalFlip()
RVF = transforms.RandomVerticalFlip()
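The preview above stops before the class-selection step. The core idea in the description (keeping only samples from a subset of classes) can be sketched in plain Python, without PyTorch. This is a minimal illustration, not the gist's own implementation: `select_class_indices` and `remap_labels` are hypothetical helper names, and the toy label list stands in for a dataset's integer targets.

```python
def select_class_indices(labels, keep_classes):
    """Return positions of samples whose label is in keep_classes."""
    keep = set(keep_classes)
    return [i for i, label in enumerate(labels) if label in keep]


def remap_labels(labels, keep_classes):
    """Map each kept label to its position in keep_classes (0..k-1)."""
    mapping = {c: new for new, c in enumerate(keep_classes)}
    return [mapping[label] for label in labels if label in mapping]


# Toy labels standing in for a dataset's integer targets (assumption).
toy_labels = [3, 5, 0, 3, 9, 5, 1]
indices = select_class_indices(toy_labels, keep_classes=[3, 5])
new_labels = remap_labels(toy_labels, keep_classes=[3, 5])
```

With a real dataset, a list of indices like this could be passed to `torch.utils.data.Subset` to build the reduced dataset. Remapping labels to a contiguous range is a design choice that keeps the output layer of a classifier as small as the number of kept classes.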
ychen404 / min-char-rnn.py
Created November 9, 2018 04:02 — forked from karpathy/min-char-rnn.py
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
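A character-level model typically needs each character mapped to an integer index before it can be fed in as a one-hot vector. The following is a minimal sketch of that step, assuming a short in-memory string rather than `input.txt`. `build_vocab` and `one_hot` are hypothetical names, and sorting the character set is a choice made here so the mapping is reproducible across runs.

```python
def build_vocab(text):
    """Build a sorted character vocabulary and lookup tables in both directions."""
    chars = sorted(set(text))  # sorted so indices are stable between runs
    char_to_ix = {ch: i for i, ch in enumerate(chars)}
    ix_to_char = {i: ch for i, ch in enumerate(chars)}
    return chars, char_to_ix, ix_to_char


def one_hot(index, size):
    """Return a list of length `size` with a 1 at `index` and 0 elsewhere."""
    vec = [0] * size
    vec[index] = 1
    return vec


# Toy text standing in for the contents of the input file (assumption).
text = "hello"
chars, char_to_ix, ix_to_char = build_vocab(text)
encoded = [char_to_ix[ch] for ch in text]
first_input = one_hot(encoded[0], len(chars))
```

Keeping both `char_to_ix` and `ix_to_char` lets the same vocabulary encode training text and decode sampled indices back into characters.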
ychen404 / Output
Created October 30, 2016 03:32 — forked from Circuitsoft/Output
Capture single image V4L2
Driver Caps:
Driver: "omap3"
Card: "omap3/mt9v032//"
Bus: ""
Version: 0.0
Capabilities: 04000001
Camera Cropping:
Bounds: 752x480+0+0
Default: 752x480+0+0
Aspect: 1/1