All your notes, scripts, config files and snippets deserve version control and tagging!
gist is a simple bash script for gist management.
It is lightweight (~700 LOC) and dependency-free, and it helps streamline your coding workflow.
import lineax as lx
import jax.numpy as jnp
import jax
from jaxtyping import Float, Array

class CubicSpline:
    x_grid: Float[Array, str("batch")]  # input x data
    y_grid: Float[Array, str("n")]  # input y data
Having done a number of data projects over the years, and having seen a number of them up on GitHub, I've come to see that there's a wide range in terms of how "readable" a project is. I'd like to share some practices that I have come to adopt in my projects, which I hope will bring some organization to your projects.
Disclaimer: I'm hoping nobody takes this to be "the definitive guide" to organizing a data project; rather, I hope you, the reader, find useful tips that you can adapt to your own projects.
Disclaimer 2: What I’m writing below is primarily geared towards Python users. Some ideas may transfer to other languages; others may not. Please feel free to remix whatever you see here!
Disclaimer 3: I found the Cookiecutter Data Science page after finishing this blog post. Many ideas overlap here, though some directories are irrelevant in my work -- which is to
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O: load the training data
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars) # total character count and number of unique characters
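The excerpt stops once the vocabulary size is known. A character-level model usually needs a mapping between characters and integer indices next, so here is a minimal sketch of that step. The helper names `build_vocab` and `one_hot`, and the inline sample string standing in for `input.txt`, are assumptions for illustration and are not taken from the original script.

```python
import numpy as np

def build_vocab(text):
    """Return the sorted unique characters plus char->index and index->char lookups."""
    chars = sorted(set(text))  # sorted so the mapping is deterministic across runs
    char_to_ix = {ch: i for i, ch in enumerate(chars)}
    ix_to_char = {i: ch for i, ch in enumerate(chars)}
    return chars, char_to_ix, ix_to_char

def one_hot(index, vocab_size):
    """Encode a single character index as a (vocab_size, 1) column vector."""
    vec = np.zeros((vocab_size, 1))
    vec[index] = 1.0
    return vec

sample = "hello world"  # hypothetical stand-in for the contents of input.txt
chars, char_to_ix, ix_to_char = build_vocab(sample)
encoded = [char_to_ix[ch] for ch in sample]          # text -> integer indices
decoded = "".join(ix_to_char[i] for i in encoded)    # integer indices -> text
```

Sorting the character set is a choice made here so that the index assigned to each character is reproducible; `list(set(data))` alone does not guarantee a stable order between runs.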