@cjlovering
cjlovering / vscode-auto-format.md
Created December 8, 2020 17:44
Auto-format in Visual Studio Code on save.

This gives you consistently formatted code with no extra effort: no fixing it by hand, no commit hooks, and no running a formatter from the command line. I use the black code formatter, but any other formatter works the same way.

  1. Install black in whichever Python interpreter you use for Visual Studio Code: pip install black
  2. Navigate: Preferences --> Settings.
  3. Find and set: Format provider --> black.
  4. Open settings.json by finding the "edit settings.json" link in the settings pane. Scroll around and you'll find it.
  5. Add the line: "editor.formatOnSave": true
cjlovering / pretty-pandas-notebooks.md
Last active December 8, 2020 17:42
Pretty-print pandas in notebooks!

In a Jupyter notebook, calling display renders a pandas DataFrame as a formatted HTML table, just as if it were the last expression in a cell.

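# display is available by default in notebooks; the import just makes it explicit.
from IPython.display import display

def func(args):
  ...
  data  # a pd.DataFrame computed above
  display(data)  # renders the table as HTML, even in the middle of a function
  ...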
cjlovering / index-3d.md
Created December 8, 2020 17:40
Index into a 3D tensor (pytorch).

This lets you index into a 3D tensor and select one vector per batch element. For example, you might want the final output of each sequence when sequences of different lengths have been packed together into one batch.

Set the indices to be the lengths of each sequence in the batch.

(Note: normally you can use pack/unpack in PyTorch, but that requires using PyTorch's own RNN implementations, and it does not yet work with transformers.)

batch_size, seq_len, embed_dim = output.size()
selected = output[
	torch.arange(batch_size),
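The snippet above is cut off in this listing, so here is a plain-Python sketch of what this style of indexing selects. Nested lists stand in for the tensor, `lengths` is a hypothetical list of true sequence lengths, and the sketch assumes the last valid position of sequence `b` is `lengths[b] - 1`; none of this is taken from the original code.

```python
# Nested lists stand in for a tensor of shape (batch_size, seq_len, embed_dim).
output = [
    [[1.0, 1.1], [2.0, 2.1], [0.0, 0.0]],  # sequence of length 2, then padding
    [[3.0, 3.1], [4.0, 4.1], [5.0, 5.1]],  # sequence of length 3
]
lengths = [2, 3]  # hypothetical: true length of each sequence in the batch

batch_size = len(output)

# Pair each batch index b with one position: the last non-padded step.
selected = [output[b][lengths[b] - 1] for b in range(batch_size)]

print(selected)  # [[2.0, 2.1], [5.0, 5.1]]
```

With a real PyTorch tensor and an integer tensor `lengths`, the same selection would presumably be written `output[torch.arange(batch_size), lengths - 1]`, giving a result of shape (batch_size, embed_dim). Whether the original subtracts one is not visible in the truncated snippet.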
cjlovering / defaultdict.py
Created December 8, 2020 17:37
Build a defaultdict directly.
from collections import defaultdict
x = {"1": 1, "2": 2}
# Seed the defaultdict with the existing dict; missing keys default to int() == 0.
defaultdict(int, x)
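A minimal usage sketch of the same idea: the first argument is the default factory and the second seeds the contents, so existing keys keep their values while missing keys start from 0. The name `counts` is only for illustration.

```python
from collections import defaultdict

# Existing plain dict to seed from.
x = {"1": 1, "2": 2}

# First argument: default factory. Second argument: initial contents.
counts = defaultdict(int, x)

counts["1"] += 1  # existing key: 1 -> 2
counts["3"] += 1  # missing key: int() gives 0, then incremented to 1

print(dict(counts))  # {'1': 2, '2': 2, '3': 1}
```

The seed dict is copied, so `x` itself is left unchanged.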
cjlovering / seaborn-style.md
Last active December 8, 2020 18:40
Fix "too many values" for seaborn.

If you use seaborn and have too many values for the style (more than 6), add dashes=False and set the markers:

filled_markers = ('o', 'v', '^', '<', '>', '8', 's', 'p', '*', 'h', 'H', 'D', 'd', 'P', 'X')
sns.lineplot(data=df, markers=filled_markers, dashes=False)

See mwaskom/seaborn#1513 (comment).

cjlovering / haiku-param-merge.py
Created December 8, 2020 17:31
Haiku Merge Model Parameters
import haiku as hk
def merge_pretrained_params(new_params: hk.Params, pre_params: hk.Params) -> hk.Params:
    """Merges pre-trained `pre_params` parameters into the new parameters `new_params`.

    This assumes the names in `pre_params` and `new_params` match, either because
    (a) they were selected intentionally or (b) the reused modules are called
    before the new modules, so that they end up with the same names.
    """
    # Filter out the parameters from the pre-trained model that aren't used
cjlovering / too-many-files.md
Last active December 8, 2020 17:33
Too many files

If a shell command (cp, mv, rename) fails because there are too many files (typically with an "Argument list too long" error), run it through find instead:

find DIR_PATH -name "*.ext" -exec COMMAND {} \;

For example: find /home/ftpuser/public_html/ftparea/ -name "*.jpg" -exec cp -uf "{}" /your/destination \;

See https://serverfault.com/a/56071.
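If you would rather stay in Python, the same one-file-at-a-time idea can be sketched with pathlib and shutil. This is a swapped-in alternative, not the find command itself: `copy_matching` is a hypothetical helper, it copies unconditionally (it does not reproduce the -u "only if newer" behaviour of cp), and it assumes the destination lies outside the source tree.

```python
import shutil
import tempfile
from pathlib import Path


def copy_matching(src_dir, pattern, dest_dir):
    """Copy every file under src_dir that matches pattern into dest_dir, one at a time."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    # rglob walks the tree lazily, so no huge argument list is ever built.
    for path in Path(src_dir).rglob(pattern):
        if path.is_file():
            shutil.copy2(path, dest / path.name)
            copied += 1
    return copied


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    (src / "nested").mkdir(parents=True)
    for name in ("a.jpg", "nested/b.jpg", "notes.txt"):
        (src / name).write_text("x")
    copied = copy_matching(src, "*.jpg", Path(tmp) / "dest")
    names = sorted(p.name for p in (Path(tmp) / "dest").iterdir())

print(copied, names)  # 2 ['a.jpg', 'b.jpg']
```

As with the find example, files that share a name in different subdirectories overwrite one another in the destination.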

cjlovering / interacting_git.md
Last active December 8, 2020 17:37
Interacting with git nicely on remote grids.

git diff

If you have trouble with git diffs, e.g. you don't see color or see a lot of ESCs, use the following:

export LESS=eFRX

See https://stackoverflow.com/a/20414664/5419413.

git commit

If you have trouble committing files, e.g. git fails to let you enter a full message, use the following to set the editor to vim:

cjlovering / iterations.md
Created April 24, 2018 22:15
Accurately get the number of batches.
for batch in range(num_data // batch_size + (num_data % batch_size > 0)):

Ever hit errors while training a model because the loop ran one batch too many or too few?

This snippet should fix it.

  1. First, count how many full batches fit into the data points, e.g. num_data // batch_size.
  2. Add 1 if there are remaining data points, e.g. (num_data % batch_size > 0), which counts as 1 when true and 0 when false.
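The two steps above can be sketched as a small helper and checked against ceiling division. The name `num_batches` and the toy data are assumptions made for illustration only.

```python
import math


def num_batches(num_data: int, batch_size: int) -> int:
    """Full batches plus one extra batch if any data points are left over."""
    return num_data // batch_size + (num_data % batch_size > 0)


data = list(range(10))
batch_size = 3

# Slice the data into consecutive batches; the last one may be smaller.
batches = [
    data[i * batch_size:(i + 1) * batch_size]
    for i in range(num_batches(len(data), batch_size))
]

print(batches)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

# The expression agrees with ceiling division.
assert num_batches(10, 3) == math.ceil(10 / 3)
```

When the data divides evenly (say 9 points in batches of 3), the remainder term is 0 and no extra, empty batch is produced.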
cjlovering / compile_latex.md
Last active April 24, 2018 22:15
Compile a latex file with a bibliography.

For some time I have blindly run latex commands until I got a PDF file with the correct references. After reading some source and SF posts, I believe I have figured out the correct way to do it.

latexmk -c main.tex; pdflatex main.tex; bibtex main; pdflatex main.tex;
  1. latexmk -c main.tex cleans the directory.
  2. pdflatex main.tex generates auxiliary files and figures out which bib entries it needs.
  3. bibtex main compiles the bibliography.
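  4. pdflatex main.tex generates the PDF with the bibliography included. If citations still show up as question marks, run pdflatex main.tex once more to resolve them.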