Mike Dusenberry dusenberrymw

@dusenberrymw
dusenberrymw / tf_synth_train_estimator.py
Created June 28, 2018 05:27
Example ImageNet-style resnet training scenario with synthetic data and using the tf.Estimator API
"""Example ImageNet-style resnet training scenario with synthetic data.
Author: Mike Dusenberry
"""
import argparse
import sys
import tensorflow as tf
@dusenberrymw
dusenberrymw / tf_synth_train.py
Last active March 22, 2018 04:49
Example ImageNet-style resnet training scenario with synthetic data
"""Example ImageNet-style resnet training scenario with synthetic data.
Author: Mike Dusenberry
"""
import argparse
import numpy as np
import tensorflow as tf
# args
@dusenberrymw
dusenberrymw / latex_tips_and_tricks.md
Last active February 10, 2018 02:04
LaTeX Tips & Tricks

LaTeX Tips & Tricks

Continuous compilation

latexmk -pdf -pvc myfile.tex

In neovim, the following command will open up a separate terminal in a small split window to compile the current file: :sp | resize 5 | term latexmk -pdf -pvc %

Best practices

  • Always include \usepackage[utf8]{inputenc} within every document.
import numpy as np
# the function
def f_of_x(X, w):
    n, d = X.shape
    X_dot_w = np.dot(X, w)
    y = np.zeros(n)
    # the inner product randomly goes through a sin
    # or a cos
    cos_flag = np.random.randn(n) < 0.0
@dusenberrymw
dusenberrymw / gist:99afd048e93629d02c8d0b48fdebebad
Last active April 5, 2018 23:01 — forked from tonyc/gist:1384523
Using strace, lsof, pyflame, and flamegraph.pl

Using strace and lsof to debug blocked processes

You can use strace on a specific pid to figure out what a specific process is doing, e.g.:

strace -fp <pid>

You might see something like:

select(9, [3 5 8], [], [], {0, 999999}) = 0 (Timeout)

@dusenberrymw
dusenberrymw / ml_dl_scenarios.md
Last active January 3, 2024 07:14
Interesting Machine Learning / Deep Learning Scenarios

Interesting Machine Learning / Deep Learning Scenarios

This gist aims to explore interesting scenarios that may be encountered while training machine learning models.

Increasing validation accuracy and loss

Let's imagine a scenario where the validation accuracy and loss both begin to increase. Intuitively, it seems like this scenario should not happen, since loss and accuracy seem like they would have an inverse relationship. Let's explore this a bit in the context of a binary classification problem in which a model parameterizes a Bernoulli distribution (i.e., it outputs the "probability" of the true class) and is trained with the associated negative log likelihood as the loss function (i.e., the "logistic loss" == "log loss" == "binary cross entropy").
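As a minimal sketch of the setup described above, the following computes accuracy (with a 0.5 decision threshold) and binary cross entropy for two hypothetical sets of validation predictions. The probabilities, the helper names `binary_cross_entropy` and `accuracy`, and the "earlier"/"later" labels are illustrative assumptions, not values from an actual training run.

```python
import numpy as np

def binary_cross_entropy(y_true, p):
    """Mean negative log likelihood of the labels under Bernoulli(p)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def accuracy(y_true, p, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    return float(np.mean((p >= threshold) == (y_true == 1)))

# Four validation examples, all belonging to the positive class.
y_true = np.array([1, 1, 1, 1])

# Hypothetical predicted probabilities at two points in training.
p_earlier = np.array([0.90, 0.90, 0.45, 0.45])  # two examples just below the threshold
p_later = np.array([0.90, 0.55, 0.55, 0.01])    # one more correct, one confidently wrong

acc_earlier, acc_later = accuracy(y_true, p_earlier), accuracy(y_true, p_later)
loss_earlier = binary_cross_entropy(y_true, p_earlier)
loss_later = binary_cross_entropy(y_true, p_later)

print(f"accuracy: {acc_earlier:.2f} -> {acc_later:.2f}")
print(f"loss:     {loss_earlier:.3f} -> {loss_later:.3f}")
```

Accuracy only counts which side of the threshold each prediction falls on, while the loss also penalizes how confident a wrong prediction is, so a single confidently wrong example can raise the mean loss even as more examples cross the threshold.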

Imagine that when the model is predicting a probability of 0.99 for a "true" class, the model is both correct (assuming a decision threshold of 0.5) and has a low loss since it can't do much better for that example. Now, imagine that the model

@dusenberrymw
dusenberrymw / 1.rsync_tips_and_tricks.md
Last active December 20, 2022 13:27
Rsync Tips & Tricks

Rsync Tips & Tricks

  • rsync -auzPhv --delete --exclude-from=rsync_exclude.txt SOURCE/ DEST/ -n
    • -a -> --archive; recursively sync, preserving symbolic links, permissions, modification times, ownership, and device/special files (shorthand for -rlptgoD)
    • -u -> --update; skip files that are newer on the receiver; sometimes this is inaccurate (due to Git, I think...)
    • -z -> --compress; compress file data during the transfer
    • -P -> --progress + --partial; show progress and keep partially transferred files so that interrupted transfers can resume
    • -h -> --human-readable; human-readable format
    • -v -> --verbose; verbose output
    • -n -> --dry-run; dry run; use this to test, and then remove to actually execute the sync
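The "dry run first, then remove -n" habit can be sketched as a small Python helper that assembles the same command. The function name `build_rsync_command` and its parameters are hypothetical, introduced only for illustration; the flags are exactly those listed above.

```python
import shlex

def build_rsync_command(source, dest, exclude_from="rsync_exclude.txt", dry_run=True):
    """Return the rsync argument list from the tip above (hypothetical helper).

    A trailing slash on `source` syncs the directory's contents rather than
    the directory itself.
    """
    cmd = [
        "rsync",
        "-auzPhv",
        "--delete",
        f"--exclude-from={exclude_from}",
        source,
        dest,
    ]
    if dry_run:
        cmd.append("-n")  # --dry-run: report what would change without changing it
    return cmd

preview_cmd = build_rsync_command("SOURCE/", "DEST/")
real_cmd = build_rsync_command("SOURCE/", "DEST/", dry_run=False)

print(shlex.join(preview_cmd))
print(shlex.join(real_cmd))
```

Either list can be passed to `subprocess.run(cmd, check=True)`; defaulting `dry_run` to `True` means the destructive `--delete` only takes effect when it is requested explicitly.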
@dusenberrymw
dusenberrymw / git_tips_and_tricks.md
Last active June 26, 2017 19:02
Git Tips & Tricks

Solid Git PR Contributor Workflow

A solid Git pull request workflow will keep you from having issues when contributing work to projects of interest. At the core, the idea is simple: keep a local master branch solely as a means of getting the latest updates from the project's official Git repo, so that you can create new branches from it to work on your desired changes. Then, always open PRs from these new branches. Once a PR is merged into the official Git repo, you can simply move back to master, pull those official changes, and then check out a brand new branch for the next item you wish to work on.

@dusenberrymw
dusenberrymw / numpy_memory.py
Created April 14, 2017 20:22
NumPy memory address experiments
# NumPy memory experiments
import numpy as np
a = np.random.rand(1, 2, 3)
b = np.asarray(a) # no copy!
c = np.array(a) # copy!
# The array interface (`__array_interface__`) is a dict that has a
# 'data' key returning a tuple containing a pointer to the memory
# address (as an integer) and a read-only flag.
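A minimal sketch of how the comments above can be checked, assuming the goal is to compare the data pointers of `a`, `b`, and `c`. The helper `data_pointer` is a hypothetical name; `__array_interface__` and `np.shares_memory` are existing NumPy features.

```python
import numpy as np

a = np.random.rand(1, 2, 3)
b = np.asarray(a)  # no copy: an existing ndarray is returned as-is
c = np.array(a)    # copy: np.array copies by default

def data_pointer(arr):
    """Return the integer address of the array's first element (hypothetical helper)."""
    pointer, read_only = arr.__array_interface__['data']
    return pointer

same_ab = data_pointer(a) == data_pointer(b)
same_ac = data_pointer(a) == data_pointer(c)

print("a and b share an address:", same_ab)
print("a and c share an address:", same_ac)
print("np.shares_memory(a, b):", np.shares_memory(a, b))
print("np.shares_memory(a, c):", np.shares_memory(a, c))
```

Comparing pointers only shows whether two arrays start at the same address; `np.shares_memory` is the more general check, since views can overlap without starting at the same element.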
@dusenberrymw
dusenberrymw / grep_tips_and_tricks.md
Last active August 24, 2017 01:44
Grep Tips & Tricks

Grep Tips & Tricks

  • grep -aisElr 'search string here' ./*
    • -a -> --text; treat binary files as text (this finds strings in note annotations)
    • -i -> ignore case
    • -s -> suppress errors
    • -E -> use extended regular expressions
    • -l -> only show the filename(s); remove this to see the line where the search string was found
    • -r -> search recursively
  • Use :grep for Grep searching within Vim.