Peter O'Connor (petered)

  • eagleeyessearch.com
@petered
petered / unbiased-online-recurrent-optimization
Last active August 25, 2017 12:57
2017-08-25 Unbiased Online Recurrent Optimization
$\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}$
$\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}$
$\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}$
$\newcommand{\numel}[1]{|#1|}$
$\newcommand{\pderivdim}[2]{\overset{\big[\numel {#1} \times \numel {#2} \big]}{\frac{\partial #1}{\partial #2}}}$
$\newcommand{\pderivdimg}[4]{\overset{\big[#3 \times #4 \big]}{\frac{\partial #1}{\partial #2}}}$
@petered
petered / Distributed Parameter Tuning
Created August 29, 2017 12:26
2017-08-29 Parameter Tuning
# Distributed Low-Bit Computation
Suppose we're trying to communicate a scalar parameter $\theta$ from a worker $W$ to a server $S$.
$\theta$ changes with time $t$. The worker communicates bits of $\theta$ asynchronously: if it sends a bit $b \in \{0, 1\}$ at time $t \in \mathbb I^+$, we say the worker communicated a message $(b, t)$. If the worker sends $M$ messages between times $t_1$ and $t_2$, we write $N_{t_1}^{t_2} = M$.
The server takes in these bits and uses them to build a distribution $p(\hat \theta)$ over the current value of $\theta$.
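A minimal sketch of this setup (the decoding rule below, which reads the bits as a most-significant-first binary expansion of $\theta \in [0, 1)$ and treats $\theta$ as static, is an illustrative assumption, not the encoding the note asks for):

```python
from dataclasses import dataclass

@dataclass
class Message:
    b: int  # a single bit, 0 or 1
    t: int  # the integer time at which the bit was sent

class Server:
    """Builds an estimate of theta in [0, 1) from a stream of bit messages.

    Illustrative assumption: bits arrive most-significant-first, so after k
    bits the true theta lies in an interval of width 2**-k.
    """
    def __init__(self):
        self.bits = []

    def receive(self, msg: Message):
        self.bits.append(msg.b)

    def estimate(self):
        # Midpoint of the interval consistent with the received bits.
        lo = sum(b * 2.0 ** -(i + 1) for i, b in enumerate(self.bits))
        return lo + 2.0 ** -len(self.bits) / 2

# Worker sends the first 4 bits of theta = 0.6875 = 0.1011 (binary).
server = Server()
for t, b in enumerate([1, 0, 1, 1], start=1):
    server.receive(Message(b=b, t=t))
print(server.estimate())  # 0.71875, within 2**-4 of the true theta
```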
**Can we create an encoding with the following properties?:**
@petered
petered / iterated-matrix-decomposition
Last active September 27, 2017 06:51
2017-09-26 Iterated Matrix Decomposition
$\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}$
$\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}$
$\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}$
$\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}$
$\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}$
$\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}$
$\newcommand{\blue}[1]{\color{blue}{#1}}$
$\newcommand{\red}[1]{\color{red}{#1}}$
@petered
petered / kasper-project
Last active October 12, 2017 15:37
2017-10-12 Kasper
# 1) Simple Maximum Likelihood
$F \rightarrow X$
$$
p(F=1 | X=x) = \frac{p(X=x|F=1) p(F=1)}{p(X=x)} = \frac{p(X=x|F=1) p(F=1)}{p(X=x|F=0)p(F=0) + p(X=x|F=1)p(F=1)}
$$
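A quick numerical check of this formula (the likelihood and prior values below are made-up, purely for illustration):

```python
def posterior_f1(px_given_f1, px_given_f0, p_f1):
    """Bayes' rule: p(F=1 | X=x) from the two likelihoods and the prior p(F=1)."""
    p_f0 = 1.0 - p_f1
    evidence = px_given_f0 * p_f0 + px_given_f1 * p_f1  # p(X=x)
    return px_given_f1 * p_f1 / evidence

# Made-up numbers: x is four times likelier under F=1, but F=1 has prior 0.1.
print(posterior_f1(px_given_f1=0.8, px_given_f0=0.2, p_f1=0.1))  # ~0.308
```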
@petered
petered / testgist
Created October 20, 2017 08:13
Temporal Networks
# Temporal Networks
$\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}$
# The idea
Let
$(x, y)$ be the input and target data, and
$u_1, ..., u_L$ be the pre-nonlinearity activations of a neural network, and
$w_1, ..., w_L$ be the parameters, with $\cdot w(x) \triangleq x \cdot w$, and
$h_l(\cdot)$ be the nonlinearity of the $l$-th layer, and
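A minimal forward-pass sketch under these definitions (the layer composition $u_1 = x \cdot w_1$, $u_l = h_{l-1}(u_{l-1}) \cdot w_l$ is a standard convention assumed here, not stated in the note):

```python
import numpy as np

def forward(x, weights, nonlinearities):
    """Return the pre-nonlinearity activations u_1, ..., u_L.

    Assumed composition (standard convention, not from the note):
        u_1 = x . w_1,    u_l = h_{l-1}(u_{l-1}) . w_l
    """
    us = []
    h = x  # the input plays the role of h_0
    for w_l, h_l in zip(weights, nonlinearities):
        u_l = h @ w_l   # pre-nonlinearity activation of this layer
        us.append(u_l)
        h = h_l(u_l)    # post-nonlinearity output, fed to the next layer
    return us

# Tiny example: a 2-layer net with a tanh hidden nonlinearity.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))
weights = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
us = forward(x, weights, nonlinearities=[np.tanh, lambda u: u])
print([u.shape for u in us])  # [(1, 8), (1, 2)]
```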
@petered
petered / generative-models-assignment
Created October 20, 2017 14:27
2017-10-20 DL Assignment: Generative Models
# Generative Models
## Introduction
Generative models are models that learn the *distribution* of the data.
Suppose we have a collection of $N$ $D$-dimensional points $\{x_1, ..., x_N\}$. Each $x_i$ might represent a vector of pixels in an image, or the words in a sentence.
In generative modeling, we imagine that these points are samples from a $D$-dimensional probability distribution. The distribution represents whatever real-world process was used to generate the data. Our objective is to learn the parameters of this distribution. This allows us to do things like
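As a concrete instance of learning such a distribution, here is a minimal sketch that fits a multivariate Gaussian by maximum likelihood (an illustrative model choice, not the one the assignment prescribes):

```python
import numpy as np

# Toy data: N = 1000 points in D = 2 dimensions (stand-ins for real data).
rng = np.random.default_rng(0)
X = rng.normal(loc=[1.0, -2.0], scale=[0.5, 2.0], size=(1000, 2))

# Maximum-likelihood estimates of the Gaussian's parameters.
mu = X.mean(axis=0)                          # MLE mean
sigma = np.cov(X, rowvar=False, bias=True)   # MLE covariance (1/N normalization)

# The fitted distribution can now generate new points resembling the data.
new_samples = rng.multivariate_normal(mu, sigma, size=5)
print(mu, sigma, new_samples.shape, sep="\n\n")
```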
@petered
petered / dl-course-a1-math
Last active November 6, 2017 11:46
2017-11-03 Matrix Math
$$
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}
\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}
\newcommand{\pderivgiven}[3]{\left.\frac{\partial #1}{\partial #2}\right|_{#3}}
\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}
\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}
\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}
\newcommand{\blue}[1]{\color{blue}{#1}}
\newcommand{\red}[1]{\color{red}{#1}}
@petered
petered / kasper-em-on-graph
Created December 12, 2017 23:41
2017-12-12 Kasper EM
$$
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}
\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}
\newcommand{\pderivgiven}[3]{\left.\frac{\partial #1}{\partial #2}\right|_{#3}}
\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}
\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}
\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}
\newcommand{\blue}[1]{\color{blue}{#1}}
\newcommand{\red}[1]{\color{red}{#1}}
@petered
petered / fewfds
Created January 4, 2018 15:36
2017-11-20 Online Learning Update
$$
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}
\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}
\newcommand{\pderivgiven}[3]{\left.\frac{\partial #1}{\partial #2}\right|_{#3}}
\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}
\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}
\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}
\newcommand{\blue}[1]{\color{blue}{#1}}
\newcommand{\red}[1]{\color{red}{#1}}
@petered
petered / low-var-online-learning
Last active February 15, 2018 15:22
2018-01-17 Lower-Variance Online Gradient Estimates
$$
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}
\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}
\newcommand{\pderivgiven}[3]{\left.\frac{\partial #1}{\partial #2}\right|_{#3}}
\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}
\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}
\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}
\newcommand{\blue}[1]{\color{blue}{#1}}
\newcommand{\red}[1]{\color{red}{#1}}