List of different patterns: how to use them & why

List of patterns for neural networks:

  1. Autoencoders are the simplest ones. They are intuitively understandable, easy to implement and easy to reason about (e.g. it's much easier to find good meta-parameters for them than for RBMs).
  2. RBMs are generative. That is, unlike autoencoders, which only discriminate some data vectors in favour of others, RBMs can also generate new data from the learned joint distribution. They are also considered more feature-rich and flexible.
  3. CNNs are a very specific model, mostly used for one very specific (though pretty popular) kind of task. Most of the top-level algorithms in image recognition are somehow based on CNNs today, but outside that niche they are hardly applicable (e.g. what would be the reason to use convolution for film review analysis?).

Autoencoder

An autoencoder is a simple 3-layer neural network where output units are directly connected back to the input units, e.g. in a network like this:

[image: 3-layer autoencoder with input, hidden, and output layers]

output[i] has an edge back to input[i] for every i. Typically, the number of hidden units is much smaller than the number of visible (input/output) ones. As a result, when you pass data through such a network, it first compresses (encodes) the input vector to "fit" into a smaller representation, and then tries to reconstruct (decode) it back. The task of training is to minimize the reconstruction error, i.e. to find the most efficient compact representation (encoding) of the input data.

An autoencoder is a neural network that is trained in an unsupervised fashion. Its goal is to find a more compact representation of the data by learning an encoder, which transforms the data into their compact representation, and a decoder, which reconstructs the original data.
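A minimal sketch of such an encode/decode setup, assuming Keras as the library and illustrative sizes (n = 784 visible units compressed to m = 32 hidden units); none of these names come from the original note:

```python
# Minimal 3-layer autoencoder sketch (assumed: tensorflow/Keras, toy sizes & data).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_visible, n_hidden = 784, 32  # e.g. flattened 28x28 images compressed to 32 values

inputs = keras.Input(shape=(n_visible,))
encoded = layers.Dense(n_hidden, activation="relu")(inputs)        # encoder: compress
decoded = layers.Dense(n_visible, activation="sigmoid")(encoded)   # decoder: reconstruct

autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")  # minimize reconstruction error

# Training is unsupervised: the input is also the target.
x = np.random.rand(1000, n_visible).astype("float32")  # placeholder data
autoencoder.fit(x, x, epochs=5, batch_size=64, verbose=0)

# The encoder alone gives the compact m-dimensional representation.
encoder = keras.Model(inputs, encoded)
print(encoder.predict(x[:10]).shape)  # (10, 32)
```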

Restricted Boltzmann Machines (RBM):

RBMs share a similar idea with autoencoders, but use a stochastic approach. Instead of deterministic units (e.g. logistic or ReLU), they use stochastic units with a particular (usually binary or Gaussian) distribution. The learning procedure consists of several steps of Gibbs sampling (propagate: sample hiddens given visibles; reconstruct: sample visibles given hiddens; repeat) and adjusting the weights to minimize the reconstruction error.

[image: RBM with a layer of visible units fully connected to a layer of hidden units]

The intuition behind RBMs is that there are some visible random variables (e.g. film reviews from different users) and some hidden variables (like film genres or other internal features), and the task of training is to find out how these two sets of variables are actually connected to each other (more on this example may be found here).
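A minimal sketch of that propagate/reconstruct loop, using one step of Gibbs sampling (CD-1) and NumPy only; the layer sizes, learning rate, and toy binary data are illustrative assumptions, not taken from the note:

```python
# Minimal binary RBM trained with CD-1 (assumed: NumPy, toy sizes & data).
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1

W = rng.normal(0, 0.1, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    return (rng.random(p.shape) < p).astype(float)

# Toy binary data: each row is one visible vector (e.g. one user's likes).
data = rng.integers(0, 2, size=(100, n_visible)).astype(float)

for epoch in range(50):
    for v0 in data:
        # Propagate: sample hidden units given visibles.
        p_h0 = sigmoid(v0 @ W + b_h)
        h0 = sample(p_h0)
        # Reconstruct: sample visibles given hiddens, then hidden probs again.
        p_v1 = sigmoid(h0 @ W.T + b_v)
        v1 = sample(p_v1)
        p_h1 = sigmoid(v1 @ W + b_h)
        # Adjust weights to reduce reconstruction error (contrastive divergence).
        W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
        b_v += lr * (v0 - v1)
        b_h += lr * (p_h0 - p_h1)

recon = sigmoid(sigmoid(data @ W + b_h) @ W.T + b_v)
print("mean reconstruction error:", np.mean((data - recon) ** 2))
```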

Convolutional Neural Networks (CNN):

Convolutional Neural Networks are somewhat similar to the two models above, but instead of learning a single global weight matrix between two layers, they aim to find a set of locally connected neurons. CNNs are mostly used in image recognition. Their name comes from the "convolution" operator, or simply "filter". In short, filters are an easy way to perform complex operations by means of a simple change of the convolution kernel. Apply a Gaussian blur kernel and you'll get the image smoothed. Apply a Canny kernel and you'll see all the edges. Apply a Gabor kernel to get gradient features.
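A minimal sketch of applying such fixed kernels, assuming SciPy; the placeholder image and the specific blur/Sobel kernels are standard examples chosen here for illustration:

```python
# Minimal fixed-kernel convolution sketch (assumed: SciPy, placeholder image).
import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(64, 64)  # placeholder grayscale image

# 3x3 Gaussian-like blur kernel: smooths the image.
blur = np.array([[1, 2, 1],
                 [2, 4, 2],
                 [1, 2, 1]], dtype=float) / 16.0

# Sobel kernel: responds to horizontal intensity gradients (edges).
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

smoothed = convolve2d(image, blur, mode="same", boundary="symm")
edges = convolve2d(image, sobel_x, mode="same", boundary="symm")
print(smoothed.shape, edges.shape)  # same spatial size as the input
```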

[image: convolutional layer with locally connected neurons]

The goal of convolutional neural networks is not to use one of the predefined kernels, but instead to learn data-specific kernels. The idea is the same as with autoencoders or RBMs - translate many low-level features (e.g. user reviews or image pixels) into a compressed high-level representation (e.g. film genres or edges) - but now the weights are learned only from neurons that are spatially close to each other.
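A minimal sketch of a network that learns its own kernels, again assuming Keras; the input shape, the number of filters, and the 10-class output are illustrative assumptions:

```python
# Minimal CNN with learned kernels (assumed: tensorflow/Keras, toy shapes).
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),                       # e.g. grayscale images
    layers.Conv2D(16, kernel_size=3, activation="relu"),  # 16 learned 3x3 kernels
    layers.MaxPooling2D(2),                               # keep only local summaries
    layers.Conv2D(32, kernel_size=3, activation="relu"),  # higher-level local features
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),               # e.g. 10 image classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```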

Interesting notes:

  • No need to do a separate dimensionality reduction step (such as PCA) before an autoencoder or RBM. They both take a vector in n-dimensional space and translate it into an m-dimensional one, trying to keep as much important information as possible while removing noise (much like PCA does); see the sketch after this note.
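A minimal sketch of that comparison, assuming scikit-learn and reusing the illustrative n = 784, m = 32 sizes from the autoencoder example above:

```python
# PCA vs. learned encoder sketch (assumed: scikit-learn, placeholder data/sizes).
import numpy as np
from sklearn.decomposition import PCA

x = np.random.rand(1000, 784).astype("float32")  # placeholder n=784 data

pca = PCA(n_components=32)       # linear projection down to m=32 dimensions
codes_pca = pca.fit_transform(x)

# An autoencoder's encoder does the same job (n -> m), but the mapping is
# learned from the data and can be non-linear, unlike PCA's orthogonal projection.
print(codes_pca.shape)  # (1000, 32)
```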