@raidery
Created December 29, 2019 15:23
Created on Cognitive Class Labs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://www.bigdatauniversity.com\"><img src=\"https://ibm.box.com/shared/static/qo20b88v1hbjztubt06609ovs85q8fau.png\" width=\"400px\" align=\"center\"></a>\n",
"\n",
"<h1 align=\"center\"><font size=\"5\">RESTRICTED BOLTZMANN MACHINES</font></h1>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Introduction</h3>\n",
"<b>Restricted Boltzmann Machine (RBM):</b> RBMs are shallow neural nets that learn to reconstruct data by themselves in an unsupervised fashion. \n",
"\n",
"\n",
"<h4>Why are RBMs important?</h4>\n",
"RBMs can automatically extract <b>meaningful</b> features from a given input.\n",
"\n",
"\n",
"<h4>How does it work?</h4>\n",
"An RBM is a two-layer neural network. It takes the inputs and translates them into a set of binary values that represent them in the hidden layer. These values can then be translated back to reconstruct the inputs. The RBM is trained through several forward and backward passes, and a trained RBM can reveal which features are the most important ones when detecting patterns. \n",
"\n",
"\n",
"<h4>What are the applications of RBM?</h4>\n",
"RBMs are useful for <a href='http://www.cs.utoronto.ca/~hinton/absps/netflixICML.pdf'> Collaborative Filtering</a>, dimensionality reduction, classification, regression, feature learning, topic modeling, and as building blocks of <b>Deep Belief Networks</b>.\n",
"\n",
"\n",
"\n",
"<h4>Is an RBM a generative or a discriminative model?</h4>\n",
"An RBM is a generative model. To see what that means, first consider the difference between discriminative and generative models: \n",
"\n",
"<b>Discriminative:</b> Consider a classification problem in which we want to learn to distinguish between Sedan cars (y = 1) and SUV cars (y = 0), based on some features of the cars. Given a training set, an algorithm like logistic regression tries to find a straight line, that is, a decision boundary, that separates the SUVs from the Sedans. \n",
"<b>Generative:</b> Looking at Sedans, we can build a model of what Sedan cars look like. Then, looking at SUVs, we can build a separate model of what SUV cars look like. Finally, to classify a new car, we can match it against the Sedan model and against the SUV model, to see whether the new car looks more like an SUV or a Sedan. \n",
"\n",
"Generative models specify a probability distribution over a dataset of input vectors. We can do both supervised and unsupervised tasks with generative models:\n",
"<ul>\n",
" <li>In an unsupervised task, we try to form a model for P(x), the probability of an input vector x.</li>\n",
" <li>In a supervised task, we first form a model for P(x|y), the probability of x given y (the label for x). For example, if y = 0 indicates that a car is an SUV and y = 1 indicates that a car is a Sedan, then p(x|y = 0) models the distribution of SUVs’ features, and p(x|y = 1) models the distribution of Sedans’ features. If we manage to find P(x|y) and P(y), then we can use <code>Bayes rule</code> to estimate P(y|x), because: $$p(y|x) = \\frac{p(x|y)p(y)}{p(x)}$$</li>\n",
"</ul>\n",
"Now the question is, can we build a generative model, and then use it to create synthetic data by directly sampling from the modeled probability distributions? Let's see. "
]
},
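{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick illustration of the Bayes rule above, here is a worked calculation with made-up numbers. The likelihoods and the prior are assumptions chosen only to make the arithmetic easy, not values estimated from any dataset. Suppose a new car with features x has p(x|y = 1) = 0.6 under the Sedan model and p(x|y = 0) = 0.2 under the SUV model, and both classes are equally likely a priori, p(y = 1) = p(y = 0) = 0.5. Then\n",
"\n",
"$$p(x) = p(x|y=1)\\,p(y=1) + p(x|y=0)\\,p(y=0) = 0.6 \\cdot 0.5 + 0.2 \\cdot 0.5 = 0.4$$\n",
"\n",
"$$p(y=1|x) = \\frac{p(x|y=1)\\,p(y=1)}{p(x)} = \\frac{0.3}{0.4} = 0.75$$\n",
"\n",
"so, under these assumed numbers, the new car would be classified as a Sedan."
]
},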
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2>Table of Contents</h2>\n",
"<ol>\n",
" <li><a href=\"#ref1\">Initialization</a></li>\n",
" <li><a href=\"#ref2\">RBM layers</a></li>\n",
" <li><a href=\"#ref3\">What can an RBM do after training?</a></li>\n",
" <li><a href=\"#ref4\">How to train the model?</a></li>\n",
" <li><a href=\"#ref5\">Learned features</a></li>\n",
"</ol>\n",
"<p></p>\n",
"</div>\n",
"<br>\n",
"\n",
"<hr>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"ref1\"></a>\n",
"<h3>Initialization</h3>\n",
"\n",
"First we have to load the utility file which contains different utility functions that are not connected\n",
"in any way to the networks presented in the tutorials, but rather help in\n",
"processing the outputs into a more understandable way."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import urllib.request\n",
"\n",
"# Download the helper module that provides tile_raster_images.\n",
"with urllib.request.urlopen(\"http://deeplearning.net/tutorial/code/utils.py\") as url:\n",
"    response = url.read()\n",
"# Save it next to the notebook so it can be imported below.\n",
"with open('utils.py', 'w') as target:\n",
"    target.write(response.decode('utf-8'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we load in all the packages that we use to create the net including the TensorFlow package:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"FuelConsumption.csv\n",
"ML0120EN-1.1-Review-TensorFlow-Hello-World.ipynb\n",
"ML0120EN-1.2-Review-LinearRegressionwithTensorFlow.ipynb\n",
"ML0120EN-1.4-Review-LogisticRegressionwithTensorFlow.ipynb\n",
"ML0120EN-2.1-Review-Understanding_Convolutions.ipynb\n",
"ML0120EN-2.2-Review-CNN-MNIST-Dataset.ipynb\n",
"ML0120EN-3.1-Reveiw-LSTM-basics.ipynb\n",
"ML0120EN-3.2-Review-LSTM-LanguageModelling.ipynb\n",
"ML0120EN-4.1-Review-RBMMNIST.ipynb\n",
"MNIST_data\n",
"__pycache__\n",
"bird.jpg\n",
"num3.jpg\n",
"summary_logs\n",
"utils.py\n",
"utils1.py\n"
]
}
],
"source": [
"!ls"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"import numpy as np\n",
"from tensorflow.examples.tutorials.mnist import input_data\n",
"#!pip install pillow\n",
"from PIL import Image\n",
"from utils import tile_raster_images\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"ref2\"></a>\n",
"<h3>RBM layers</h3>\n",
"\n",
"An RBM has two layers. The first layer of the RBM is called the <b>visible</b> (or input) layer. Imagine that our toy example has only vectors with 7 values, so the visible layer must have i = 7 input nodes. \n",
"The second layer is the <b>hidden</b> layer, which has j neurons. Each hidden node takes the value 0 or 1 (i.e., sj = 1 or sj = 0) with a probability that is a logistic function of the inputs it receives from the i visible units, written for example as p(sj = 1). For our toy example, we'll use 2 nodes in the hidden layer, so j = 2. (The formulas that follow index visible units by i and hidden units by j.)\n",
"\n",
"<center><img src=\"https://ibm.box.com/shared/static/eu26opvcefgls6vnwuo29uwp0nudmokh.png\" alt=\"RBM Model\" style=\"width: 400px;\"></center>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" \n",
"\n",
"Each node in the first layer also has a <b>bias</b>. We will denote the biases of the visible units as “v_bias”. <b>v_bias</b> is a vector with one entry per visible unit.\n",
"\n",
"Here we define the <b>bias</b> of the second layer as well. We will denote the biases of the hidden units as “h_bias”. <b>h_bias</b> is a vector with one entry per hidden unit."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"v_bias = tf.placeholder(\"float\", [7])\n",
"h_bias = tf.placeholder(\"float\", [2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have to define the weights between the input layer and hidden layer nodes. In the weight matrix, the number of rows is equal to the number of input nodes, and the number of columns is equal to the number of hidden nodes. Let <b>W</b> be the 7x2 Tensor (7 visible neurons, 2 hidden neurons) that represents the weights between neurons. "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"W = tf.constant(np.random.normal(loc=0.0, scale=1.0, size=(7, 2)).astype(np.float32))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"ref3\"></a>\n",
"<h3>What can an RBM do after training?</h3>\n",
"Think of an RBM as a model that has been trained on a dataset of images of many SUV and Sedan cars. Also, imagine that the RBM has only two hidden nodes, one for the weight and one for the size of cars, so that different configurations of these nodes represent different cars: one configuration represents SUVs and another represents Sedans. During training, through many forward and backward passes, the RBM adjusts its weights to send a stronger signal to either the SUV configuration (0, 1) or the Sedan configuration (1, 0) in the hidden layer, given the pixels of an image. Now, given an SUV in the hidden layer, which distribution of pixels should we expect? An RBM can give you two things. First, it encodes your images in the hidden layer. Second, it gives you the probability of observing a case, given some hidden values.\n",
"\n",
"\n",
"<h3>How does inference work?</h3>\n",
"\n",
"An RBM works in two phases:\n",
"<ul>\n",
" <li>Forward Pass</li> \n",
" <li>Backward Pass or Reconstruction</li>\n",
"</ul>\n",
"\n",
"<b>Phase 1) Forward pass:</b> Feed one training sample (one image) <b>X</b> into the visible nodes and pass it on to all hidden nodes. Each hidden node then makes a stochastic decision about whether to transmit that input or not (i.e., it determines the state of each hidden unit). At each hidden node, <b>X</b> is multiplied by the weights <b>$W_{ij}$</b> and the result is added to <b>h_bias</b>. The sum is fed into the sigmoid function, which produces the node’s output, $p({h_j})$, where j is the index of the hidden unit. \n",
"\n",
"\n",
"$p({h_j})= \\sigma(b_j + \\sum_i w_{ij} x_i)$, where $\\sigma()$ is the logistic function and $b_j$ is the j-th entry of h_bias.\n",
"\n",
"\n",
"Now let's see what $p({h_j})$ represents. It is the probability that hidden unit j is turned on, and the values for all hidden units together form a <b>probability distribution</b>. That is, the RBM uses the inputs x to make predictions about hidden node activations. For example, imagine that the values of $p(h)$ for the first training item are [0.51 0.84]. They give the conditional probability of each hidden neuron being on after Phase 1: \n",
"<ul>\n",
" <li>p($h_{1}$ = 1|V) = 0.51</li>\n",
" <li>p($h_{2}$ = 1|V) = 0.84</li> \n",
"</ul>\n",
"\n",
"As a result, for each row in the training set, <b>a vector/tensor</b> of size [1x2] is generated, giving n vectors in total ($p({h})$=[nx2]). \n",
"\n",
"We then turn unit $h_j$ on with probability $p(h_{j}|V)$, and turn it off with probability $1 - p(h_{j}|V)$.\n",
"\n",
"Therefore, the conditional probability of a configuration of h given v (for a training sample) is:\n",
"\n",
"$$p(\\mathbf{h} \\mid \\mathbf{v}) = \\prod_{j=1}^H p(h_j \\mid \\mathbf{v})$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, sample a hidden activation vector <b>h</b> from this probability distribution $p({h_j})$. That is, we sample the activation vector from the probability distribution of hidden layer values. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we go further, let's look at a toy example for one case out of all the inputs. Assume that we have a trained RBM and a very simple input vector such as [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]. Let's see what the output of the forward pass would be:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Input: [[1. 0. 0. 1. 0. 0. 0.]]\n",
"hb: [0.1 0.1]\n",
"w: [[-0.0874574 -0.45373732]\n",
" [-0.45822284 0.64929986]\n",
" [ 0.4849262 0.228631 ]\n",
" [ 0.6104384 0.36003128]\n",
" [-0.02215557 -1.2187916 ]\n",
" [ 0.820069 -2.1434085 ]\n",
" [-1.7950305 1.8821483 ]]\n",
"p(h|v): [[0.65089625 0.5015735 ]]\n",
"h0 states: [[1. 0.]]\n"
]
}
],
"source": [
"sess = tf.Session()\n",
"X = tf.constant([[1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]])\n",
"v_state = X\n",
"print (\"Input: \", sess.run(v_state))\n",
"\n",
"h_bias = tf.constant([0.1, 0.1])\n",
"print (\"hb: \", sess.run(h_bias))\n",
"print (\"w: \", sess.run(W))\n",
"\n",
"# Calculate the probabilities of turning the hidden units on:\n",
"h_prob = tf.nn.sigmoid(tf.matmul(v_state, W) + h_bias) #probabilities of the hidden units\n",
"print (\"p(h|v): \", sess.run(h_prob))\n",
"\n",
"# Draw samples from the distribution:\n",
"h_state = tf.nn.relu(tf.sign(h_prob - tf.random_uniform(tf.shape(h_prob)))) #states\n",
"print (\"h0 states:\", sess.run(h_state))"
]
},
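{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the forward pass concrete, the two probabilities printed above can be checked by hand. This is only a sketch that reuses the values from the stored output; W is drawn at random, so re-running the cell will print different numbers. The input has ones only in positions 1 and 4, so only rows 1 and 4 of W contribute to each sum:\n",
"\n",
"$$p(h_1 = 1 \\mid \\mathbf{v}) = \\sigma(0.1 - 0.08746 + 0.61044) = \\sigma(0.62298) \\approx 0.651$$\n",
"\n",
"$$p(h_2 = 1 \\mid \\mathbf{v}) = \\sigma(0.1 - 0.45374 + 0.36003) = \\sigma(0.00629) \\approx 0.502$$\n",
"\n",
"Both values agree with the printed p(h|v). Each hidden state is then set to 1 when its probability exceeds a fresh uniform random number, which is what the relu(sign(...)) expression computes."
]
},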
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<b>Phase 2) Backward Pass (Reconstruction):</b>\n",
"The RBM reconstructs data by making several forward and backward passes between the visible and hidden layers.\n",
"\n",
"So, in the second phase (i.e. reconstruction phase), the samples from the hidden layer (i.e. h) play the role of input. That is, <b>h</b> becomes the input in the backward pass. The same weight matrix and visible layer biases are used to go through the sigmoid function. The produced output is a reconstruction which is an approximation of the original input."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"b: [0.1 0.2 0.1 0.1 0.1 0.2 0.1]\n",
"p(vi∣h): [[0.5031356 0.43580064 0.6422001 0.670498 0.5194513 0.734986\n",
" 0.15511544]]\n",
"v probability states: [[0. 1. 0. 1. 0. 1. 0.]]\n"
]
}
],
"source": [
"vb = tf.constant([0.1, 0.2, 0.1, 0.1, 0.1, 0.2, 0.1])\n",
"print (\"b: \", sess.run(vb))\n",
"v_prob = sess.run(tf.nn.sigmoid(tf.matmul(h_state, tf.transpose(W)) + vb))\n",
"print (\"p(vi∣h): \", v_prob)\n",
"v_state = tf.nn.relu(tf.sign(v_prob - tf.random_uniform(tf.shape(v_prob))))\n",
"print (\"v probability states: \", sess.run(v_state))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An RBM learns a probability distribution over the input, and after training it can generate new samples from the learned distribution. Recall that a <b>probability distribution</b> is a mathematical function that provides the probabilities of occurrence of the different possible outcomes of an experiment.\n",
"\n",
"The (conditional) probability distribution over the visible units v is given by\n",
"\n",
"$p(\\mathbf{v} \\mid \\mathbf{h}) = \\prod_{i=1}^V p(v_i \\mid \\mathbf{h}),$\n",
"\n",
"\n",
"where\n",
"\n",
"$p(v_i \\mid \\mathbf{h}) = \\sigma\\left( a_i + \\sum_{j=1}^H w_{ij} h_j \\right)$\n",
"\n",
"So, given the current state of the hidden units and the weights, what is the probability of generating [1. 0. 0. 1. 0. 0. 0.] in the reconstruction phase, based on the above <b>probability distribution</b> function?"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1. 0. 0. 1. 0. 0. 0.]]\n",
"[0.5031356 0.43580064 0.6422001 0.670498 0.5194513 0.734986\n",
" 0.15511544]\n"
]
},
{
"data": {
"text/plain": [
"0.007327552123952831"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inp = sess.run(X)\n",
"print(inp)\n",
"print(v_prob[0])\n",
"# Multiply p for units that are on and (1 - p) for units that are off.\n",
"v_probability = 1\n",
"for elm, p in zip(inp[0], v_prob[0]):\n",
"    if elm == 1:\n",
"        v_probability *= p\n",
"    else:\n",
"        v_probability *= (1 - p)\n",
"v_probability"
]
},
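{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loop above evaluates a product of independent Bernoulli probabilities. Written out with the values from the stored output (rounded to four decimals, and specific to that random W), a sketch of the calculation is:\n",
"\n",
"$$p(\\mathbf{v} \\mid \\mathbf{h}) = \\prod_{i=1}^{7} p_i^{v_i} (1 - p_i)^{1 - v_i}$$\n",
"\n",
"$$= 0.5031 \\cdot (1 - 0.4358) \\cdot (1 - 0.6422) \\cdot 0.6705 \\cdot (1 - 0.5195) \\cdot (1 - 0.7350) \\cdot (1 - 0.1551) \\approx 0.0073$$\n",
"\n",
"Here $p_i = p(v_i = 1 \\mid \\mathbf{h})$ and $\\mathbf{v}$ = [1, 0, 0, 1, 0, 0, 0], so units 1 and 4 contribute $p_i$ and the others contribute $1 - p_i$. The result matches the value returned by the cell."
]
},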
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How similar are the X and V vectors? The reconstructed values will most likely not look anything like the input vector, because our network has not been trained yet. Our objective is to train the model so that the input vector and the reconstructed vector are the same. Therefore, the weights are adjusted based on how different the input values are from the ones we just reconstructed. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"<h2>MNIST</h2>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will be using the MNIST dataset to practice the usage of RBMs. The following cell loads the MNIST dataset."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING:tensorflow:From <ipython-input-10-a0c1bc5755ed>:1: read_data_sets (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use alternatives such as official/mnist/dataset.py from tensorflow/models.\n",
"WARNING:tensorflow:From /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:260: maybe_download (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please write your own downloading logic.\n",
"WARNING:tensorflow:From /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:262: extract_images (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use tf.data to implement this functionality.\n",
"Extracting MNIST_data/train-images-idx3-ubyte.gz\n",
"WARNING:tensorflow:From /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:267: extract_labels (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use tf.data to implement this functionality.\n",
"Extracting MNIST_data/train-labels-idx1-ubyte.gz\n",
"WARNING:tensorflow:From /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:110: dense_to_one_hot (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use tf.one_hot on tensors.\n",
"Extracting MNIST_data/t10k-images-idx3-ubyte.gz\n",
"Extracting MNIST_data/t10k-labels-idx1-ubyte.gz\n",
"WARNING:tensorflow:From /home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:290: DataSet.__init__ (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use alternatives such as official/mnist/dataset.py from tensorflow/models.\n"
]
}
],
"source": [
"mnist = input_data.read_data_sets(\"MNIST_data/\", one_hot=True)\n",
"trX, trY, teX, teY = mnist.train.images, mnist.train.labels, mnist.test.images, mnist.test.labels"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the dimensions of the images."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(784,)"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"trX[1].shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"MNIST images have 784 pixels, so the visible layer must have 784 input nodes. For our case, we'll use 50 nodes in the hidden layer, so j = 50."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"vb = tf.placeholder(\"float\", [784])\n",
"hb = tf.placeholder(\"float\", [50])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let <b>W</b> be the Tensor of 784x50 (784 - number of visible neurons, 50 - number of hidden neurons) that represents weights between the neurons. "
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"W = tf.placeholder(\"float\", [784, 50])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's define the visible layer:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"v0_state = tf.placeholder(\"float\", [None, 784])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can define the hidden layer:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"h0_prob = tf.nn.sigmoid(tf.matmul(v0_state, W) + hb) #probabilities of the hidden units\n",
"h0_state = tf.nn.relu(tf.sign(h0_prob - tf.random_uniform(tf.shape(h0_prob)))) #sample_h_given_X"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we define the reconstruction part:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"v1_prob = tf.nn.sigmoid(tf.matmul(h0_state, tf.transpose(W)) + vb) \n",
"v1_state = tf.nn.relu(tf.sign(v1_prob - tf.random_uniform(tf.shape(v1_prob)))) #sample_v_given_h"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>What is the objective function?</h3>\n",
"\n",
"<b>Goal</b>: Maximize the likelihood of our data being drawn from that distribution\n",
"\n",
"<b>Calculate error:</b> \n",
"In each epoch, we compute the \"error\" as the mean of the squared difference between step 1 and step n,\n",
"i.e. the error shows the difference between the data and its reconstruction.\n",
"\n",
"<b>Note:</b> tf.reduce_mean computes the mean of elements across dimensions of a tensor."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"err = tf.reduce_mean(tf.square(v0_state - v1_state))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"ref4\"></a>\n",
"<h3>How to train the model?</h3>\n",
"<b>Warning!!</b> The following part discusses how to train the model, which needs some algebra background. You can skip this part and still run the next cells.\n",
"\n",
"As mentioned, we want to give a high probability to the input data we train on. So, in order to train an RBM, we have to maximize the product of probabilities assigned to all rows v (images) in the training set V (a matrix, where each row of it is treated as a visible vector v):\n",
"\n",
"<img src=\"https://wikimedia.org/api/rest_v1/media/math/render/svg/d42e9f5aad5e1a62b11b119c9315236383c1864a\">\n",
"\n",
"\n",
"This is equivalent to maximizing the expected log probability of V:\n",
"\n",
"\n",
"<img src=\"https://wikimedia.org/api/rest_v1/media/math/render/svg/ba0ceed99dca5ff1d21e5ace23f5f2223f19efc0\">\n",
"\n",
"\n",
"During training, we therefore update the weights wij to increase p(v) for all v in our training data. To do so, we have to calculate the derivative:\n",
"\n",
"\n",
"$$\\frac{\\partial \\log p(\\mathbf v)}{\\partial w_{ij}}$$\n",
"\n",
"This derivative is hard to compute exactly, so typical <b>gradient descent (SGD)</b> cannot be applied directly. Instead we use another approach, which has 2 steps:\n",
"<ol>\n",
" <li>Gibbs Sampling</li>\n",
" <li>Contrastive Divergence</li>\n",
"</ol> \n",
" \n",
"<h3>Gibbs Sampling</h3> \n",
"First, given an input vector v, we use p(h|v) to predict the hidden values h. \n",
"<ul>\n",
" <li>$p(h|v) = sigmoid(X \\otimes W + hb)$</li>\n",
" <li>h0 = sampleProb(h0)</li>\n",
"</ul>\n",
" \n",
"Then, knowing the hidden values, we use p(v|h) to reconstruct new input values v. \n",
"<ul>\n",
" <li>$p(v|h) = sigmoid(h0 \\otimes transpose(W) + vb)$</li>\n",
" <li>$v1 = sampleProb(v1)$ (Sample v given h)</li>\n",
"</ul>\n",
" \n",
"This process is repeated k times. After k iterations we obtain another input vector vk, which was recreated from the original input values v0 or X.\n",
"\n",
"Reconstruction steps:\n",
"<ul>\n",
" <li> Get one data point from the data set, like <i>x</i>, and pass it through the net</li>\n",
" <li>Pass 0: (x) $\\Rightarrow$ (h0) $\\Rightarrow$ (v1) (v1 is reconstruction of the first pass)</li>\n",
" <li>Pass 1: (v1) $\\Rightarrow$ (h1) $\\Rightarrow$ (v2) (v2 is reconstruction of the second pass)</li>\n",
" <li>Pass 2: (v2) $\\Rightarrow$ (h2) $\\Rightarrow$ (v3) (v3 is reconstruction of the third pass)</li>\n",
" <li>Pass k: (vk) $\\Rightarrow$ (hk) $\\Rightarrow$ (vk+1) (vk+1 is reconstruction of the (k+1)-th pass)</li>\n",
"</ul>\n",
" \n",
"<h4>What is sampling here (sampleProb)?</h4>\n",
"\n",
"In the forward pass: we randomly set the value of each hj to 1 with probability $sigmoid(v \\otimes W + hb)$. \n",
"- To sample h given v means to sample from the conditional probability distribution P(h|v). That is, you ask what the probabilities are of getting a specific set of values for the hidden neurons, given the values v of the visible neurons, and then draw a sample from this probability distribution. \n",
"\n",
"In the reconstruction: we randomly set the value of each vi to 1 with probability $ sigmoid(h \\otimes transpose(W) + vb)$.\n",
"\n",
"<h3>Contrastive Divergence (CD-k)</h3>\n",
"The update of the weight matrix is done during the Contrastive Divergence step. \n",
"\n",
"Vectors v0 and vk are used to calculate the activation probabilities for hidden values h0 and hk. The difference between the outer products of those probabilities with input vectors v0 and vk results in the update matrix:\n",
"\n",
"\n",
"$\\Delta W =v0 \\otimes h0 - vk \\otimes hk$ \n",
"\n",
"Contrastive Divergence is thus a matrix of values that is computed and used to adjust the values of the W matrix. Changing W incrementally is what trains the W values. On each step (epoch), W is updated to a new value W' through the equation below:\n",
"\n",
"$W' = W + alpha * \\Delta W$ \n",
"\n",
" \n",
"<b>What is Alpha?</b> \n",
"Here, alpha is some small step rate and is also known as the \"learning rate\".\n",
"\n",
"\n"
]
},
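{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the update rule above can be related to the derivative it approximates. The following is the standard textbook form of the RBM log-likelihood gradient, stated here without derivation; the angle brackets (expectations under the data and under the model) are notation introduced for this note:\n",
"\n",
"$$\\frac{\\partial \\log p(\\mathbf v)}{\\partial w_{ij}} = \\langle v_i h_j \\rangle_{\\mathrm{data}} - \\langle v_i h_j \\rangle_{\\mathrm{model}}$$\n",
"\n",
"The model expectation is intractable to compute exactly, so CD-k replaces it with a sample obtained after k Gibbs steps. For a single training vector this gives\n",
"\n",
"$$\\Delta w_{ij} = v^{(0)}_i \\, p(h_j = 1 \\mid \\mathbf{v}^{(0)}) - v^{(k)}_i \\, p(h_j = 1 \\mid \\mathbf{v}^{(k)})$$\n",
"\n",
"which is the element-wise form of $\\Delta W = v0 \\otimes h0 - vk \\otimes hk$."
]
},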
{
"cell_type": "markdown",
"metadata": {},
"source": [
"OK, let's assume that k = 1, that is, we take just one more step:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"h1_prob = tf.nn.sigmoid(tf.matmul(v1_state, W) + hb)\n",
"h1_state = tf.nn.relu(tf.sign(h1_prob - tf.random_uniform(tf.shape(h1_prob)))) #sample_h_given_X"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"alpha = 0.01\n",
"W_Delta = tf.matmul(tf.transpose(v0_state), h0_prob) - tf.matmul(tf.transpose(v1_state), h1_prob)\n",
"update_w = W + alpha * W_Delta\n",
"update_vb = vb + alpha * tf.reduce_mean(v0_state - v1_state, 0)\n",
"update_hb = hb + alpha * tf.reduce_mean(h0_state - h1_state, 0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's start a session and initialize the variables:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"cur_w = np.zeros([784, 50], np.float32)\n",
"cur_vb = np.zeros([784], np.float32)\n",
"cur_hb = np.zeros([50], np.float32)\n",
"prv_w = np.zeros([784, 50], np.float32)\n",
"prv_vb = np.zeros([784], np.float32)\n",
"prv_hb = np.zeros([50], np.float32)\n",
"sess = tf.Session()\n",
"init = tf.global_variables_initializer()\n",
"sess.run(init)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the error of the first run:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.4815421"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sess.run(err, feed_dict={v0_state: trX, W: prv_w, vb: prv_vb, hb: prv_hb})"
]
},
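{
"cell_type": "markdown",
"metadata": {},
"source": [
"The value of roughly 0.48 can be sanity-checked by hand. With all weights and biases set to zero, every reconstruction probability is $\\sigma(0) = 0.5$, so each reconstructed pixel is a fair coin flip. For a pixel with intensity $x$ (assuming, as this loader appears to do, that intensities are scaled to [0, 1]), the expected squared error is\n",
"\n",
"$$\\mathbb{E}\\left[(x - b)^2\\right] = \\tfrac{1}{2} x^2 + \\tfrac{1}{2} (1 - x)^2, \\qquad b \\sim \\mathrm{Bernoulli}(0.5)$$\n",
"\n",
"which equals 0.5 for pure black or white pixels (x = 0 or x = 1) and drops to 0.25 at x = 0.5. Since most MNIST pixels are background, an average slightly below 0.5 is what one would expect before any training."
]
},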
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch: 0 reconstruction error: 0.099173\n",
"Epoch: 1 reconstruction error: 0.095754\n",
"Epoch: 2 reconstruction error: 0.093471\n",
"Epoch: 3 reconstruction error: 0.092173\n",
"Epoch: 4 reconstruction error: 0.091135\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Parameters\n",
"epochs = 5\n",
"batchsize = 100\n",
"weights = []\n",
"errors = []\n",
"\n",
"for epoch in range(epochs):\n",
"    for start, end in zip(range(0, len(trX), batchsize), range(batchsize, len(trX), batchsize)):\n",
"        batch = trX[start:end]\n",
"        feed = {v0_state: batch, W: prv_w, vb: prv_vb, hb: prv_hb}\n",
"        # Fetch all three updates in one run so they share the same Gibbs samples.\n",
"        cur_w, cur_vb, cur_hb = sess.run([update_w, update_vb, update_hb], feed_dict=feed)\n",
"        prv_w = cur_w\n",
"        prv_vb = cur_vb\n",
"        prv_hb = cur_hb\n",
"        # Record the error on the full training set every 100 batches.\n",
"        if start % 10000 == 0:\n",
"            errors.append(sess.run(err, feed_dict={v0_state: trX, W: cur_w, vb: cur_vb, hb: cur_hb}))\n",
"            weights.append(cur_w)\n",
"    print('Epoch: %d' % epoch, 'reconstruction error: %f' % errors[-1])\n",
"plt.plot(errors)\n",
"plt.xlabel(\"Batch Number\")\n",
"plt.ylabel(\"Error\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What are the final weights after training?"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[-1.1634085 -1.2753986 -1.2005095 ... -1.2331337 -1.2638354\n",
" -1.2301917 ]\n",
" [-1.119033 -1.0385447 -1.1497291 ... -1.1500239 -1.1115321\n",
" -1.0415173 ]\n",
" [-0.28079224 -0.2849885 -0.2811861 ... -0.26885316 -0.27597892\n",
" -0.26900616]\n",
" ...\n",
" [-0.31457013 -0.29599422 -0.2964135 ... -0.2875892 -0.29161546\n",
" -0.27342165]\n",
" [-0.29064775 -0.28894693 -0.30075338 ... -0.27427268 -0.28281838\n",
" -0.31936294]\n",
" [-1.8031325 -1.6510723 -1.5959315 ... -1.791442 -1.7346625\n",
" -1.7567236 ]]\n"
]
}
],
"source": [
"uw = weights[-1].T\n",
"print (uw) # a weight matrix of shape (50,784)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"ref5\"></a>\n",
"<h3>Learned features</h3> "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can take each hidden unit and visualize the connections between that hidden unit and each element in the input vector. In our case, we have 50 hidden units. Let's visualize those."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's plot the current weights:\n",
"<b>tile_raster_images</b> helps in generating an easy to grasp image from a set of samples or weights. It transforms <b>uw</b> (with one flattened image of size 784 per row) into an array of $5\\times10$ tiles, in which the images are reshaped and laid out like tiles on a floor."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 1296x1296 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"from PIL import Image\n",
"%matplotlib inline\n",
"# Lay out the 50 weight vectors as a 5x10 grid of 28x28 tiles.\n",
"image = Image.fromarray(tile_raster_images(X=cur_w.T, img_shape=(28, 28), tile_shape=(5, 10), tile_spacing=(1, 1)))\n",
"### Plot image\n",
"plt.rcParams['figure.figsize'] = (18.0, 18.0)\n",
"imgplot = plt.imshow(image)\n",
"imgplot.set_cmap('gray')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each tile in the above visualization corresponds to a vector of connections between a hidden unit and visible layer's units. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at one of the learned weight vectors, corresponding to one of the hidden units. In this particular square, gray represents weight = 0; the whiter a pixel is, the more positive its weight (closer to 1), and the darker a pixel is, the more negative its weight. After multiplication by the input/visible pixels, positive weights increase the probability that the hidden unit is activated, and negative weights decrease the probability that it is 1 (activated). Why is this important? Because it shows that this specific square (hidden unit) can detect a feature (e.g. a \"/\" shape) if it exists in the input."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAThUlEQVR4nO3dW4hd130G8O+zbEvWyNFtdBks4aTG4JpCnTKYgktxCQ6OXuw8pMQPwQVT5SGGBPJQ4z7Ej6Y0CXkoAaU2UUrqEEh8eTBtjAmYgAkeG8WWqraS7WkiaaSxdbFGkkeypH8fZjtM7Nn/7+Ssc84+eH0/GGbmrNlnr7PP+c+5fHutxYiAmX3yXdN1B8xsNFzsZpVwsZtVwsVuVgkXu1klrh3lztasWRMTExOt7ddck//vKUkO1LYk+75uRV236luXfe9y310qvc+UbHu176x9YWEBi4uLK/5BUbGTvBfA9wCsAvCvEfF49vcTExPYtWtXa/vatWvT/X3wwQd99HLJ5cuX03Z1gLN/RGrbVatWpe1XrlxJ21Xfs+tXfbt69WpRe8lxU9etjpvaPqOeWNS+L1261Pe+gfyxfN1116XbXn/99a1tzzzzTGtb3y/jSa4C8C8AvgDgdgAPkLy93+szs+Eqec9+J4DDEfFWRFwC8BMA9w2mW2Y2aCXFfhOA3y37/Uhz2R8guZvkDMmZixcvFuzOzEqUFPtKb9Y+9qlDROyJiOmImF69enXB7sysREmxHwGwc9nvOwAcK+uOmQ1LSbG/AuBWkp8heT2ALwN4bjDdMrNB6zt6i4jLJB8G8J9Yit6ejIgD2TYk01hBvafP3gaoba+9Nr+pKsbJ4rHSTLY0uivJuksjJEUd98zi4mLaruKz7D5ds2ZNuq06Luo+Vbc7i8+GdZ8U5ewR8TyA5wfUFzMbIp8ua1YJF7tZJVzsZpVwsZtVwsVuVgkXu1klRjqevdQws241zDQ7P0ANvS0dr676llHnD6gsW1HDb7P9qzxZ5ehq31mWvm7dunRb9Xgqzemz43LjjTem2/b7ePAzu1klXOxmlXCxm1XCxW5WCRe7WSVc7GaVGHn0lkUOKoIqmV22dJhp1u/SGVzV8NySeEzd7nPnzhXtW/X9zJkzaXtGRW9q31mElU1p3kv7xo0b03YV7WW3rWT6tuz+9jO7WSVc7GaVcLGbVcLFblYJF7tZJVzsZpVwsZtVYuQ5e5YDqjw6m563dEVQlW1mwylVHqyyajVk8ZZbbknbt23b1tqm8t4LFy6k7SonP3r0aNp+6NCh1rb9+/en26phoqo9G5a8YcOGdNv169en7eo+U4+J7H4Z1jLYfmY3q4SL3awSLnazSrjYzSrhYjerhIvdrBIudrNKjDxnzzJElYVnWXfpVNIqp8+uX2XZO3fuLGrftGlT2p5Na5xlzQBw/vz5tH3Lli1p++TkZNp+2223tbbdc8896bZqqmh1/kK2LLI6LmrJZbW9mg466/uxY8fSbU+fPp22tykqdpKzABYAXAFwOSKmS67PzIZnEM/sfxMR7w7gesxsiPye3awSpcUeAH5B8lWSu1f6A5K7Sc6QnCldasjM+lf6Mv6uiDhGciuAF0j+d0S8tPwPImIPgD0AMDk5mX9KZmZDU/TMHhHHmu/zAJ4GcOcgOmVmg9d3sZOcIHnjhz8D+DyAfMyimXWm5GX8NgBPN/nztQD+PSL+Q22U5dUl43hV7qkyfJUXZ7mp6vfWrVvT9rVr16bt6hyAbEy6Or9AjbvOsmpAj/vO+q5u1/vvv5+2q3MEsr6px4sa737DDTek7er6s/s8mwMAAF5++eW0vU3fxR4RbwH48363N7PRcvRmVgkXu1klXOxmlXCxm1XCxW5WibGaSlrFGRkVAanrzoaJAnlUUjocUsVjaqhndv3qdilqmWw1BDbrmzp9Wu1bta9evbq1TS25/KlPfSptV4+3kiGwan
rvfvmZ3awSLnazSrjYzSrhYjerhIvdrBIudrNKuNjNKjHynD2jctNsGOr27dvTbVVWrYZyZrmpGuKqhpGqLFwtTZz1Xd0udVwmJiaK2rPrV1m0Oi7q3IksZ1fTf6upoFUOr4bIZkOuS4ZrZ49FP7ObVcLFblYJF7tZJVzsZpVwsZtVwsVuVgkXu1klxmrJ5pJt1bTEasy5ypuz7UvHyqtjosaMZ+PhVcavMt2SpaxVu9q3ovad5dFZBg/oaaxV39X5DVnfjx8/nm6bPRads5uZi92sFi52s0q42M0q4WI3q4SL3awSLnazSow0Z48IXLlypbVdjY3O5m7PrhcALl68KPuWUTl9Ro1HV8tFq7HXJftWefGpU6fSdpXDnzlzprVNnQOgxpSXnL/wzjvvpNuq43b06NG0XY13z+ZuePvtt9Nts3NC0nMu0msFQPJJkvMk9y+7bBPJF0gear7nM+6bWed6eRn/QwD3fuSyRwC8GBG3Anix+d3Mxpgs9oh4CcBHX8vdB2Bv8/NeAPcPuF9mNmD9fkC3LSLmAKD5vrXtD0nuJjlDcka9bzaz4Rn6p/ERsScipiNiWg0+MLPh6bfYT5CcAoDm+/zgumRmw9BvsT8H4MHm5wcBPDuY7pjZsMjwmORTAO4GMEnyCIBvAXgcwE9JPgTgtwC+1MvOSKZ5tco2z58/38tu+rrukrHRKu9VOblqV9efrXOuzg9Qa4GfPXs2bT99+nTanuXJ6vwC1Xd1bkSW8at+nzhxIm1Xaxyox1P2eFTj2bNts2Miiz0iHmhp+pza1szGh0+XNauEi92sEi52s0q42M0q4WI3q8TIp5LOogEVVywsLLS2qamg1RBYJYtaVESkhjuqvqnYMNs+i+UAHa2VxJ0AsH79+ta2bMgyoI+bum3ZMFQ1jFTFXydPnkzb1WM5u+1q26yGioa4mtkng4vdrBIudrNKuNjNKuFiN6uEi92sEi52s0qMPGfPph5WuWm/1wvoJXhV1p1Nc62mY1ZDWDdt2pS2qymXs76rYaBqumY1lZg67tnwXLXU9dzcXNquzhHIsvQ333wz3fbIkSNF+96wYUPavmPHjta2bDg1kC8H7SWbzczFblYLF7tZJVzsZpVwsZtVwsVuVgkXu1klRp6zZ5mxyoSzPFmN+Vbjj9XUwps3b25tU1mzypNV1q3Gy2ft6rrVPABqTLm67dm4bTVds8q6VVY+OzvbVxugz8soncMgmx9B1UGWs2f8zG5WCRe7WSVc7GaVcLGbVcLFblYJF7tZJVzsZpUYec6eUZltlrOrJXSzOecBPT65ZN54NT+66ns2ll61q3nf1XLQ6rapsfjZcVVzsx84cCBtP3z4cNqe5fSnTp1Kt1X3mZrDQM1BkFH3icrh28gekXyS5DzJ/csue4zkUZL7mq9dfe3dzEaml38/PwRw7wqXfzci7mi+nh9st8xs0GSxR8RLAPLXPGY29ko+oHuY5OvNy/yNbX9EcjfJGZIzJXPMmVmZfov9+wBuAXAHgDkA3277w4jYExHTETGtPngws+Hpq9gj4kREXImIqwB+AODOwXbLzAatr2InObXs1y8C2N/2t2Y2HmTOTvIpAHcDmCR5BMC3ANxN8g4AAWAWwFd72VlEpFn56tWr0+2z9/zq8wA1blttn70FUdteuHAhbVdZtup7lruqsc9qrL3K0dX86Nm4cDWHgMrhVfv8/Hxrm7rdinpLqh7L2X2uzjfpdzy7LPaIeGCFi5/oa29m1hmfLmtWCRe7WSVc7GaVcLGbVcLFblaJkQ5xJZkuKauGemZxhYqv1DDRqamptH1ycrK1TcUwagleFQNt3Nh6NjKAPP5SxyW7PwAd86ihnGqoaEZFUCqSzO4XNcW2al+/fn3aro6Lekxk1PLibfzMblYJF7tZJVzsZpVwsZtVwsVuVgkXu1klXOxmlRirqaSVLF+8ePFiuq3KTW+++ea0PVuCV00rrPa9ZcuWtF1df9Y3lcmqPFhl2W
qIazaVtTqvQmXR6rhkw3PVtur8BHVuhTqvIzu/QZ370O801X5mN6uEi92sEi52s0q42M0q4WI3q4SL3awSLnazSoxVzl4ytjrLmgGd2W7evDltzzJbNbZZZbLr1q1L29W47iyvPnfuXLqtymzVOQIqK8+mklbHpTRvzs69UI8H1a76pm5bdp+qaajVvtv4md2sEi52s0q42M0q4WI3q4SL3awSLnazSrjYzSox8pw9yy9LxlarXFTNX67asxxe5exq7HS25DKgx5SfPHmyte3SpUvptuq4lewbyG+byvBLzz/IMn41Xl3N5a+ycHX+QfZ4U9tmfcsyePnMTnInyV+SPEjyAMmvN5dvIvkCyUPN93wlAzPrVC8v4y8D+GZE/CmAvwTwNZK3A3gEwIsRcSuAF5vfzWxMyWKPiLmIeK35eQHAQQA3AbgPwN7mz/YCuH9YnTSzcn/UB3QkPw3gswB+DWBbRMwBS/8QAGxt2WY3yRmSM4uLi2W9NbO+9VzsJNcB+BmAb0TE2V63i4g9ETEdEdNqcICZDU9PxU7yOiwV+o8j4ufNxSdITjXtUwDmh9NFMxsEGb1x6bP8JwAcjIjvLGt6DsCDAB5vvj9b2hkVpWQxknrVoOIMFY9lEZWKaVSEtLCwkLar+CubLlpFb6pvanv11iw77up2qX2r+zSL/UpfZaqYWD2Ws4hMPRaz6bmz/faSs98F4CsA3iC5r7nsUSwV+U9JPgTgtwC+1MN1mVlHZLFHxK8AtP0b+txgu2Nmw+LTZc0q4WI3q4SL3awSLnazSrjYzSox8iGuWSassstsGVw1TFQN5VRDHrNcVE3XrK5b5apq2eVsmusskwWArVtXPMv591QOr/LqbJhp6XLSalnkbHt1zNW5E+rxpoZMZ49Hdf5BNry2aIirmX0yuNjNKuFiN6uEi92sEi52s0q42M0q4WI3q8TIc/YsB1RjgLP8MVvOGQB27NiRtqtcNMtdVdasMl2lZNrjLOcG9FLXKm9Wywe/9957aXtG5ejZ+QUAcOHChdY2dX+raa7VcVOy+1SN08+Wos7yfz+zm1XCxW5WCRe7WSVc7GaVcLGbVcLFblYJF7tZJUaes2djjFXOnuWLKstWOfzU1FTaPjk52dqmMlk1ll71XeWu2fzqWdbcy74VleOXnBuh2tVY/OzxVDJmvBcl88qrcxc8nt3MUi52s0q42M0q4WI3q4SL3awSLnazSrjYzSrRy/rsOwH8CMB2AFcB7ImI75F8DMDfA3in+dNHI+L57LoiIs2EVa5akn2ePXs2bVdzv2/fvr21TfVLzTGu5nZXOXvWrrZV1DkCKq8umdNejVdXxzVdq1zMEaDWhs/O+QD0YyI7rup2qQy/TS8n1VwG8M2IeI3kjQBeJflC0/bdiPjnvvZsZiPVy/rscwDmmp8XSB4EcNOwO2Zmg/VHvR4g+WkAnwXw6+aih0m+TvJJkhtbttlNcobkzOLiYlFnzax/PRc7yXUAfgbgGxFxFsD3AdwC4A4sPfN/e6XtImJPRExHxLSaq83MhqenYid5HZYK/ccR8XMAiIgTEXElIq4C+AGAO4fXTTMrJYudS8NongBwMCK+s+zy5cPEvghg/+C7Z2aD0sun8XcB+AqAN0juay57FMADJO8AEABmAXxVXRHJNF4riYnUsMDjx4+n7Wpq4Uw2/BUoH0aqZMNY1eckaqro0lgxW5ZZ3d8qilX3WRa3qmhNxVvqLamK9kqG/qqh4K19Un8QEb8CsFIlpZm6mY0Xn0FnVgkXu1klXOxmlXCxm1XCxW5WCRe7WSVGOpV0RKT5osp0s8xYZY8qVz18+HDaPjs729qmlu8tnUp6w4YNaXuWN6vzD1RerLZXefS7777b2jY3N5duu7CwkLar+zS7ber8AHWfqCGu6vGY5fDqmGc15CWbzczFblYLF7tZJVzsZpVwsZtVwsVuVgkXu1klqPLGge6MfAfA/y27aBJAexDbrXHt27j2C3Df+j
XIvt0cEVtWahhpsX9s5+RMREx31oHEuPZtXPsFuG/9GlXf/DLerBIudrNKdF3sezref2Zc+zau/QLct36NpG+dvmc3s9Hp+pndzEbExW5WiU6KneS9JP+H5GGSj3TRhzYkZ0m+QXIfyZmO+/IkyXmS+5ddtonkCyQPNd9XXGOvo749RvJoc+z2kdzVUd92kvwlyYMkD5D8enN5p8cu6ddIjtvI37OTXAXgfwHcA+AIgFcAPBAR/zXSjrQgOQtgOiI6PwGD5F8DOAfgRxHxZ81l/wTgVEQ83vyj3BgR/zAmfXsMwLmul/FuViuaWr7MOID7AfwdOjx2Sb/+FiM4bl08s98J4HBEvBURlwD8BMB9HfRj7EXESwBOfeTi+wDsbX7ei6UHy8i19G0sRMRcRLzW/LwA4MNlxjs9dkm/RqKLYr8JwO+W/X4E47XeewD4BclXSe7uujMr2BYRc8DSgwfA1o7781FyGe9R+sgy42Nz7PpZ/rxUF8W+0gRb45T/3RURfwHgCwC+1rxctd70tIz3qKywzPhY6Hf581JdFPsRADuX/b4DwLEO+rGiiDjWfJ8H8DTGbynqEx+uoNt8n++4P783Tst4r7TMOMbg2HW5/HkXxf4KgFtJfobk9QC+DOC5DvrxMSQnmg9OQHICwOcxfktRPwfgwebnBwE822Ff/sC4LOPdtsw4Oj52nS9/HhEj/wKwC0ufyL8J4B+76ENLv/4EwG+arwNd9w3AU1h6WfcBll4RPQRgM4AXARxqvm8ao779G4A3ALyOpcKa6qhvf4Wlt4avA9jXfO3q+tgl/RrJcfPpsmaV8Bl0ZpVwsZtVwsVuVgkXu1klXOxmlXCxm1XCxW5Wif8HwjVTKSHGdzMAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 288x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"from PIL import Image\n",
"# Show only the weights of hidden unit 10 as a single 28x28 tile\n",
"image = Image.fromarray(tile_raster_images(X=cur_w.T[10:11], img_shape=(28, 28), tile_shape=(1, 1), tile_spacing=(1, 1)))\n",
"# Plot the image\n",
"plt.rcParams['figure.figsize'] = (4.0, 4.0)\n",
"imgplot = plt.imshow(image)\n",
"imgplot.set_cmap('gray')"
]
},
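{
"cell_type": "markdown",
"metadata": {},
"source": [
"The effect of positive and negative weights described above can be checked with a small worked example. This is a sketch under stated assumptions: the 3x3 weight pattern, the zero bias, and the two toy inputs are made up for illustration and are not taken from the trained network. The activation probability is computed as $\\sigma(v \\cdot w + b)$, the same form as the forward pass used in this notebook. The input that lines up with the positive weights should activate the unit with high probability, while the mirrored input, which overlaps mostly negative weights, should not."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def sigmoid(x):\n",
"    return 1.0 / (1.0 + np.exp(-x))\n",
"\n",
"# Hypothetical weights of one hidden unit over a 3x3 input:\n",
"# positive along the '/' diagonal, negative everywhere else\n",
"w = np.array([[-2.0, -2.0,  2.0],\n",
"              [-2.0,  2.0, -2.0],\n",
"              [ 2.0, -2.0, -2.0]]).ravel()\n",
"hb = 0.0  # hidden bias, assumed zero for simplicity\n",
"\n",
"slash = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]], dtype=float).ravel()\n",
"backslash = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float).ravel()\n",
"\n",
"# Active pixels under positive weights push the probability towards 1,\n",
"# active pixels under negative weights push it towards 0\n",
"p_slash = sigmoid(slash @ w + hb)\n",
"p_backslash = sigmoid(backslash @ w + hb)\n",
"print('P(h=1 | slash)     =', p_slash)\n",
"print('P(h=1 | backslash) =', p_backslash)"
]
},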
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the reconstruction of an image now. Imagine that we have a corrupted image of the digit 3. Let's see if our trained network can fix it:\n",
"\n",
"First we plot the image:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2019-12-29 14:48:03-- https://ibm.box.com/shared/static/vvm1b63uvuxq88vbw9znpwu5ol380mco.jpg\n",
"Resolving ibm.box.com (ibm.box.com)... 107.152.26.197, 107.152.27.197\n",
"Connecting to ibm.box.com (ibm.box.com)|107.152.26.197|:443... connected.\n",
"HTTP request sent, awaiting response... 301 Moved Permanently\n",
"Location: /public/static/vvm1b63uvuxq88vbw9znpwu5ol380mco.jpg [following]\n",
"--2019-12-29 14:48:03-- https://ibm.box.com/public/static/vvm1b63uvuxq88vbw9znpwu5ol380mco.jpg\n",
"Reusing existing connection to ibm.box.com:443.\n",
"HTTP request sent, awaiting response... 301 Moved Permanently\n",
"Location: https://ibm.ent.box.com/public/static/vvm1b63uvuxq88vbw9znpwu5ol380mco.jpg [following]\n",
"--2019-12-29 14:48:03-- https://ibm.ent.box.com/public/static/vvm1b63uvuxq88vbw9znpwu5ol380mco.jpg\n",
"Resolving ibm.ent.box.com (ibm.ent.box.com)... 107.152.27.211, 107.152.26.211\n",
"Connecting to ibm.ent.box.com (ibm.ent.box.com)|107.152.27.211|:443... connected.\n",
"HTTP request sent, awaiting response... 302 Found\n",
"Location: https://public.boxcloud.com/d/1/b1!ZegRC3L_tWG3udcbnfMP5L5EcCbuBid1Oc72Xq5axhOwrVgPYzhsMLAlYpWMF_JKMvnFnQCaCnVre1KEsr_JMBULbCWpoMEhon2u4CK0SAlsSAh0pD5zBNyWNwUZepo6N0kEoBSjNggk-iBx3JE5gzI8MCyVPBP_xA21O0Ov_kWpfoswW3I9FaMZyzu9RyZvcLuWtTsqIuMrx64Aw2bjrfkLjYc2gQ8FBxTROfdgJf_dvrYgUH7YXsY-jNZhRP3H83crw9AntCaUNZVkQHiW4m-TL-mPVQOzAjQUHT358zZIdfS-QIrkg_UG6YQuMTEQAKzrWznJKsGRrzNmffGRe_RQb-mFKMGvakrNM-_784COdXOppqjhIhds6yBFu2MhMGqMFzCtLdq1f2OlDd9HhrSazUtHumoLDsQEAggpYgYxu35uqdPW-zAm5Q9rTTrYJVDWBS5fdabc0QTWO-QIkQX499bXeb9xWs0PHVBeMzHwwUSU13sFqHN2_9Us9GxieWP-gPFyKELVwCb8vNEDM7L3z-0lhhiRF227f6mGqiO7wBgLnUoogmLjc5C9lRb3XWx5nzKLM3hJ4zS8faWYz0cBCWSCBT_va1l8kJV8tOVgaL-J8ln-AvfBKDJJ8ZJgDshXa7yYhgTnEGV6t3vhgYkqmAm9mRZ2IdhgF9r6C4S2apTXVDgNxnlB0nkfeNpTLJzvewNmjlrBbTz_WE2-3ScV1N37A60Iz0ocbjdetkF_80A3mZIit0X7dtS174DiInXbQY_kLAPjKkYY4ghWsBLomneQOwNwn8ZwaaM-rOnqg--XswU0Zxj_YY_If2xFi9ibsB7mIhjlJzZwDOJKynzYV6YI21fj6unn5_VU-63QpFLkwoSYiTiWJoOfBpT3YmBTCWcYs9kMRig6FmluNluZZYBKoRqVGlpU2QnvgK8MPk9UAOrbMeVhUAA4S-8YYtEhZyAeY8ul525lH4U0zJPdZHwoY1c3FovtrhlLulMl3HWgcjuwA4ph1XHv2fyYs_M596q7rJ-PtZUXql1x1WqMv-Qy1ZjQUwIwIJ-54bMZkuLrF6224dTtZaC-bHkGZTT1Z15l6HpCwY6dOQ2MwswTOufCQ7UYhpE3MfmKKtTye474PFPr61sLCnJPsMXfXvOgxj91yF37olckaOJk-dcdZQeE3v3TfsMzwohcar5lvgVjjHhtYAortUMF-yFCoNrUQ06KlaBivup_J_CFr-mjhIVe20gVMZ_yet0q4gM6oVIdEjtFefp-26nd5iRp2HWu9o3Cv3SJkbSkGd_TkOf4x1XaFCs0Wu215CpEz7OHWrDNYTZap-6enTg9dBdrQU_-fjtUQLnq7yJVb4ZPUYTgrZSASpCp_2-s9W-kADMW/download [following]\n",
"--2019-12-29 14:48:04-- https://public.boxcloud.com/d/1/b1!ZegRC3L_tWG3udcbnfMP5L5EcCbuBid1Oc72Xq5axhOwrVgPYzhsMLAlYpWMF_JKMvnFnQCaCnVre1KEsr_JMBULbCWpoMEhon2u4CK0SAlsSAh0pD5zBNyWNwUZepo6N0kEoBSjNggk-iBx3JE5gzI8MCyVPBP_xA21O0Ov_kWpfoswW3I9FaMZyzu9RyZvcLuWtTsqIuMrx64Aw2bjrfkLjYc2gQ8FBxTROfdgJf_dvrYgUH7YXsY-jNZhRP3H83crw9AntCaUNZVkQHiW4m-TL-mPVQOzAjQUHT358zZIdfS-QIrkg_UG6YQuMTEQAKzrWznJKsGRrzNmffGRe_RQb-mFKMGvakrNM-_784COdXOppqjhIhds6yBFu2MhMGqMFzCtLdq1f2OlDd9HhrSazUtHumoLDsQEAggpYgYxu35uqdPW-zAm5Q9rTTrYJVDWBS5fdabc0QTWO-QIkQX499bXeb9xWs0PHVBeMzHwwUSU13sFqHN2_9Us9GxieWP-gPFyKELVwCb8vNEDM7L3z-0lhhiRF227f6mGqiO7wBgLnUoogmLjc5C9lRb3XWx5nzKLM3hJ4zS8faWYz0cBCWSCBT_va1l8kJV8tOVgaL-J8ln-AvfBKDJJ8ZJgDshXa7yYhgTnEGV6t3vhgYkqmAm9mRZ2IdhgF9r6C4S2apTXVDgNxnlB0nkfeNpTLJzvewNmjlrBbTz_WE2-3ScV1N37A60Iz0ocbjdetkF_80A3mZIit0X7dtS174DiInXbQY_kLAPjKkYY4ghWsBLomneQOwNwn8ZwaaM-rOnqg--XswU0Zxj_YY_If2xFi9ibsB7mIhjlJzZwDOJKynzYV6YI21fj6unn5_VU-63QpFLkwoSYiTiWJoOfBpT3YmBTCWcYs9kMRig6FmluNluZZYBKoRqVGlpU2QnvgK8MPk9UAOrbMeVhUAA4S-8YYtEhZyAeY8ul525lH4U0zJPdZHwoY1c3FovtrhlLulMl3HWgcjuwA4ph1XHv2fyYs_M596q7rJ-PtZUXql1x1WqMv-Qy1ZjQUwIwIJ-54bMZkuLrF6224dTtZaC-bHkGZTT1Z15l6HpCwY6dOQ2MwswTOufCQ7UYhpE3MfmKKtTye474PFPr61sLCnJPsMXfXvOgxj91yF37olckaOJk-dcdZQeE3v3TfsMzwohcar5lvgVjjHhtYAortUMF-yFCoNrUQ06KlaBivup_J_CFr-mjhIVe20gVMZ_yet0q4gM6oVIdEjtFefp-26nd5iRp2HWu9o3Cv3SJkbSkGd_TkOf4x1XaFCs0Wu215CpEz7OHWrDNYTZap-6enTg9dBdrQU_-fjtUQLnq7yJVb4ZPUYTgrZSASpCp_2-s9W-kADMW/download\n",
"Resolving public.boxcloud.com (public.boxcloud.com)... 107.152.24.200\n",
"Connecting to public.boxcloud.com (public.boxcloud.com)|107.152.24.200|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 24383 (24K) [image/jpeg]\n",
"Saving to: ‘destructed3.jpg’\n",
"\n",
"destructed3.jpg 100%[===================>] 23.81K --.-KB/s in 0.06s \n",
"\n",
"2019-12-29 14:48:04 (417 KB/s) - ‘destructed3.jpg’ saved [24383/24383]\n",
"\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=181x181 at 0x7FAB178B7EB8>"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"!wget -O destructed3.jpg https://ibm.box.com/shared/static/vvm1b63uvuxq88vbw9znpwu5ol380mco.jpg\n",
"img = Image.open('destructed3.jpg')\n",
"img"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's pass this image through the net:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Convert the image to 28x28 grayscale, flatten it to shape (1, 784), and scale pixel values to [0, 1]\n",
"sample_case = np.array(img.convert('I').resize((28,28))).ravel().reshape((1, -1))/255.0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Feed the sample case into the network and reconstruct the output:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Visible -> hidden: probability that each hidden unit is activated\n",
"hh0_p = tf.nn.sigmoid(tf.matmul(v0_state, W) + hb)\n",
"# Deterministic rounding is used here instead of random sampling:\n",
"# hh0_s = tf.nn.relu(tf.sign(hh0_p - tf.random_uniform(tf.shape(hh0_p))))\n",
"hh0_s = tf.round(hh0_p)\n",
"hh0_p_val, hh0_s_val = sess.run((hh0_p, hh0_s), feed_dict={v0_state: sample_case, W: prv_w, hb: prv_hb})\n",
"print(\"Activation probabilities of the hidden nodes:\", hh0_p_val)\n",
"print(\"Activated nodes in the hidden layer:\", hh0_s_val)\n",
"\n",
"# Hidden -> visible: reconstruct the input from the hidden states\n",
"vv1_p = tf.nn.sigmoid(tf.matmul(hh0_s, tf.transpose(W)) + vb)\n",
"rec_prob = sess.run(vv1_p, feed_dict={hh0_s: hh0_s_val, W: prv_w, vb: prv_vb})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we plot the reconstructed image:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Render the reconstructed probabilities as a single 28x28 grayscale tile\n",
"img = Image.fromarray(tile_raster_images(X=rec_prob, img_shape=(28, 28), tile_shape=(1, 1), tile_spacing=(1, 1)))\n",
"plt.rcParams['figure.figsize'] = (4.0, 4.0)\n",
"imgplot = plt.imshow(img)\n",
"imgplot.set_cmap('gray')"
]
},
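{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reconstruction pass above can also be written in plain NumPy, which makes the two matrix products explicit. This is only a sketch: the function name <b>reconstruct</b> is hypothetical, and the random weights and input are stand-ins for the trained <b>prv_w</b>, <b>prv_hb</b>, <b>prv_vb</b> and <b>sample_case</b>, so the output is not a meaningful digit."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def sigmoid(x):\n",
"    return 1.0 / (1.0 + np.exp(-x))\n",
"\n",
"def reconstruct(v, W, hb, vb):\n",
"    # Visible -> hidden: activation probabilities, then deterministic rounding to 0/1\n",
"    h_prob = sigmoid(v @ W + hb)\n",
"    h_state = np.round(h_prob)\n",
"    # Hidden -> visible: reuse the same weights, transposed\n",
"    v_prob = sigmoid(h_state @ W.T + vb)\n",
"    return h_state, v_prob\n",
"\n",
"rng = np.random.RandomState(0)\n",
"W_toy = rng.normal(0, 0.1, size=(784, 50))  # random stand-in, not trained weights\n",
"v_toy = rng.uniform(size=(1, 784))          # random stand-in for sample_case\n",
"h_state, v_rec = reconstruct(v_toy, W_toy, np.zeros(50), np.zeros(784))\n",
"print(h_state.shape, v_rec.shape)  # (1, 50) (1, 784)"
]
},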
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>\n",
"\n",
"## Want to learn more?\n",
"\n",
"Running deep learning programs usually needs a high performance platform. __PowerAI__ speeds up deep learning and AI. Built on IBM’s Power Systems, __PowerAI__ is a scalable software platform that accelerates deep learning and AI with blazing performance for individual users or enterprises. The __PowerAI__ platform supports popular machine learning libraries and dependencies including TensorFlow, Caffe, Torch, and Theano. You can use [PowerAI on IBM Cloud](https://cocl.us/ML0120EN_PAI).\n",
"\n",
"Also, you can use __Watson Studio__ to run these notebooks faster with bigger datasets. __Watson Studio__ is IBM’s leading cloud solution for data scientists, built by data scientists. With Jupyter notebooks, RStudio, Apache Spark and popular libraries pre-packaged in the cloud, __Watson Studio__ enables data scientists to collaborate on their projects without having to install anything. Join the fast-growing community of __Watson Studio__ users today with a free account at [Watson Studio](https://cocl.us/ML0120EN_DSX). This is the end of this lesson. Thank you for reading this notebook, and good luck with your studies."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Thanks for completing this lesson!\n",
"\n",
"Notebook created by: <a href = \"https://ca.linkedin.com/in/saeedaghabozorgi\">Saeed Aghabozorgi</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### References:\n",
"https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine  \n",
"http://deeplearning.net/tutorial/rbm.html  \n",
"http://www.cs.utoronto.ca/~hinton/absps/netflixICML.pdf  \n",
"http://imonad.com/rbm/restricted-boltzmann-machine/  \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<hr>\n",
"\n",
"Copyright &copy; 2018 [Cognitive Class](https://cocl.us/DX0108EN_CC). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python",
"language": "python",
"name": "conda-env-python-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
},
"widgets": {
"state": {},
"version": "1.1.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}