RBM procedure using tensorflow
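import tensorflow as tf
import numpy as np
import input_data
from PIL import Image
from util import tile_raster_images


def sample_prob(probs):
    # Draw 0/1 samples: sign(probs - u) is +1 where u < probs, and relu maps -1 to 0.
    return tf.nn.relu(
        tf.sign(
            probs - tf.random_uniform(tf.shape(probs))))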

alpha = 1.0
batchsize = 100

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
trX, trY, teX, teY = mnist.train.images, mnist.train.labels, mnist.test.images,\
    mnist.test.labels

X = tf.placeholder("float", [None, 784])
Y = tf.placeholder("float", [None, 10])

# The parameters are placeholders: they are fed in and updated outside the graph.
rbm_w = tf.placeholder("float", [784, 500])
rbm_vb = tf.placeholder("float", [784])
rbm_hb = tf.placeholder("float", [500])
# One step of contrastive divergence: sample h0 from the data, reconstruct v1, then compute h1.
h0 = sample_prob(tf.nn.sigmoid(tf.matmul(X, rbm_w) + rbm_hb))
v1 = sample_prob(tf.nn.sigmoid(
    tf.matmul(h0, tf.transpose(rbm_w)) + rbm_vb))
h1 = tf.nn.sigmoid(tf.matmul(v1, rbm_w) + rbm_hb)
w_positive_grad = tf.matmul(tf.transpose(X), h0)
w_negative_grad = tf.matmul(tf.transpose(v1), h1)
# The weight update is averaged over the batch.
update_w = rbm_w + alpha * \
    (w_positive_grad - w_negative_grad) / tf.to_float(tf.shape(X)[0])
update_vb = rbm_vb + alpha * tf.reduce_mean(X - v1, 0)
update_hb = rbm_hb + alpha * tf.reduce_mean(h0 - h1, 0)

# Mean squared reconstruction error after one Gibbs step, used for monitoring.
h_sample = sample_prob(tf.nn.sigmoid(tf.matmul(X, rbm_w) + rbm_hb))
v_sample = sample_prob(tf.nn.sigmoid(
    tf.matmul(h_sample, tf.transpose(rbm_w)) + rbm_vb))
err = X - v_sample
err_sum = tf.reduce_mean(err * err)

sess = tf.Session()
init = tf.initialize_all_variables()
sess.run(init)

n_w = np.zeros([784, 500], np.float32)
n_vb = np.zeros([784], np.float32)
n_hb = np.zeros([500], np.float32)
o_w = np.zeros([784, 500], np.float32)
o_vb = np.zeros([784], np.float32)
o_hb = np.zeros([500], np.float32)
# Reconstruction error before training.
print(sess.run(
    err_sum, feed_dict={X: trX, rbm_w: o_w, rbm_vb: o_vb, rbm_hb: o_hb}))

for start, end in zip(
        range(0, len(trX), batchsize), range(batchsize, len(trX), batchsize)):
    batch = trX[start:end]
    n_w = sess.run(update_w, feed_dict={
        X: batch, rbm_w: o_w, rbm_vb: o_vb, rbm_hb: o_hb})
    n_vb = sess.run(update_vb, feed_dict={
        X: batch, rbm_w: o_w, rbm_vb: o_vb, rbm_hb: o_hb})
    n_hb = sess.run(update_hb, feed_dict={
        X: batch, rbm_w: o_w, rbm_vb: o_vb, rbm_hb: o_hb})
    o_w = n_w
    o_vb = n_vb
    o_hb = n_hb
    if start % 10000 == 0:
        print(sess.run(
            err_sum, feed_dict={X: trX, rbm_w: n_w, rbm_vb: n_vb, rbm_hb: n_hb}))
        # Save the learned filters as a tiled image.
        image = Image.fromarray(
            tile_raster_images(
                X=n_w.T,
                img_shape=(28, 28),
                tile_shape=(25, 20),
                tile_spacing=(1, 1)
            )
        )
        image.save("rbm_%d.png" % (start // 10000))

myme5261314 commented Feb 17, 2016

The reconstruction error printed by my attempt is:


First, here's the corresponding weight visualization from DeepLearnToolbox with the same configuration.


And here are the six image files. Apparently something is wrong with the code, but I don't know how to fix it.


myme5261314 commented Feb 19, 2016

So, now I've implemented it correctly. The problems in the previous revision were the incorrect use of assign and the missing division by the batch size for rbm_w. I've now confirmed that the final result is right. See the weights visualization below.
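
To illustrate why the division by the batch size matters, here is a minimal NumPy sketch (not the gist's TensorFlow code). The arrays and the helper names `summed_stat` and `averaged_stat` are made up for the example. The raw product of the transposed visible batch with the hidden batch is a sum over examples, so it grows with the batch size, while the averaged version does not.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.random((4, 6))  # hypothetical batch: 4 visible vectors of length 6
h = rng.random((4, 3))  # matching hidden activations of length 3


def summed_stat(v, h):
    # Sum of outer products over the batch; scales with the number of examples.
    return v.T @ h


def averaged_stat(v, h):
    # Mean of outer products over the batch; independent of the batch size.
    return v.T @ h / v.shape[0]


# Repeat the same examples to double the batch size.
v2 = np.concatenate([v, v])
h2 = np.concatenate([h, h])

print(np.allclose(summed_stat(v2, h2), 2 * summed_stat(v, h)))
print(np.allclose(averaged_stat(v2, h2), averaged_stat(v, h)))
```

Without the division, the effective step size for rbm_w would depend on the batch size, while the bias updates (which already use a mean) would not.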


myme5261314 commented Feb 19, 2016

The current revision is still far from how an RBM would be used with TensorFlow in practice.

  1. There is redundancy in the computation of the data flow graph (calculating rbm_w, rbm_vb, rbm_hb).
  2. There is too much context switching between GPU computation and memory operations (feeding o_w and assigning o_w).
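
One way to picture removing the redundancy is to compute all three updates from a single Gibbs pass, so the hidden and visible samples are drawn once and shared. The following is a hedged NumPy sketch of that idea, not the author's TensorFlow implementation. The function names `sigmoid`, `sample`, and `cd1_step` are assumptions made for illustration, and the update rules simply mirror the ones in the gist.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample(probs, rng):
    # Bernoulli draw: 1 where a uniform number falls below the probability.
    return (rng.uniform(size=probs.shape) < probs).astype(probs.dtype)


def cd1_step(v0, w, vb, hb, rng, alpha=1.0):
    """One contrastive divergence step; returns updated (w, vb, hb)."""
    h0 = sample(sigmoid(v0 @ w + hb), rng)    # hidden sample given the data
    v1 = sample(sigmoid(h0 @ w.T + vb), rng)  # reconstruction
    h1 = sigmoid(v1 @ w + hb)                 # hidden probabilities, as in the gist
    n = v0.shape[0]
    # All three updates reuse the same h0, v1, h1.
    new_w = w + alpha * (v0.T @ h0 - v1.T @ h1) / n
    new_vb = vb + alpha * (v0 - v1).mean(axis=0)
    new_hb = hb + alpha * (h0 - h1).mean(axis=0)
    return new_w, new_vb, new_hb


rng = np.random.default_rng(0)
# Hypothetical stand-in for one MNIST batch: random binary vectors.
batch = (rng.uniform(size=(100, 784)) < 0.1).astype(np.float32)
w = np.zeros((784, 500), np.float32)
vb = np.zeros(784, np.float32)
hb = np.zeros(500, np.float32)
w, vb, hb = cd1_step(batch, w, vb, hb, rng)
print(w.shape, vb.shape, hb.shape)
```

In the gist, each of the three updates is evaluated in a separate run, so the sampling chain is recomputed (and resampled) every time; sharing one pass avoids that.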

keskival commented Apr 3, 2016

Thank you, a nice and clean example!

Google suggests disabling the automatic calculation of gradients. I'm not sure, but I believe the gradients are calculated automatically throughout the calculation steps so they can be used later in backpropagation (and not lazily, only if backpropagation is actually performed). As the gradients are calculated manually here, the automatic calculation should be switched off.

hanhongsun commented Apr 11, 2016

Thank you for sharing. This is a great example of doing customized updates in TensorFlow.

Cospel commented Apr 13, 2016

Wow, that is a great example, thank you. Do you know how to remove the redundancy in the graph?

I'm trying to implement pretraining of autoencoders with an RBM:

hanhongsun commented Apr 21, 2016

Share my code here, everything was done in one

madzhen commented Sep 11, 2016

Line 29 is wrong: h1 = tf.nn.sigmoid(tf.matmul(v1, rbm_w) + rbm_hb) should be h1 = sample_prob(tf.nn.sigmoid(tf.matmul(v1, rbm_w) + rbm_hb))

jxyyjm commented May 4, 2017

Your sample_prob function doesn't actually sample according to the given probabilities, does it?
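
A quick way to check this kind of concern is to compare two candidate formulations numerically. The sketch below uses NumPy rather than TensorFlow, and the helper names `sample_bernoulli` and `relu_of_difference` are hypothetical. It assumes the intent is a Bernoulli draw: passing `probs - u` through `sign` and then `relu` gives 0/1 values whose mean tracks `probs`, whereas applying `relu` to the raw difference gives real-valued outputs that are not samples.

```python
import numpy as np


def sample_bernoulli(probs, rng):
    """Return 0/1 values; each entry is 1 with probability probs."""
    u = rng.uniform(size=probs.shape)
    # sign(p - u) is +1 when u < p and -1 when u > p; clipping at 0 leaves {0, 1}.
    return np.maximum(np.sign(probs - u), 0.0)


def relu_of_difference(probs, rng):
    """relu(p - u) without the sign step: real-valued, not binary."""
    u = rng.uniform(size=probs.shape)
    return np.maximum(probs - u, 0.0)


rng = np.random.default_rng(0)
probs = np.full(100_000, 0.3)
samples = sample_bernoulli(probs, rng)
soft = relu_of_difference(probs, rng)

print("binary sample mean:", samples.mean())  # should be near 0.3
print("relu-only mean:", soft.mean())         # should be near 0.3**2 / 2
```

If the sampling function under discussion includes the `sign` step, it behaves like the first helper; if not, it behaves like the second.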
