@nishidy /mp.py
Last active Jul 9, 2017

TensorFlow Multilayer Perceptron (2 hidden layers with 100 nodes each)
from tensorflow.examples.tutorials.mnist import input_data
# one_hot is for multiclass classification
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
import tensorflow as tf
# Input
# 784 = 28 x 28
# Mini batch size is not fixed here
x = tf.placeholder(tf.float32, [None,784])
# Weight
# XXX All-ones initialization does not work (every unit in a layer would compute the same value)
#W1 = tf.Variable(tf.ones([784,100]))
#W2 = tf.Variable(tf.ones([100,100]))
#W3 = tf.Variable(tf.ones([100,10]))
# Truncated normal distribution
# tf.truncated_normal(shape, mean, stddev)
W1 = tf.Variable(tf.truncated_normal([784,100],0,0.1))
W2 = tf.Variable(tf.truncated_normal([100,100],0,0.1))
W3 = tf.Variable(tf.truncated_normal([100,10],0,0.1))
# bias
b1 = tf.Variable(tf.zeros([100]))
b2 = tf.Variable(tf.zeros([100]))
b3 = tf.Variable(tf.zeros([10]))
# input : tf.matmul(x,W) + b = u
# output : tf.nn.softmax(u)
z1 = tf.nn.relu(tf.matmul(x,W1)+b1)
z2 = tf.nn.relu(tf.matmul(z1,W2)+b2)
y = tf.nn.softmax(tf.matmul(z2,W3)+b3)
# Define the loss function as cross entropy
# reduction_indices=[1] sums over the class axis, giving one loss value per example
y_ = tf.placeholder(tf.float32,[None,10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_*tf.log(y), reduction_indices=[1]))
# Training method
# learning rate = 0.5
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# For parameter initialization
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x:batch_xs, y_:batch_ys})
    if i%50==0:
        print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

I used your code, but with 10000 iterations of mini-batch gradient descent instead of 1000. I got different results on the GPU and on the CPU: the GPU run returned 0.9xxx, while the CPU run returned only 0.3xxx.
Do you have any idea about this problem?
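
The loss in mp.py applies tf.log directly to the softmax output, so a predicted probability that underflows to zero produces an infinite or NaN loss. Whether that explains the GPU/CPU gap reported here is not established; the sketch below only illustrates the log-sum-exp form that avoids the problem. It is plain Python rather than TensorFlow, and the function names are hypothetical. In TensorFlow 1.x, tf.nn.softmax_cross_entropy_with_logits(labels=..., logits=...) provides a numerically stable version of the same loss.

```python
import math

def log_softmax(logits):
    """Log of softmax, computed with the log-sum-exp trick."""
    m = max(logits)  # shift by the max so exp() never overflows
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]

def cross_entropy(logits, one_hot):
    """Cross entropy between a one-hot label and softmax(logits)."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))

def naive_cross_entropy(logits, one_hot):
    """Same quantity, with log applied to the softmax output as in mp.py."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Moderate logits: both forms give the same loss up to rounding.
print(cross_entropy([2.0, 1.0, 0.1], [1.0, 0.0, 0.0]))
print(naive_cross_entropy([2.0, 1.0, 0.1], [1.0, 0.0, 0.0]))

# Extreme logits: the probability of the true class underflows to 0.
extreme = [0.0, -1000.0]
print(cross_entropy(extreme, [0.0, 1.0]))  # finite loss
try:
    naive_cross_entropy(extreme, [0.0, 1.0])
except ValueError as err:
    # math.log(0.0) raises in pure Python; tf.log(0) would return -inf instead
    print("naive form failed:", err)
```

The stable form never takes the log of a probability; it works on the raw pre-softmax values (u in the comments of mp.py), which is why the TensorFlow function takes logits rather than the output of tf.nn.softmax.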

Owner

nishidy commented Jul 9, 2017

I'm sorry I did not notice your comment.
Please refer to the following result, which I got on my laptop (a MacBook Pro).
I remember that tf.truncated_normal greatly improved the accuracy.

$ python MNIST_2_hidden.py 
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
0.2543
0.8396
0.9029
0.9011
0.9283
0.927
0.9309
0.9469
0.946
0.9523
0.9535
0.9393
0.9571
0.9571
0.9582
0.9605
0.9634
0.9608
0.9541
0.9646
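
On the remark that tf.truncated_normal greatly improved the accuracy: one reason the commented-out all-ones weights cannot work is that every unit in a layer then computes exactly the same value, so the units have nothing to distinguish them. The sketch below shows this for a single tiny layer in plain Python. The helper names, the 3-input / 4-unit sizes, and the input vector are assumptions chosen for illustration; truncated_normal here is a stand-in that mimics the documented behaviour of tf.truncated_normal (values more than 2 standard deviations from the mean are redrawn).

```python
import random

def truncated_normal(rng, mean, std):
    """Draw from N(mean, std), redrawing values beyond 2 std from the mean."""
    while True:
        v = rng.gauss(mean, std)
        if abs(v - mean) <= 2 * std:
            return v

def pre_activations(x, W):
    """Compute x @ W for one input vector x and a weight matrix W (list of rows)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

rng = random.Random(0)   # fixed seed so the run is repeatable
x = [0.2, 0.5, 0.1]      # hypothetical input with 3 features
n_hidden = 4             # hypothetical layer width

W_ones = [[1.0] * n_hidden for _ in x]
W_rand = [[truncated_normal(rng, 0.0, 0.1) for _ in range(n_hidden)] for _ in x]

u_ones = pre_activations(x, W_ones)  # all entries identical
u_rand = pre_activations(x, W_rand)  # entries differ from unit to unit

print(u_ones)
print(u_rand)
```

With all-ones weights the four units are indistinguishable, while the small random weights give each unit a different starting point.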