Last active
July 9, 2017 07:46
TensorFlow Multilayer Perceptron (2 hidden layers with 100 nodes each)
from tensorflow.examples.tutorials.mnist import input_data
# one_hot=True encodes each label as a 10-element one-hot vector (multiclass classification)
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
import tensorflow as tf
# Input
# 784 = 28 x 28
# Mini batch size is not fixed here (None)
x = tf.placeholder(tf.float32, [None,784])
# Weight
# XXX Ones does not work...
#W1 = tf.Variable(tf.ones([784,100]))
#W2 = tf.Variable(tf.ones([100,100]))
#W3 = tf.Variable(tf.ones([100,10]))
# Truncated normal distribution
# tf.truncated_normal(shape, mean, stddev): here mean=0, stddev=0.1
W1 = tf.Variable(tf.truncated_normal([784,100],0,0.1))
W2 = tf.Variable(tf.truncated_normal([100,100],0,0.1))
W3 = tf.Variable(tf.truncated_normal([100,10],0,0.1))
# bias
b1 = tf.Variable(tf.zeros([100]))
b2 = tf.Variable(tf.zeros([100]))
b3 = tf.Variable(tf.zeros([10]))
# input to each layer : u = tf.matmul(x,W) + b
# hidden layers       : tf.nn.relu(u)
# output layer        : tf.nn.softmax(u)
z1 = tf.nn.relu(tf.matmul(x,W1)+b1)
z2 = tf.nn.relu(tf.matmul(z1,W2)+b2)
y = tf.nn.softmax(tf.matmul(z2,W3)+b3)
# Define that loss function is cross entropy
# reduction_indices=[1] sums over the class axis (one loss per example);
# reduce_mean then averages over the mini batch
y_ = tf.placeholder(tf.float32,[None,10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_*tf.log(y), reduction_indices=[1]))
# Training method
# learning rate = 0.5
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# For parameter initialization
# (tf.initialize_all_variables() is the older, deprecated name for this op)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# Accuracy: fraction of examples whose predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# 1000 mini-batch steps of 100 examples; print test accuracy every 50 steps
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x:batch_xs, y_:batch_ys})
    if i%50==0:
        print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
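The loss line in the script, `tf.reduce_mean(-tf.reduce_sum(y_*tf.log(y), reduction_indices=[1]))`, first sums over the class axis for each example and then averages over the mini batch. The following is a minimal pure-Python sketch of that same arithmetic, not the TensorFlow implementation. The helper names `per_example_losses` and `cross_entropy` and the toy 3-class data are assumptions made for illustration; the gist itself uses 10 classes.

```python
import math

def per_example_losses(y_true, y_pred):
    """For each example, compute -sum_k y_true[k] * log(y_pred[k]).

    This mirrors the reduce_sum over axis 1 (the class axis).
    """
    return [
        -sum(t * math.log(p) for t, p in zip(t_row, p_row))
        for t_row, p_row in zip(y_true, y_pred)
    ]

def cross_entropy(y_true, y_pred):
    """Average the per-example losses, mirroring reduce_mean over the batch."""
    losses = per_example_losses(y_true, y_pred)
    return sum(losses) / len(losses)

# Hypothetical toy batch: two examples, three classes, one-hot labels.
labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]

loss = cross_entropy(labels, probs)
print(loss)
```

Because the labels are one-hot, only the log-probability of the correct class contributes to each example's loss.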
I'm sorry, I did not notice your comment.
Please refer to the following result, which I got on my laptop (MBP).
As I remember, initializing the weights with tf.truncated_normal greatly improved the accuracy.
$ python MNIST_2_hidden.py
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
0.2543
0.8396
0.9029
0.9011
0.9283
0.927
0.9309
0.9469
0.946
0.9523
0.9535
0.9393
0.9571
0.9571
0.9582
0.9605
0.9634
0.9608
0.9541
0.9646
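A likely reason the all-ones initialization marked `XXX` in the script does not train, while `tf.truncated_normal` does, is symmetry: if every weight in a layer starts at the same value, every unit in that layer computes the same output and receives the same update, so the units never become different from each other. The sketch below shows the forward half of that argument in pure Python. It is an illustration under assumptions, not a reproduction of the gist: the layer sizes are made up, `pre_activations` is a hypothetical helper, and `random.gauss` stands in for `tf.truncated_normal` (which additionally redraws values more than two standard deviations from the mean).

```python
import random

def pre_activations(x, W, b):
    """Compute u = x @ W + b for a single input vector x."""
    return [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]

random.seed(0)
n_in, n_hidden = 5, 4  # toy sizes; the gist uses 784 and 100
x = [random.random() for _ in range(n_in)]
b = [0.0] * n_hidden

# All-ones weights: every column of W is the same.
W_ones = [[1.0] * n_hidden for _ in range(n_in)]
# Small random weights (mean 0, stddev 0.1), as a stand-in for truncated normal.
W_rand = [[random.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]

u_ones = pre_activations(x, W_ones, b)
u_rand = pre_activations(x, W_rand, b)

print("distinct values with ones init:  ", len(set(u_ones)))
print("distinct values with random init:", len(set(u_rand)))
```

With identical columns, all hidden units are copies of one another, so a 100-unit layer behaves like a single unit. Random initialization breaks that tie from the first step.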
I used your code, but with 10000 iterations of mini-batch gradient descent instead of 1000. I got different results on a GPU device and a CPU device: the GPU device returned 0.9xxx, while the CPU device returned only 0.3xxx.
Do you have any idea about this problem?
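One possible contributor, which nothing in this thread confirms, is numerical instability in the loss: the script computes `tf.log(y)` on the output of `tf.nn.softmax`, so if a probability underflows to 0 the log becomes -inf and the loss can turn into NaN. Whether that happens can depend on small floating-point differences between devices. The sketch below illustrates the general issue in pure Python and is not a diagnosis of the run described above. The function names and the extreme logits are assumptions chosen to trigger the problem; Python raises an exception here where TensorFlow would typically produce inf or NaN silently.

```python
import math

def naive_log_softmax(logits):
    """log(softmax(u)) computed in two steps: softmax first, then log."""
    exps = [math.exp(u) for u in logits]
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def stable_log_softmax(logits):
    """Same quantity via log-sum-exp: u - max(u) - log(sum(exp(u - max(u))))."""
    m = max(logits)
    log_sum = math.log(sum(math.exp(u - m) for u in logits))
    return [u - m - log_sum for u in logits]

# Hypothetical extreme logits, chosen so the naive version breaks.
logits = [0.0, 800.0, -800.0]

try:
    naive = naive_log_softmax(logits)
except (OverflowError, ValueError) as err:
    naive = None
    print("naive version failed:", err)

stable = stable_log_softmax(logits)
print("stable version:", stable)
```

In TensorFlow 1.x the usual way to get the stable form is to pass the pre-softmax values `tf.matmul(z2,W3)+b3` to `tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=...)` and take `tf.reduce_mean` of the result, instead of taking `tf.log` of the softmax output. Lowering the learning rate from 0.5 is another thing worth trying for a 10000-step run.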