A mix of an autoencoder and a classifier with TensorFlow
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# An autoencoder with classification for MNIST\n",
"Or\n",
"\n",
"## How to create an autoencoder and monitor its embedded space via a classifier\n",
"\n",
"This question was raised on Stack Overflow some weeks ago for *Keras*, and I thought it was a good way to reuse elements from [my recent book](http://blog.audio-tk.com/2018/09/04/book-building-machine-learning-systems-withpython-third-edition/).\n",
"\n",
"We will first train an autoencoder on the MNIST dataset. As a second step, we will add a classifier on top of it that tells us how good the embedded space is for classification purposes.\n",
"\n",
"So what we will reuse is:\n",
"* the autoencoder concept\n",
"* part of the discriminator and the generator from our GAN\n",
"* the MNIST classifier final layers\n",
"* TensorBoard usage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieving the data\n",
"\n",
"Let's start with some common code for the two steps."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"import numpy as np\n",
"from matplotlib import pyplot as plt\n",
"from matplotlib import offsetbox\n",
"from sklearn.metrics import confusion_matrix\n",
"\n",
"%matplotlib inline\n",
"\n",
"n_epochs = 10\n",
"learning_rate = 0.0002\n",
"batch_size = 128\n",
"image_shape = [28,28,1]\n",
"step = batch_size * 100\n",
"dim_W1 = 128\n",
"dim_W2 = 64\n",
"dim_W3 = 32\n",
"dim_C1 = 16\n",
"\n",
"dim_embedded = 2"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def plot_confusion_matrix(cm, genre_list, title):\n",
"    plt.figure(num=None, figsize=(5, 4))\n",
"    ax = plt.axes()\n",
"    im = ax.matshow(cm, cmap='Blues', vmin=0, vmax=1.0)\n",
"    ax.set_xticks(range(len(genre_list)))\n",
"    ax.set_xticklabels(genre_list)\n",
"    ax.xaxis.set_ticks_position(\"bottom\")\n",
"    ax.set_yticks(range(len(genre_list)))\n",
"    ax.set_yticklabels(genre_list)\n",
"    ax.tick_params(axis='both', which='both', bottom=False, left=False)\n",
"    plt.title(title)\n",
"    plt.colorbar(im, ax=ax)\n",
"    plt.grid(False)\n",
"    plt.xlabel('Predicted class')\n",
"    plt.ylabel('True class')\n",
"\n",
"def plot_embedding(X, labels, title=None):\n",
"    x_min, x_max = np.min(X, 0), np.max(X, 0)\n",
"    X = (X - x_min) / (x_max - x_min)\n",
"\n",
"    plt.figure()\n",
"    ax = plt.subplot(111)\n",
"    for i in range(X.shape[0]):\n",
"        plt.text(X[i, 0], X[i, 1], str(labels[i]),\n",
"                 color=plt.cm.Set1(labels[i] / 10.),\n",
"                 fontdict={'weight': 'bold', 'size': 9})\n",
"    plt.xticks([]), plt.yticks([])\n",
"    if title is not None:\n",
"        plt.title(title)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING:tensorflow:From <ipython-input-3-2aed5ca29197>:2: read_data_sets (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use alternatives such as official/mnist/dataset.py from tensorflow/models.\n",
"WARNING:tensorflow:From /home/matthieu/miniconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:260: maybe_download (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please write your own downloading logic.\n",
"WARNING:tensorflow:From /home/matthieu/miniconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:262: extract_images (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use tf.data to implement this functionality.\n",
"Extracting MNIST_data/train-images-idx3-ubyte.gz\n",
"WARNING:tensorflow:From /home/matthieu/miniconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:267: extract_labels (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use tf.data to implement this functionality.\n",
"Extracting MNIST_data/train-labels-idx1-ubyte.gz\n",
"Extracting MNIST_data/t10k-images-idx3-ubyte.gz\n",
"Extracting MNIST_data/t10k-labels-idx1-ubyte.gz\n",
"WARNING:tensorflow:From /home/matthieu/miniconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:290: DataSet.__init__ (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use alternatives such as official/mnist/dataset.py from tensorflow/models.\n"
]
}
],
"source": [
"from tensorflow.examples.tutorials.mnist import input_data\n",
"mnist = input_data.read_data_sets(\"MNIST_data/\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The future-proof way of getting MNIST for TensorFlow can be found at https://github.com/tensorflow/models/blob/master/official/mnist/dataset.py\n",
"\n",
"We now reshape the images as needed:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"mnist.train.images.shape = -1, 28, 28, 1\n",
"mnist.test.images.shape = -1, 28, 28, 1\n",
"\n",
"num_train = mnist.train.images.shape[0]\n",
"num_test = mnist.test.images.shape[0]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"def batchnormalize(X, eps=1e-8, g=None, b=None):\n",
"    if X.get_shape().ndims == 4:\n",
"        mean = tf.reduce_mean(X, [0,1,2])\n",
"        std = tf.reduce_mean( tf.square(X-mean), [0,1,2] )\n",
"        X = (X-mean) / tf.sqrt(std+eps)\n",
"\n",
"        if g is not None and b is not None:\n",
"            g = tf.reshape(g, [1,1,1,-1])\n",
"            b = tf.reshape(b, [1,1,1,-1])\n",
"            X = X*g + b\n",
"\n",
"    elif X.get_shape().ndims == 2:\n",
"        mean = tf.reduce_mean(X, 0)\n",
"        std = tf.reduce_mean(tf.square(X-mean), 0)\n",
"        X = (X-mean) / tf.sqrt(std+eps)\n",
"\n",
"        if g is not None and b is not None:\n",
"            g = tf.reshape(g, [1,-1])\n",
"            b = tf.reshape(b, [1,-1])\n",
"            X = X*g + b\n",
"\n",
"    else:\n",
"        raise NotImplementedError\n",
"\n",
"    return X\n",
"\n",
"class AutoEncoder():\n",
"    def __init__(\n",
"            self,\n",
"            image_shape=[28,28,1],\n",
"            dim_W1=1024,\n",
"            dim_W2=128,\n",
"            dim_W3=64,\n",
"            dim_embedded=2,\n",
"            ):\n",
"\n",
"        self.image_shape = image_shape\n",
"\n",
"        self.dim_W1 = dim_W1\n",
"        self.dim_W2 = dim_W2\n",
"        self.dim_W3 = dim_W3\n",
"        self.dim_embedded = dim_embedded\n",
"\n",
"    def build_model(self):\n",
"\n",
"        image = tf.placeholder(tf.float32, [None]+self.image_shape)\n",
"        embedded = self.encode(image)\n",
"        decoded = self.decode(embedded)\n",
"\n",
"        # We clip the output, as 0 and 1 cannot be achieved with a sigmoid output\n",
"        logits = tf.clip_by_value(decoded, 1e-7, 1. - 1e-7)\n",
"\n",
"        cost_autoencoder = tf.reduce_mean(tf.square(logits - image))\n",
"\n",
"        summaries = tf.summary.merge([\n",
"            tf.summary.scalar(\"loss/train\", cost_autoencoder),\n",
"        ])\n",
"        summaries_test = tf.summary.merge([\n",
"            tf.summary.scalar(\"loss/test\", cost_autoencoder),\n",
"        ])\n",
"\n",
"        return image, embedded, decoded, cost_autoencoder, summaries, summaries_test\n",
"\n",
"    def create_conv2d(self, input, filters, kernel_size, name):\n",
"        layer = tf.layers.conv2d(\n",
"            inputs=input,\n",
"            filters=filters,\n",
"            kernel_size=kernel_size,\n",
"            strides=[2,2],\n",
"            name=\"Conv2d_\" + name,\n",
"            padding=\"SAME\")\n",
"        layer = tf.nn.leaky_relu(layer, name= \"LeakyRELU\" + name)\n",
"        return layer\n",
"\n",
"    def create_conv2d_transpose(self, input, filters, kernel_size, name, with_batch_norm):\n",
"        layer = tf.layers.conv2d_transpose(\n",
"            inputs=input,\n",
"            filters=filters,\n",
"            kernel_size=kernel_size,\n",
"            strides=[2,2],\n",
"            name=\"Conv2d_\" + name,\n",
"            padding=\"SAME\")\n",
"        if with_batch_norm:\n",
"            layer = batchnormalize(layer)\n",
"            layer = tf.nn.relu(layer)\n",
"        return layer\n",
"\n",
"    def create_dense(self, input, units, name, leaky):\n",
"        layer = tf.layers.dense(\n",
"            inputs=input,\n",
"            units=units,\n",
"            name=\"Dense\" + name,\n",
"        )\n",
"        layer = batchnormalize(layer)\n",
"        if leaky:\n",
"            layer = tf.nn.leaky_relu(layer, name= \"LeakyRELU\" + name)\n",
"        else:\n",
"            layer = tf.nn.relu(layer, name=\"RELU_\" + name)\n",
"        return layer\n",
"\n",
"    def encode(self, image):\n",
"        with tf.variable_scope('encoder'):\n",
"            h1 = self.create_conv2d(image, self.dim_W3, 5, \"Layer1\")\n",
"\n",
"            h2 = self.create_conv2d(h1, self.dim_W2, 5, \"Layer2\")\n",
"            h2 = tf.reshape(h2, tf.stack([-1, 7*7*self.dim_W2]))\n",
"\n",
"            h3 = self.create_dense(h2, self.dim_W1, \"Layer3\", True)\n",
"\n",
"            h4 = self.create_dense(h3, self.dim_embedded, \"Layer4\", True)\n",
"            return h4\n",
"\n",
"    def decode(self, embedded):\n",
"        with tf.variable_scope('decoder'):\n",
"\n",
"            h1 = self.create_dense(embedded, self.dim_W1, \"Layer1\", False)\n",
"\n",
"            h2 = self.create_dense(h1, self.dim_W2*7*7, \"Layer2\", False)\n",
"            h2 = tf.reshape(h2, tf.stack([-1,7,7,self.dim_W2]))\n",
"\n",
"            h3 = self.create_conv2d_transpose(h2, self.dim_W3, 5, \"Layer3\", True)\n",
"\n",
"            h4 = self.create_conv2d_transpose(h3, 1, 7, \"Layer4\", False)\n",
"            x = tf.nn.sigmoid(h4)\n",
"            return x"
]
},
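{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `7*7*self.dim_W2` reshape in `encode` and `decode` follows from the two stride-2 layers with `SAME` padding: 28 becomes 14, then 7, and the decoder reverses this. The sketch below checks that arithmetic in plain Python. The helper names `conv_same_out` and `conv_transpose_same_out` are hypothetical and not part of TensorFlow; they only encode the usual `SAME`-padding size rules.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"def conv_same_out(size, stride):\n",
"    # Output size of a strided convolution with SAME padding\n",
"    return math.ceil(size / stride)\n",
"\n",
"def conv_transpose_same_out(size, stride):\n",
"    # Output size of a strided transposed convolution with SAME padding\n",
"    return size * stride\n",
"\n",
"# Encoder: two stride-2 convolutions applied to a 28x28 MNIST image\n",
"h1 = conv_same_out(28, 2)\n",
"h2 = conv_same_out(h1, 2)\n",
"flat_units = h2 * h2 * 64  # 64 is the dim_W2 value used in this notebook\n",
"\n",
"# Decoder: two stride-2 transposed convolutions starting from 7x7\n",
"d3 = conv_transpose_same_out(7, 2)\n",
"d4 = conv_transpose_same_out(d3, 2)\n",
"\n",
"print(h1, h2, flat_units, d3, d4)  # prints: 14 7 3136 14 28\n",
"```\n",
"\n",
"Changing the input size or the strides would require updating the hard-coded `7*7` factor accordingly."
]
},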
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's get our model now."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"tf.reset_default_graph()\n",
"autoencoder_model = AutoEncoder(\n",
" image_shape=image_shape,\n",
" dim_W1=dim_W1,\n",
" dim_W2=dim_W2,\n",
" dim_W3=dim_W3,\n",
" dim_embedded=dim_embedded\n",
" )\n",
"image, embedded, decoded, cost_autoencoder, summaries, summaries_test = autoencoder_model.build_model()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"train_op_autoencoder = tf.train.AdamOptimizer(learning_rate, beta1=0.5).minimize(cost_autoencoder)\n",
"summary_writer = tf.summary.FileWriter(\"/tmp/tensorboard/part1\", tf.get_default_graph())"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch: 0\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 1\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 2\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 3\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 4\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 5\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 6\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 7\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 8\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 9\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n"
]
}
],
"source": [
"with tf.Session() as sess:\n",
"    sess.run(tf.global_variables_initializer())\n",
"    for epoch in range(n_epochs):\n",
"        permut = np.random.permutation(num_train)\n",
"        trX = mnist.train.images[permut]\n",
"\n",
"        print(\"epoch: %i\" % epoch)\n",
"        for j in range(0, num_train, batch_size):\n",
"            if j % step == 0:\n",
"                print(\" batch: %i\" % j)\n",
"\n",
"            batch = permut[j:j+batch_size]\n",
"\n",
"            Xs = trX[batch]\n",
"\n",
"            _, local_summaries = sess.run([train_op_autoencoder, summaries],\n",
"                                          feed_dict={\n",
"                                              image:Xs,\n",
"                                          })\n",
"            summary_writer.add_summary(local_summaries, epoch * num_train + j)\n",
"        local_test_summaries = sess.run(summaries_test,\n",
"                                        feed_dict={\n",
"                                            image:mnist.test.images,\n",
"                                        })\n",
"        summary_writer.add_summary(local_test_summaries, epoch * num_train)\n",
"\n",
"    embedded_space_train = sess.run(embedded,\n",
"                                    feed_dict={\n",
"                                        image:mnist.train.images,\n",
"                                    })\n",
"    embedded_space_test = sess.run(embedded,\n",
"                                   feed_dict={\n",
"                                       image:mnist.test.images,\n",
"                                   })"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_embedding(embedded_space_train, mnist.train.labels)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_embedding(embedded_space_test, mnist.test.labels)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Adding a classification accuracy measure on the fly\n",
"\n",
"Let's now modify the autoencoder to also add a classification layer."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"class AutoEncoderClassifier():\n",
"    def __init__(\n",
"            self,\n",
"            image_shape=[28,28,1],\n",
"            dim_W1=128,\n",
"            dim_W2=64,\n",
"            dim_W3=32,\n",
"            dim_embedded=2,\n",
"            dim_C1=32,\n",
"            ):\n",
"\n",
"        self.image_shape = image_shape\n",
"\n",
"        self.dim_W1 = dim_W1\n",
"        self.dim_W2 = dim_W2\n",
"        self.dim_W3 = dim_W3\n",
"        self.dim_embedded = dim_embedded\n",
"        self.dim_C1 = dim_C1\n",
"\n",
"    def build_model(self):\n",
"\n",
"        image = tf.placeholder(tf.float32, [None]+self.image_shape)\n",
"        label = tf.placeholder(tf.int64, [None])\n",
"        embedded = self.encode(image)\n",
"        decoded = self.decode(embedded)\n",
"        classified = self.classify(embedded)\n",
"\n",
"        # We clip the output, as 0 and 1 cannot be achieved with a sigmoid output\n",
"        logits = tf.clip_by_value(decoded, 1e-7, 1. - 1e-7)\n",
"\n",
"        cost_autoencoder = tf.reduce_mean(tf.square(logits - image))\n",
"\n",
"        cost_classify = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=classified))\n",
"        accuracy_classify = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(classified, axis=1), label), tf.float32), name=\"accuracy\")\n",
"\n",
"        summaries_train = tf.summary.merge([\n",
"            tf.summary.scalar(\"loss/train\", cost_autoencoder),\n",
"        ])\n",
"        summaries_classifier = tf.summary.merge([\n",
"            tf.summary.scalar(\"accuracy/train\", accuracy_classify),\n",
"        ])\n",
"        summaries_all_test = tf.summary.merge([\n",
"            tf.summary.scalar(\"loss/test\", cost_autoencoder),\n",
"            tf.summary.scalar(\"accuracy/test\", accuracy_classify),\n",
"        ])\n",
"\n",
"        return image, label, embedded, decoded, classified, cost_autoencoder, cost_classify, accuracy_classify, summaries_train, summaries_classifier, summaries_all_test\n",
"\n",
"    def create_conv2d(self, input, filters, kernel_size, name):\n",
"        layer = tf.layers.conv2d(\n",
"            inputs=input,\n",
"            filters=filters,\n",
"            kernel_size=kernel_size,\n",
"            strides=[2,2],\n",
"            name=\"Conv2d_\" + name,\n",
"            padding=\"SAME\")\n",
"        layer = tf.nn.leaky_relu(layer, name= \"LeakyRELU\" + name)\n",
"        return layer\n",
"\n",
"    def create_conv2d_transpose(self, input, filters, kernel_size, name, with_batch_norm):\n",
"        layer = tf.layers.conv2d_transpose(\n",
"            inputs=input,\n",
"            filters=filters,\n",
"            kernel_size=kernel_size,\n",
"            strides=[2,2],\n",
"            name=\"Conv2d_\" + name,\n",
"            padding=\"SAME\")\n",
"        if with_batch_norm:\n",
"            layer = batchnormalize(layer)\n",
"            layer = tf.nn.relu(layer)\n",
"        return layer\n",
"\n",
"    def create_dense(self, input, units, name, leaky):\n",
"        layer = tf.layers.dense(\n",
"            inputs=input,\n",
"            units=units,\n",
"            name=\"Dense\" + name,\n",
"        )\n",
"        layer = batchnormalize(layer)\n",
"        if leaky:\n",
"            layer = tf.nn.leaky_relu(layer, name= \"LeakyRELU\" + name)\n",
"        else:\n",
"            layer = tf.nn.relu(layer, name=\"RELU_\" + name)\n",
"        return layer\n",
"\n",
"    def encode(self, image):\n",
"        with tf.variable_scope('encoder'):\n",
"            h1 = self.create_conv2d(image, self.dim_W3, 5, \"Layer1\")\n",
"\n",
"            h2 = self.create_conv2d(h1, self.dim_W2, 5, \"Layer2\")\n",
"            h2 = tf.reshape(h2, tf.stack([-1, 7*7*self.dim_W2]))\n",
"\n",
"            h3 = self.create_dense(h2, self.dim_W1, \"Layer3\", True)\n",
"\n",
"            h4 = self.create_dense(h3, self.dim_embedded, \"Layer4\", True)\n",
"            return h4\n",
"\n",
"    def decode(self, embedded):\n",
"        with tf.variable_scope('decoder'):\n",
"\n",
"            h1 = self.create_dense(embedded, self.dim_W1, \"Layer1\", False)\n",
"\n",
"            h2 = self.create_dense(h1, self.dim_W2*7*7, \"Layer2\", False)\n",
"            h2 = tf.reshape(h2, tf.stack([-1,7,7,self.dim_W2]))\n",
"\n",
"            h3 = self.create_conv2d_transpose(h2, self.dim_W3, 5, \"Layer3\", True)\n",
"\n",
"            h4 = self.create_conv2d_transpose(h3, 1, 7, \"Layer4\", False)\n",
"            x = tf.nn.sigmoid(h4)\n",
"            return x\n",
"\n",
"\n",
"    def classify(self, embedded):\n",
"        with tf.variable_scope('classifier'):\n",
"\n",
"            h1 = self.create_dense(embedded, self.dim_C1, \"Layer1\", False)\n",
"\n",
"            h2 = self.create_dense(h1, 10, \"Layer2\", False)\n",
"            return h2"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"tf.reset_default_graph()\n",
"autoencoder_classifier_model = AutoEncoderClassifier(\n",
" image_shape=image_shape,\n",
" dim_W1=dim_W1,\n",
" dim_W2=dim_W2,\n",
" dim_W3=dim_W3,\n",
" dim_embedded=dim_embedded,\n",
" dim_C1=dim_C1,\n",
" )\n",
"\n",
"image, label, embedded, decoded, classified, cost_autoencoder, cost_classify, accuracy_classify, summaries_train, summaries_classifier, summaries_all_test = autoencoder_classifier_model.build_model()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"encode_vars = list(filter(lambda x: x.name.startswith('encode'), tf.trainable_variables()))\n",
"decode_vars = list(filter(lambda x: x.name.startswith('decode'), tf.trainable_variables()))\n",
"classify_vars = list(filter(lambda x: x.name.startswith('classifier'), tf.trainable_variables()))\n",
"\n",
"train_op_autoencoder = tf.train.AdamOptimizer(learning_rate, beta1=0.5).minimize(cost_autoencoder, var_list=encode_vars+decode_vars)\n",
"train_op_classify = tf.train.AdamOptimizer(learning_rate, beta1=0.5).minimize(cost_classify, var_list=classify_vars)\n",
"\n",
"summary_writer = tf.summary.FileWriter(\"/tmp/tensorboard/part2\", tf.get_default_graph())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As in the GAN case, we now alternate between optimizing the autoencoder and optimizing the classifier."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch: 0\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 1\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 2\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 3\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 4\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 5\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 6\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 7\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 8\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 9\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 10\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 11\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 12\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 13\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 14\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 15\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 16\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 17\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 18\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 19\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"Cost for the training set: 0.038948\n",
"Cost for the testing set: 0.039474\n"
]
}
],
"source": [
"with tf.Session() as sess:\n",
"    sess.run(tf.global_variables_initializer())\n",
"    for epoch in range(2*n_epochs):\n",
"        permut = np.random.permutation(num_train)\n",
"        trX = mnist.train.images[permut]\n",
"        trY = mnist.train.labels[permut]\n",
"\n",
"        print(\"epoch: %i\" % epoch)\n",
"        for j in range(0, num_train, batch_size):\n",
"            if j % step == 0:\n",
"                print(\" batch: %i\" % j)\n",
"\n",
"            batch = permut[j:j+batch_size]\n",
"\n",
"            Xs = trX[batch]\n",
"            Ys = trY[batch]\n",
"\n",
"            if j % (2 * batch_size) == 0:\n",
"                _, local_summaries = sess.run([train_op_autoencoder, summaries_train],\n",
"                                              feed_dict={\n",
"                                                  image:Xs,\n",
"                                              })\n",
"                summary_writer.add_summary(local_summaries, epoch * num_train + j)\n",
"            else:\n",
"                _, local_summary_1, local_summary_2 = sess.run([train_op_classify, summaries_train, summaries_classifier],\n",
"                                                               feed_dict={\n",
"                                                                   image:Xs,\n",
"                                                                   label:Ys,\n",
"                                                               })\n",
"                summary_writer.add_summary(local_summary_1, epoch * num_train + j)\n",
"                summary_writer.add_summary(local_summary_2, epoch * num_train + j)\n",
"\n",
"        local_test_summaries = sess.run(summaries_all_test,\n",
"                                        feed_dict={\n",
"                                            image:mnist.test.images,\n",
"                                            label:mnist.test.labels,\n",
"                                        })\n",
"        summary_writer.add_summary(local_test_summaries, epoch * num_train)\n",
"\n",
"\n",
"    embedded_space_train, cost_train = sess.run([embedded, cost_autoencoder],\n",
"                                                feed_dict={\n",
"                                                    image:mnist.train.images,\n",
"                                                })\n",
"    embedded_space_test, classify_test, cost_test = sess.run([embedded, classified, cost_autoencoder],\n",
"                                                             feed_dict={\n",
"                                                                 image:mnist.test.images,\n",
"                                                             })\n",
"\n",
"print(\"Cost for the training set: %f\" % cost_train)\n",
"print(\"Cost for the testing set: %f\" % cost_test)\n"
]
},
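{
"cell_type": "markdown",
"metadata": {},
"source": [
"The alternation in the loop above depends only on the batch offset `j`: even-numbered batches update the encoder and decoder, odd-numbered batches update the classifier variables only. A minimal sketch of that schedule in plain Python, assuming `num_train` is 55000 (the usual size of the training split returned by `read_data_sets`; this value is an assumption, not printed in this notebook):\n",
"\n",
"```python\n",
"num_train = 55000   # assumption: size of the MNIST training split\n",
"batch_size = 128    # value used in this notebook\n",
"\n",
"autoencoder_steps = 0\n",
"classifier_steps = 0\n",
"for j in range(0, num_train, batch_size):\n",
"    if j % (2 * batch_size) == 0:\n",
"        autoencoder_steps += 1   # even-numbered batch: encoder + decoder update\n",
"    else:\n",
"        classifier_steps += 1    # odd-numbered batch: classifier-only update\n",
"\n",
"print(autoencoder_steps, classifier_steps)  # prints: 215 215\n",
"```\n",
"\n",
"Under that assumption each network gets half of the updates in an epoch, which may be why the loop runs for `2*n_epochs`."
]
},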
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we display the results, we can see that classification works quite well in 2D, although the autoencoder loss is quite poor. This is expected, considering that 2 dimensions are not enough to represent the complexity of digits!"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 360x288 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class_test = np.argmax(classify_test, axis=1)\n",
"cm = confusion_matrix(mnist.test.labels, class_test)\n",
"plot_confusion_matrix(cm / np.sum(cm, axis=0), list(range(10)), \"Autoencoder performance for classification in 2D\")"
]
},
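{
"cell_type": "markdown",
"metadata": {},
"source": [
"The confusion matrix above is divided by `np.sum(cm, axis=0)`, so each column (predicted class) sums to 1 and the diagonal reads as a per-class precision. Dividing by the row sums instead would give a per-class recall. A minimal sketch on a made-up 2x2 matrix (the counts are illustrative, not results from this notebook):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Rows are true classes, columns are predicted classes (made-up counts)\n",
"cm = np.array([[8, 2],\n",
"               [1, 9]])\n",
"\n",
"# Column normalization, as in the cell above: columns sum to 1 (precision)\n",
"by_predicted = cm / np.sum(cm, axis=0)\n",
"\n",
"# Row normalization: rows sum to 1 (recall)\n",
"by_true = cm / np.sum(cm, axis=1, keepdims=True)\n",
"\n",
"print(by_predicted.sum(axis=0))\n",
"print(by_true.sum(axis=1))\n",
"```\n",
"\n",
"Which normalization to plot depends on whether the question is how trustworthy a prediction is, or how many digits of a given class are recovered."
]
},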
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Let's see what happens in 3D"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"tf.reset_default_graph()\n",
"autoencoder_classifier_model = AutoEncoderClassifier(\n",
" image_shape=image_shape,\n",
" dim_W1=dim_W1,\n",
" dim_W2=dim_W2,\n",
" dim_W3=dim_W3,\n",
" dim_embedded=3,\n",
" dim_C1=dim_C1,\n",
" )\n",
"\n",
"image, label, embedded, decoded, classified, cost_autoencoder, cost_classify, accuracy_classify, summaries_train, summaries_classifier, summaries_all_test = autoencoder_classifier_model.build_model()"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"encode_vars = list(filter(lambda x: x.name.startswith('encode'), tf.trainable_variables()))\n",
"decode_vars = list(filter(lambda x: x.name.startswith('decode'), tf.trainable_variables()))\n",
"classify_vars = list(filter(lambda x: x.name.startswith('classifier'), tf.trainable_variables()))\n",
"\n",
"train_op_autoencoder = tf.train.AdamOptimizer(learning_rate, beta1=0.5).minimize(cost_autoencoder, var_list=encode_vars+decode_vars)\n",
"train_op_classify = tf.train.AdamOptimizer(learning_rate, beta1=0.5).minimize(cost_classify, var_list=classify_vars)\n",
"\n",
"summary_writer = tf.summary.FileWriter(\"/tmp/tensorboard/part3\", tf.get_default_graph())"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch: 0\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 1\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 2\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 3\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 4\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 5\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 6\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 7\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 8\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 9\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 10\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 11\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 12\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 13\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 14\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 15\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 16\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 17\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 18\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"epoch: 19\n",
" batch: 0\n",
" batch: 12800\n",
" batch: 25600\n",
" batch: 38400\n",
" batch: 51200\n",
"Cost for the training set: 0.032497\n",
"Cost for the testing set: 0.033085\n"
]
}
],
"source": [
"with tf.Session() as sess:\n",
"    sess.run(tf.global_variables_initializer())\n",
"    for epoch in range(2*n_epochs):\n",
"        permut = np.random.permutation(num_train)\n",
"        trX = mnist.train.images[permut]\n",
"        trY = mnist.train.labels[permut]\n",
"\n",
"        print(\"epoch: %i\" % epoch)\n",
"        for j in range(0, num_train, batch_size):\n",
"            if j % step == 0:\n",
"                print(\" batch: %i\" % j)\n",
"\n",
"            batch = permut[j:j+batch_size]\n",
"\n",
"            Xs = trX[batch]\n",
"            Ys = trY[batch]\n",
"\n",
"            if j % (2 * batch_size) == 0:\n",
"                _, local_summaries = sess.run([train_op_autoencoder, summaries_train],\n",
"                                              feed_dict={\n",
"                                                  image:Xs,\n",
"                                              })\n",
"                summary_writer.add_summary(local_summaries, epoch * num_train + j)\n",
"            else:\n",
"                _, local_summary_1, local_summary_2 = sess.run([train_op_classify, summaries_train, summaries_classifier],\n",
"                                                               feed_dict={\n",
"                                                                   image:Xs,\n",
"                                                                   label:Ys,\n",
"                                                               })\n",
"                summary_writer.add_summary(local_summary_1, epoch * num_train + j)\n",
"                summary_writer.add_summary(local_summary_2, epoch * num_train + j)\n",
"\n",
"        local_test_summaries = sess.run(summaries_all_test,\n",
"                                        feed_dict={\n",
"                                            image:mnist.test.images,\n",
"                                            label:mnist.test.labels,\n",
"                                        })\n",
"        summary_writer.add_summary(local_test_summaries, epoch * num_train)\n",
"\n",
"    embedded_space_train, cost_train = sess.run([embedded, cost_autoencoder],\n",
"                                                feed_dict={\n",
"                                                    image:mnist.train.images,\n",
"                                                })\n",
"    embedded_space_test, classify_test, cost_test = sess.run([embedded, classified, cost_autoencoder],\n",
"                                                             feed_dict={\n",
"                                                                 image:mnist.test.images,\n",
"                                                             })\n",
"\n",
"print(\"Cost for the training set: %f\" % cost_train)\n",
"print(\"Cost for the testing set: %f\" % cost_test)\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 360x288 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class_test = np.argmax(classify_test, axis=1)\n",
"cm = confusion_matrix(mnist.test.labels, class_test)\n",
"plot_confusion_matrix(cm / np.sum(cm, axis=0), list(range(10)), \"Autoencoder performance for classification in 3D\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}