{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Building an Architectural Classifier - Notebook 2 - DNNs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The goal in this notebook is to take our pre-processed dataset of interior architectural imagery (containing images of kitchens, bathrooms, bedrooms, living rooms, etc...) and build a machine learning model that can accurately classify when it is looking at an image of a kitchen. This is the second notebook in a series, so I'll omit some of the explanatory notes on the boilerplate from before.\n",
"#### Model:\n",
"Having tried logistic regression and not being satisfied with the results, now we'll look at using deep neural nets"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import math\n",
"import time\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import tensorflow as tf\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Load in the data"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x: (4686, 85, 85, 3) | y: (4686, 2)\n"
]
}
],
"source": [
"x = np.load('./all_X_shuffled_85_x4686.npy')\n",
"y = np.load('./all_Y_shuffled_2_x4686.npy')\n",
"\n",
"print('x: %s | y: %s' % (x.shape, y.shape))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Split the data up into train / test / validation (80% / 10% / 10%)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def split(x, y, test=0.1, train=0.8, validation=0.1):\n",
" assert(len(x) == len(y))\n",
" test_size = int(len(x) * test)\n",
" train_size = int(len(x) * train)\n",
" valid_size = int(len(x) * validation)\n",
" \n",
" x_train = np.array(x[:train_size])\n",
" y_train = np.array(y[:train_size])\n",
" x_val = np.array(x[train_size:train_size + valid_size])\n",
" y_val = np.array(y[train_size:train_size + valid_size])\n",
" x_test = np.array(x[train_size + valid_size:])\n",
" y_test = np.array(y[train_size + valid_size:])\n",
" \n",
" return (x_train, y_train, x_val, y_val, x_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x_train: (3748, 85, 85, 3)\n",
"y_train: (3748, 2)\n",
"x_val: (468, 85, 85, 3)\n",
"y_val: (468, 2)\n",
"x_test: (470, 85, 85, 3)\n",
"y_test: (470, 2)\n"
]
}
],
"source": [
"x_train, y_train, x_val, y_val, x_test, y_test = split(x,y)\n",
"\n",
"print('x_train: ', x_train.shape)\n",
"print('y_train: ', y_train.shape)\n",
"print('x_val: ', x_val.shape)\n",
"print('y_val: ', y_val.shape)\n",
"print('x_test: ', x_test.shape)\n",
"print('y_test: ', y_test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Balancing\n",
"The numpy files we just imported were built in a seperate notebook, and they have already been shuffled randomly and balanced across classes, but lets check just to be sure, and to establish a baseline error:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Training balance: 0.482924226254\n",
"Validation balance: 0.480769230769\n",
"Testing balance: 0.436170212766\n"
]
}
],
"source": [
"print('Training balance: ', np.sum(y_train, axis=0)[1] / len(y_train))\n",
"print('Validation balance: ', np.sum(y_val, axis=0)[1] / len(y_val))\n",
"print('Testing balance: ', np.sum(y_test, axis=0)[1] / len(y_test))"
]
},
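{
"cell_type": "markdown",
"metadata": {},
"source": [
"The actual shuffling and balancing happened in the preprocessing notebook. For reference only, here is a minimal sketch of the idea (the `x_all` / `y_all` names are hypothetical stand-ins, not the real preprocessing code): downsample the majority class, then shuffle both arrays with one shared permutation so images and labels stay aligned."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Sketch only -- the real preprocessing lives in the earlier notebook.\n",
"# x_all / y_all are hypothetical stand-ins for the raw, unbalanced arrays.\n",
"pos_idx = np.where(y_all[:, 1] == 1)[0]  # indices of 'kitchen' examples\n",
"neg_idx = np.where(y_all[:, 1] == 0)[0]  # indices of everything else\n",
"n = min(len(pos_idx), len(neg_idx))      # downsample to the smaller class\n",
"keep = np.concatenate([pos_idx[:n], neg_idx[:n]])\n",
"perm = np.random.permutation(len(keep))  # one permutation keeps x and y aligned\n",
"x_bal, y_bal = x_all[keep][perm], y_all[keep][perm]"
]
},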
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Define a model\n",
"## Lets start with a fairly small model, 2 hidden layers each with 100 neurons"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First a placeholder variable for our inputs to TF"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"x_input = tf.placeholder(tf.float32, [None, 85, 85, 3], name='x_input')\n",
"y_input = tf.placeholder(tf.float32, [None, 2], name='y_input')\n",
"# We'll use keep prob to feed in the dropout hyper-parameter\n",
"keep_prob = tf.placeholder(tf.float32, name='keep_prob')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now variables to hold the weights and biases"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Hidden layer one\n",
"W1 = tf.get_variable(\"W1\", [85 * 85 * 3, 100], initializer= tf.contrib.layers.xavier_initializer())\n",
"b1 = tf.get_variable(\"b1\", [100], initializer= tf.zeros_initializer())\n",
"\n",
"# Hidden layer two\n",
"W2 = tf.get_variable('W2', [100, 20], initializer= tf.contrib.layers.xavier_initializer())\n",
"b2 = tf.get_variable('b2', [20], initializer= tf.zeros_initializer())\n",
"\n",
"# Output layer\n",
"W3 = tf.get_variable('W3', [20, 2], initializer= tf.contrib.layers.xavier_initializer())\n",
"b3 = tf.get_variable('b3', [2], initializer= tf.zeros_initializer())"
]
},
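{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check on model size, we can count the trainable parameters these shapes imply. Each fully connected layer contributes `in_dim * out_dim` weights plus `out_dim` biases, so the first layer dominates:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Trainable parameters per fully connected layer: in * out weights + out biases\n",
"layer_dims = [(85 * 85 * 3, 100), (100, 20), (20, 2)]\n",
"total = sum(i * o + o for i, o in layer_dims)\n",
"print('Total trainable parameters: %d' % total)  # 2,169,662 for these shapes"
]
},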
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The model is mathematically very similar to the logistic regression model, this time we apply the same matmul operation across multiple layers successively, feeding the output of one into the input of the next. Notice that our weights matrix is much larger, reflecting the full connection of all weights to all neurons. Finally note the tf.nn.relu() function, this applies a non-linearity to each neuron allowing it to model more flexible decision boundaries.\n",
"$$ logits = x \\boldsymbol{\\cdot} W + b $$"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# flatten the input\n",
"flat_input = tf.reshape(x_input, [-1, 85 * 85 * 3])\n",
"\n",
"# Hidden layer one\n",
"activations_one = tf.nn.relu(tf.add(tf.matmul(flat_input, W1), b1), name='activations_one')\n",
"dropout_one = tf.nn.dropout(activations_one, keep_prob, name='dropout_one')\n",
"\n",
"# Hidden layer two\n",
"activations_two = tf.nn.relu(tf.add(tf.matmul(dropout_one, W2), b2), name='activations_two')\n",
"dropout_two = tf.nn.dropout(activations_two, keep_prob, name='dropout_two')\n",
"\n",
"# Output layer\n",
"logits = tf.add(tf.matmul(dropout_two, W3), b3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll still be using softmax cross-entropy for our cost function. Since we're using dropout regularization we'll eliminate the L2 penalty, this isn't a rule, it may come back into play later but its easier to manage one hyper-parameter at a time."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = y_input))"
]
},
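{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, if the L2 penalty does come back into play later, it could be folded into the cost roughly like this. This is a sketch only; `beta` is a hypothetical regularization strength that would need tuning, not something used in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Sketch: re-introducing an L2 weight penalty alongside dropout.\n",
"# beta is a hypothetical hyper-parameter, not tuned in this notebook.\n",
"beta = 0.01\n",
"l2_penalty = tf.nn.l2_loss(W1) + tf.nn.l2_loss(W2) + tf.nn.l2_loss(W3)\n",
"regularized_cost = cross_entropy + beta * l2_penalty"
]
},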
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# The argument to Adam here is the learning rate and it can (and should) be experimented with\n",
"training_step = tf.train.AdamOptimizer(1e-5).minimize(cross_entropy)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Last, I'll set up an accuracy metric to validate on and add a summary so we can watch it train on tensorboard"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<tf.Tensor 'accuracy:0' shape=() dtype=string>"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictions = tf.argmax(logits, axis=1)\n",
"truths = tf.argmax(y_input, axis=1)\n",
"correct_predictions = tf.equal(predictions, truths)\n",
"accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))\n",
"tf.summary.scalar('accuracy', accuracy)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Finally lets start a session and begin training"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"sess = tf.Session()\n",
"init = tf.global_variables_initializer()\n",
"sess.run(init)\n",
"merged = tf.summary.merge_all()\n",
"train_writer = tf.summary.FileWriter('logs/deep_softmax/train', sess.graph)\n",
"valid_writer = tf.summary.FileWriter('logs/deep_softmax/valid', sess.graph)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 10 | Time per epoch: 0.726 | Train Accuracy: 0.525347 | Validation Accuracy: 0.508547\n",
"Epoch 20 | Time per epoch: 0.689 | Train Accuracy: 0.532818 | Validation Accuracy: 0.504274\n",
"Epoch 30 | Time per epoch: 0.674 | Train Accuracy: 0.537353 | Validation Accuracy: 0.482906\n",
"Epoch 40 | Time per epoch: 0.667 | Train Accuracy: 0.534685 | Validation Accuracy: 0.529915\n",
"Epoch 50 | Time per epoch: 0.664 | Train Accuracy: 0.541889 | Validation Accuracy: 0.547009\n",
"\n",
"...Edited for length in this gist, see tensorboard summary below for full training details...",
"\n",
"Epoch 4950 | Time per epoch: 0.647 | Train Accuracy: 1.0 | Validation Accuracy: 0.619658\n",
"Epoch 4960 | Time per epoch: 0.647 | Train Accuracy: 1.0 | Validation Accuracy: 0.621795\n",
"Epoch 4970 | Time per epoch: 0.647 | Train Accuracy: 1.0 | Validation Accuracy: 0.619658\n",
"Epoch 4980 | Time per epoch: 0.647 | Train Accuracy: 1.0 | Validation Accuracy: 0.619658\n",
"Epoch 4990 | Time per epoch: 0.647 | Train Accuracy: 1.0 | Validation Accuracy: 0.621795\n",
"Time Taken: 3235.950849533081\n"
]
}
],
"source": [
"num_epochs = 5000\n",
"\n",
"start_time = time.time()\n",
"for epoch in range(num_epochs):\n",
" if epoch % 10 == 0:\n",
" # every 10th epoch write out accuracy on training and validation\n",
" summary_train, train_acc = sess.run([merged, accuracy], {x_input: x_train, y_input: y_train, keep_prob: 1.0})\n",
" summary_valid, valid_acc = sess.run([merged, accuracy], {x_input: x_val, y_input: y_val, keep_prob: 1.0})\n",
" train_writer.add_summary(summary_train, epoch)\n",
" valid_writer.add_summary(summary_valid, epoch)\n",
" if epoch != 0: \n",
" time_taken = round((time.time() - start_time) / epoch, 3)\n",
" print('Epoch %s | Time per epoch: %s | Train Accuracy: %s | Validation Accuracy: %s' % (epoch, time_taken, train_acc, valid_acc))\n",
" sess.run([training_step], {x_input: x_train, y_input: y_train, keep_prob: 0.5})\n",
"\n",
"train_writer.close()\n",
"print('Time Taken: ', time.time() - start_time)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Looking better! :)\n",
"[You can view the tensorboard summary here.](https://raw.githubusercontent.com/McCulloughRT/machine-learning/master/dnn_training.png) Purple is the validation accuracy and blue training.\n",
"\n",
"We reached 100% training accuracy, so the model is easily complex enough. However while our validation error did improve significantly, it didn't keep pace with the training error, showing a big overfitting issue even with dropout regularization.\n",
"#### Lets check the models accuracy against the training set"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.6510638"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sess.run(accuracy, {x_input: x_test, y_input: y_test, keep_prob:1.0})"
]
},
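{
"cell_type": "markdown",
"metadata": {},
"source": [
"A single accuracy number hides which class the model struggles with. As a quick follow-up sketch (not part of the original analysis), we can pull the raw test-set predictions out of the session and tally a small confusion matrix with numpy:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Sketch: per-class breakdown of test-set predictions.\n",
"# 'predictions' and 'truths' are the tensors defined for the accuracy metric above.\n",
"preds, labels = sess.run([predictions, truths],\n",
"                         {x_input: x_test, y_input: y_test, keep_prob: 1.0})\n",
"confusion = np.zeros((2, 2), dtype=int)\n",
"for t, p in zip(labels, preds):\n",
"    confusion[t][p] += 1\n",
"print(confusion)  # rows: true class, columns: predicted class"
]
},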
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"sess.close()"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [conda root]",
"language": "python",
"name": "conda-root-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}