Created
December 5, 2023 07:42
Keras Autoencoder Jupyter Notebook Tutorials
{
"cells": [
{
"cell_type": "markdown",
"id": "1b2cf03f-8eaf-4adf-80ab-0d5d5c661deb",
"metadata": {},
"source": [
"# Building Autoencoders in Keras\n",
"The following `Jupyter Notebook` has been *adapted* from the [Keras blog article](https://blog.keras.io/building-autoencoders-in-keras.html) written by *F. Chollet* on [autoencoders](https://en.wikipedia.org/wiki/Autoencoder)."
]
},
{
"cell_type": "markdown",
"id": "06d935a8-5cb4-466b-ad97-aedf12265338",
"metadata": {},
"source": [
"## Convolutional Autoencoder\n",
"Since our inputs are images, it makes sense to use convolutional neural networks (convnets) as encoders and decoders. In practical settings, autoencoders applied to images are almost always convolutional autoencoders, because they simply perform much better.\n",
"\n",
"Let's implement one. The encoder will consist of a stack of `Conv2D` and `MaxPooling2D` layers (max pooling performing the spatial down-sampling), while the decoder will consist of a stack of `Conv2D` and `UpSampling2D` layers."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5f302f4-45bc-482c-8016-63a0e698d6f6",
"metadata": {},
"outputs": [],
"source": [
"# get initial libs\n",
"import keras\n",
"from keras import layers\n",
"from keras.datasets import mnist\n",
"import numpy as np\n",
"\n",
"# local helper module shipped alongside this notebook, used below to plot\n",
"# original vs. reconstructed digits side by side\n",
"import visualization.image"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e743e762-1c41-4b32-b6db-a63db7c25c7c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"input_img = keras.Input(shape=(28, 28, 1))\n",
"\n",
"# encoder: three conv/max-pool stages, (28, 28, 1) -> (4, 4, 8)\n",
"x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)\n",
"x = layers.MaxPooling2D((2, 2), padding='same')(x)\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)\n",
"x = layers.MaxPooling2D((2, 2), padding='same')(x)\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)\n",
"encoded = layers.MaxPooling2D((2, 2), padding='same')(x)\n",
"\n",
"# at this point the representation is (4, 4, 8) i.e. 128-dimensional\n",
"# decoder: mirror the encoder, upsampling back to (28, 28, 1)\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)\n",
"x = layers.UpSampling2D((2, 2))(x)\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)\n",
"x = layers.UpSampling2D((2, 2))(x)\n",
"# deliberately no padding here: (16, 16, 8) -> (14, 14, 16), so the final\n",
"# upsampling restores the 28x28 spatial size\n",
"x = layers.Conv2D(16, (3, 3), activation='relu')(x)\n",
"x = layers.UpSampling2D((2, 2))(x)\n",
"decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)\n",
"\n",
"autoencoder = keras.Model(input_img, decoded)\n",
"autoencoder.compile(optimizer='adam', loss='binary_crossentropy')"
]
},
{
"cell_type": "markdown",
"id": "58b56c97-6532-41a2-96bc-a8f589b20a84",
"metadata": {},
"source": [
"To train it, we will use the original MNIST digits reshaped to (samples, 28, 28, 1), and we will just normalize pixel values between 0 and 1."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96970532-a0b4-40f7-8225-a770bff0d42b",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"(x_train, _), (x_test, _) = mnist.load_data()\n",
"\n",
"x_train = x_train.astype('float32') / 255.\n",
"x_test = x_test.astype('float32') / 255.\n",
"x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))\n",
"x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))"
]
},
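{
"cell_type": "markdown",
"id": "f0e1d2c3-0001-4abc-9def-000000000001",
"metadata": {},
"source": [
"*(Added in this adaptation.)* A quick sanity check: MNIST ships 60,000 training and 10,000 test images of 28x28 pixels, so after reshaping both tensors should carry a trailing channel axis."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0e1d2c3-0002-4abc-9def-000000000002",
"metadata": {},
"outputs": [],
"source": [
"# verify the shapes the convolutional model expects\n",
"assert x_train.shape == (60000, 28, 28, 1)\n",
"assert x_test.shape == (10000, 28, 28, 1)"
]
},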
{
"cell_type": "code",
"execution_count": null,
"id": "2c10a00e-73ad-469e-869b-85f902d09533",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"autoencoder.fit(x_train, x_train,\n",
"                epochs=50,\n",
"                batch_size=128,\n",
"                shuffle=True,\n",
"                validation_data=(x_test, x_test))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b7e1872-ad71-4d3b-bbd9-b9b6efb09982",
"metadata": {},
"outputs": [],
"source": [
"visualization.image.compare_results(x_test, autoencoder.predict(x_test));"
]
},
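{
"cell_type": "markdown",
"id": "f0e1d2c3-0003-4abc-9def-000000000003",
"metadata": {},
"source": [
"*(Added in this adaptation.)* We can also report the final reconstruction loss on the held-out test set; the exact number varies with initialization, but it should land near the value quoted below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0e1d2c3-0004-4abc-9def-000000000004",
"metadata": {},
"outputs": [],
"source": [
"# mean binary cross-entropy over the test images\n",
"autoencoder.evaluate(x_test, x_test, batch_size=128)"
]
},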
{
"cell_type": "markdown",
"id": "a1955fa8-c2d9-42dc-98b8-84e82616dc9f",
"metadata": {},
"source": [
"The model converges to a loss of about 0.094, significantly better than the fully connected models from earlier in the original tutorial; this is in large part due to the higher entropic capacity of the encoded representation, 128 dimensions vs. 32 previously."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}