Keras Autoencoder Jupyter Notebook Tutorials
@DiogenesAnalytics · Created December 5, 2023
{
"cells": [
{
"cell_type": "markdown",
"id": "1b2cf03f-8eaf-4adf-80ab-0d5d5c661deb",
"metadata": {},
"source": [
"# Building Autoencoders in Keras\n",
"The following `Jupyter Notebook` has been *adapted* from the [Keras blog article](https://blog.keras.io/building-autoencoders-in-keras.html) written by *F. Chollet* on [autoencoders](https://en.wikipedia.org/wiki/Autoencoder)."
]
},
{
"cell_type": "markdown",
"id": "06d935a8-5cb4-466b-ad97-aedf12265338",
"metadata": {},
"source": [
"## Convolutional Autoencoder\n",
"Since our inputs are images, it makes sense to use convolutional neural networks (convnets) as encoders and decoders. In practical settings, autoencoders applied to images are always convolutional autoencoders --they simply perform much better.\n",
"\n",
"Let's implement one. The encoder will consist in a stack of Conv2D and MaxPooling2D layers (max pooling being used for spatial down-sampling), while the decoder will consist in a stack of Conv2D and UpSampling2D layers."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5f302f4-45bc-482c-8016-63a0e698d6f6",
"metadata": {},
"outputs": [],
"source": [
"# get initial libs\n",
"import keras\n",
"from keras import layers\n",
"from keras.datasets import mnist\n",
"import numpy as np\n",
"import visualization.image"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e743e762-1c41-4b32-b6db-a63db7c25c7c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"input_img = keras.Input(shape=(28, 28, 1))\n",
"\n",
"x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)\n",
"x = layers.MaxPooling2D((2, 2), padding='same')(x)\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)\n",
"x = layers.MaxPooling2D((2, 2), padding='same')(x)\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)\n",
"encoded = layers.MaxPooling2D((2, 2), padding='same')(x)\n",
"\n",
"# at this point the representation is (4, 4, 8) i.e. 128-dimensional\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)\n",
"x = layers.UpSampling2D((2, 2))(x)\n",
"x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)\n",
"x = layers.UpSampling2D((2, 2))(x)\n",
"x = layers.Conv2D(16, (3, 3), activation='relu')(x)\n",
"x = layers.UpSampling2D((2, 2))(x)\n",
"decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)\n",
"\n",
"autoencoder = keras.Model(input_img, decoded)\n",
"autoencoder.compile(optimizer='adam', loss='binary_crossentropy')"
]
},
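{
"cell_type": "markdown",
"id": "c4d1e8f2-3a5b-4c7d-9e0f-1a2b3c4d5e6f",
"metadata": {},
"source": [
"As a quick sanity check (added; not part of the original article), we can print the model summary to confirm the layer output shapes, in particular the `(4, 4, 8)` encoded representation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2e3f4a5-6b7c-4d8e-9f0a-1b2c3d4e5f6a",
"metadata": {},
"outputs": [],
"source": [
"# confirm layer output shapes, including the 4x4x8 (= 128-dimensional) bottleneck\n",
"autoencoder.summary()"
]
},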
{
"cell_type": "markdown",
"id": "58b56c97-6532-41a2-96bc-a8f589b20a84",
"metadata": {},
"source": [
"To train it, we will use the original MNIST digits with shape (samples, 3, 28, 28), and we will just normalize pixel values between 0 and 1."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96970532-a0b4-40f7-8225-a770bff0d42b",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"(x_train, _), (x_test, _) = mnist.load_data()\n",
"\n",
"x_train = x_train.astype('float32') / 255.\n",
"x_test = x_test.astype('float32') / 255.\n",
"x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))\n",
"x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))"
]
},
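{
"cell_type": "markdown",
"id": "e5f6a7b8-9c0d-4e1f-a2b3-c4d5e6f7a8b9",
"metadata": {},
"source": [
"A quick check (added) that the arrays now have the expected `(samples, 28, 28, 1)` shape and values in `[0, 1]`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6a7b8c9-0d1e-4f2a-b3c4-d5e6f7a8b9c0",
"metadata": {},
"outputs": [],
"source": [
"# verify shapes and normalized value range\n",
"print(x_train.shape, x_test.shape)\n",
"print(x_train.min(), x_train.max())"
]
},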
{
"cell_type": "code",
"execution_count": null,
"id": "2c10a00e-73ad-469e-869b-85f902d09533",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"autoencoder.fit(x_train, x_train,\n",
" epochs=50,\n",
" batch_size=128,\n",
" shuffle=True,\n",
" validation_data=(x_test, x_test))"
]
},
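{
"cell_type": "markdown",
"id": "a7b8c9d0-1e2f-4a3b-c4d5-e6f7a8b9c0d1",
"metadata": {},
"source": [
"To verify the convergence claim below, we can plot the training and validation loss per epoch from the `History` object returned by `fit` above (an added sketch; assumes `matplotlib` is installed)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8c9d0e1-2f3a-4b4c-d5e6-f7a8b9c0d1e2",
"metadata": {},
"outputs": [],
"source": [
"# plot training vs. validation loss per epoch (added sketch)\n",
"import matplotlib.pyplot as plt\n",
"\n",
"plt.plot(history.history['loss'], label='train')\n",
"plt.plot(history.history['val_loss'], label='validation')\n",
"plt.xlabel('epoch')\n",
"plt.ylabel('binary cross-entropy loss')\n",
"plt.legend()\n",
"plt.show()"
]
},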
{
"cell_type": "code",
"execution_count": null,
"id": "9b7e1872-ad71-4d3b-bbd9-b9b6efb09982",
"metadata": {},
"outputs": [],
"source": [
"visualization.image.compare_results(x_test, autoencoder.predict(x_test));"
]
},
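{
"cell_type": "markdown",
"id": "c9d0e1f2-3a4b-4c5d-e6f7-a8b9c0d1e2f3",
"metadata": {},
"source": [
"To inspect the encoded representation directly (an added sketch, not part of the original article), we can build a standalone encoder model that reuses the trained layers and maps each digit to its `(4, 4, 8)` code."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0e1f2a3-4b5c-4d6e-f7a8-b9c0d1e2f3a4",
"metadata": {},
"outputs": [],
"source": [
"# standalone encoder reusing the trained layers (added sketch)\n",
"encoder = keras.Model(input_img, encoded)\n",
"encoded_imgs = encoder.predict(x_test)\n",
"\n",
"# each test digit is compressed to 4*4*8 = 128 values\n",
"print(encoded_imgs.shape)"
]
},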
{
"cell_type": "markdown",
"id": "a1955fa8-c2d9-42dc-98b8-84e82616dc9f",
"metadata": {},
"source": [
"The model converges to a loss of 0.094, significantly better than our previous models (this is in large part due to the higher entropic capacity of the encoded representation, 128 dimensions vs. 32 previously). "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}