@shagunsodhani
Created November 22, 2016 17:07
Notes for DCGAN paper

Deep Convolutional Generative Adversarial Nets

Introduction

  • The paper presents Deep Convolutional Generative Adversarial Nets (DCGAN), a topologically constrained, convolutional variant of the GAN. It is an unconditional model: the generator receives only a noise vector Z.
  • Link to the paper

Benefits

  • Stable to train in most settings.
  • Useful for learning unsupervised image representations.

Model

  • GANs are difficult to scale using CNNs.
  • The paper proposes the following changes to GANs:
    • Replace any pooling layers with strided convolutions (in the discriminator) and fractionally strided convolutions (in the generator).
    • Remove fully connected hidden layers.
    • Use batch normalisation in both generator (all layers except output layer) and discriminator (all layers except input layer).
    • Use LeakyReLU in all layers of the discriminator.
    • Use ReLU activation in all layers of the generator (except output layer which uses Tanh).
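
The effect of replacing pooling with strided and fractionally strided convolutions can be checked with simple size arithmetic. The sketch below uses the standard output-size formulas for a convolution and a transposed convolution (no dilation, no output padding). The 4x4 kernel, stride 2, padding 1, four layers, and the 4x4 to 64x64 resolutions are assumptions chosen for illustration; they are not stated in these notes.

```python
def conv_out(size, kernel, stride, pad):
    """Side length after a strided convolution (downsamples)."""
    return (size + 2 * pad - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, pad):
    """Side length after a fractionally strided (transposed) convolution (upsamples)."""
    return (size - 1) * stride - 2 * pad + kernel


# Assumed layer settings, not taken from the notes.
KERNEL, STRIDE, PAD = 4, 2, 1


def generator_sizes(start=4, layers=4):
    """Spatial sizes as the generator upsamples a small feature map."""
    sizes = [start]
    for _ in range(layers):
        sizes.append(conv_transpose_out(sizes[-1], KERNEL, STRIDE, PAD))
    return sizes


def discriminator_sizes(start=64, layers=4):
    """Spatial sizes as the discriminator downsamples an image."""
    sizes = [start]
    for _ in range(layers):
        sizes.append(conv_out(sizes[-1], KERNEL, STRIDE, PAD))
    return sizes


if __name__ == "__main__":
    print("generator:    ", generator_sizes())
    print("discriminator:", discriminator_sizes())
```

With these assumed settings each layer doubles (generator) or halves (discriminator) the spatial size, so the network learns its own up- and downsampling instead of relying on fixed pooling.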

Datasets

  • Large-Scale Scene Understanding (LSUN).
  • Imagenet-1K.
  • Faces dataset.

Hyperparameters

  • Minibatch SGD with minibatch size of 128.
  • Weights initialized from a zero-centered normal distribution with standard deviation = 0.02.
  • Slope of leak = 0.2 for LeakyReLU.
  • Adam optimizer with learning rate = 0.0002 and β1 = 0.5.
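
To make the hyperparameters concrete, here is a minimal pure-Python sketch of the weight initialisation, the LeakyReLU, and a single Adam update using the values listed above. β2 = 0.999 and ε = 1e-8 are assumptions (the usual Adam defaults); the notes do not give them. The function names are hypothetical.

```python
import math
import random

LEARNING_RATE = 0.0002  # from the notes
BETA1 = 0.5             # from the notes
BETA2 = 0.999           # assumption: common Adam default, not in the notes
EPS = 1e-8              # assumption: common Adam default, not in the notes
LEAK = 0.2              # LeakyReLU slope from the notes
INIT_STD = 0.02         # weight-init standard deviation from the notes


def init_weights(n, rng):
    """Draw n weights from a zero-centered normal distribution."""
    return [rng.gauss(0.0, INIT_STD) for _ in range(n)]


def leaky_relu(x):
    """Identity for positive inputs, small slope for negative inputs."""
    return x if x > 0 else LEAK * x


def adam_step(params, grads, m, v, t):
    """Apply one Adam update; t is the 1-based step count.

    Returns new (params, m, v) lists and leaves the inputs unchanged.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = BETA1 * m_i + (1 - BETA1) * g        # first-moment estimate
        v_i = BETA2 * v_i + (1 - BETA2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - BETA1 ** t)             # bias correction
        v_hat = v_i / (1 - BETA2 ** t)
        new_params.append(p - LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPS))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


if __name__ == "__main__":
    rng = random.Random(0)
    weights = init_weights(5, rng)
    grads = [1.0, -1.0, 0.5, -0.5, 2.0]  # made-up gradients for illustration
    weights, m, v = adam_step(weights, grads, [0.0] * 5, [0.0] * 5, t=1)
    print(weights)
```

On the first step the bias-corrected update has magnitude close to the learning rate regardless of the gradient's scale, which is why each weight moves by roughly 0.0002.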

Observations

  • Large-Scale Scene Understanding (LSUN) data
    • Demonstrates that the model scales with more data and higher-resolution generation.
    • The model is unlikely to have memorized training images, given the small learning rate and minibatch SGD.
  • Classifying CIFAR-10 dataset
    • Features
      • Train on Imagenet-1K and test on CIFAR-10.
      • Max pool discriminator's convolutional features (from all layers) to get 4x4 spatial grids.
      • Flatten and concatenate to get a 28672-dimensional vector.
      • Linear L2-SVM classifier trained over the feature vector.
    • 82.8% accuracy, outperforming the K-means baseline (80.6%).
  • Street View House Number Classifier
    • Similar pipeline to CIFAR-10
    • 22.48% test error.
  • The paper contains many examples of images generated by final and intermediate layers of the network.
  • Interpolating between points in the latent space produces smooth rather than sharp transitions between images, indicating that the network did not memorize images.
  • DCGAN can learn an interesting hierarchy of features.
  • The network seems to have some success in disentangling scene representation from object representation.
  • Vector arithmetic can be performed on the Z vectors corresponding to face samples, giving visual results such as: smiling woman - neutral woman + neutral man = smiling man.
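
The CIFAR-10 feature pipeline described above (max-pool each layer's convolutional features to a 4x4 grid, flatten, concatenate) can be sketched in pure Python. The toy channel counts and map sizes below are made up for illustration and are not the paper's layer sizes; the L2-SVM step is omitted. Note that a 28672-dimensional vector with 4x4 grids implies 28672 / 16 = 1792 channels in total across the pooled layers.

```python
def max_pool_to_grid(channel, grid=4):
    """Max-pool one square H x W feature map down to grid x grid.

    Assumes H == W and H is divisible by grid.
    """
    size = len(channel)
    assert size % grid == 0, "feature map size must be divisible by grid"
    step = size // grid
    return [
        [
            max(
                channel[r][c]
                for r in range(i * step, (i + 1) * step)
                for c in range(j * step, (j + 1) * step)
            )
            for j in range(grid)
        ]
        for i in range(grid)
    ]


def layer_features(feature_maps, grid=4):
    """Pool every channel of one layer and flatten the result."""
    flat = []
    for channel in feature_maps:
        for row in max_pool_to_grid(channel, grid):
            flat.extend(row)
    return flat


def image_features(layers, grid=4):
    """Concatenate the pooled, flattened features of all layers."""
    features = []
    for feature_maps in layers:
        features.extend(layer_features(feature_maps, grid))
    return features


def toy_layer(channels, size):
    """Hypothetical stand-in for a discriminator layer's activations."""
    return [
        [[float(ch + r * size + c) for c in range(size)] for r in range(size)]
        for ch in range(channels)
    ]


if __name__ == "__main__":
    layers = [toy_layer(2, 16), toy_layer(4, 8), toy_layer(8, 4)]
    features = image_features(layers)
    # Each channel contributes 4 * 4 = 16 values.
    print(len(features), "features from", 2 + 4 + 8, "channels")
```

The resulting fixed-length vector is what a linear classifier would be trained on; because every layer is pooled to the same grid, layers with different spatial sizes can be concatenated directly.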
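
The Z-vector arithmetic above amounts to element-wise addition and subtraction of latent codes. A minimal sketch follows, assuming a 100-dimensional Z sampled uniformly from [-1, 1] and three exemplar vectors averaged per concept; both choices are assumptions not stated in these notes. In practice the exemplars would be Z vectors whose generated samples show the concept, whereas here they are random placeholders, and the trained generator that would turn the result into an image is not included.

```python
import random

Z_DIM = 100        # assumption: latent dimensionality, not in the notes
N_EXEMPLARS = 3    # assumption: exemplars averaged per concept


def sample_z(rng, dim=Z_DIM):
    """Sample one latent vector uniformly from [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def analogy(a, b, c):
    """Compute a - b + c element-wise."""
    return [x - y + z for x, y, z in zip(a, b, c)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder exemplars; real ones would be picked by inspecting samples.
    smiling_woman = mean_vector([sample_z(rng) for _ in range(N_EXEMPLARS)])
    neutral_woman = mean_vector([sample_z(rng) for _ in range(N_EXEMPLARS)])
    neutral_man = mean_vector([sample_z(rng) for _ in range(N_EXEMPLARS)])

    # smiling woman - neutral woman + neutral man
    smiling_man_z = analogy(smiling_woman, neutral_woman, neutral_man)
    print(len(smiling_man_z))  # this vector would be fed to the generator
```

Averaging several exemplars before doing the arithmetic is a way to reduce the noise of any single latent vector.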