Notes from the StyleGAN2 paper

Analyzing and Improving the Image Quality of StyleGAN

1. Introduction

  • the original StyleGAN is distinctive in that it maps the input latent code z to an intermediate latent code w, which is applied to the generator through AdaIN layers
  • stochastic variation helps the intermediate latent space W to be less entangled
  • this paper investigates and fixes:
    • a. the droplet artifacts of the original StyleGAN, via a redesigned normalization in the generator
    • b. artifacts caused by the progressive growing design, via a mix of skip connections and residual networks
  • FID and Precision & Recall (P&R) are useful, but both are based on classifier networks that have been shown to focus on textures rather than shapes
  • the PPL metric correlates with the consistency and stability of shapes; using it as a regularizer is expensive, so the regularization is executed less frequently (a rough sketch of the PPL estimate follows this list)
  • an inverse projection from image to latent space helps to identify generated images
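
A rough sketch of how the PPL estimate in W space could look; the synthesis network g, mapping network mapping, and perceptual distance d_lpips are assumed to exist, and all names and signatures here are placeholders rather than APIs from the paper's code:

```python
import torch

def estimate_ppl(g, mapping, d_lpips, n_samples=1000, eps=1e-4, z_dim=512):
    """Average scaled perceptual distance between nearby points on interpolation paths in W."""
    dists = []
    for _ in range(n_samples):
        z1, z2 = torch.randn(2, 1, z_dim)
        w1, w2 = mapping(z1), mapping(z2)
        t = torch.rand(1).item()
        # small step along the interpolation path in W (lerp)
        wa = torch.lerp(w1, w2, t)
        wb = torch.lerp(w1, w2, t + eps)
        # scaled perceptual distance approximates the local path length
        dists.append(float(d_lpips(g(wa), g(wb))) / eps ** 2)
    return sum(dists) / len(dists)
```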

2. Removing normalization artifacts (key idea)

  • the droplet-shaped artifacts are suspected to be caused by AdaIN, which normalizes the mean and variance of each feature map separately, potentially destroying information the feature maps carry relative to one another
  • 2.2 Instance normalization revisited
    • simply removing the normalization makes the style effects cumulative rather than scale-specific
    • alternative: base the normalization on the expected statistics of the incoming feature maps, without explicitly forcing them
    • inside each style block: instead of scaling the incoming feature maps (AdaIN), scale the convolution weights (modulation)
    • observation: assuming unit-variance inputs, the standard deviation of the output activations after modulation and convolution equals the L2 norm of the corresponding modulated weights; the following normalization (demodulation) can thus be baked into the scaled weights
    • thus the style and the normalization of an entire style block are baked into a single conv operation with scaled weights (see the sketch below)
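
A minimal sketch of how modulation and demodulation could be folded into a single grouped convolution, written in a PyTorch style; the function name and signature are illustrative assumptions, not taken from the official implementation:

```python
import torch
import torch.nn.functional as F

def modulated_conv2d(x, weight, style, demodulate=True, eps=1e-8):
    """x: (N, C_in, H, W), weight: (C_out, C_in, k, k), style: (N, C_in)."""
    N, C_in, H, W = x.shape
    C_out, _, k, _ = weight.shape

    # Modulation: scale the weights by the per-sample style of each input feature map.
    w = weight.unsqueeze(0) * style.view(N, 1, C_in, 1, 1)   # (N, C_out, C_in, k, k)

    if demodulate:
        # Demodulation: normalize by the L2 norm over (C_in, k, k) so the
        # output activations return to unit standard deviation in expectation.
        d = torch.rsqrt((w ** 2).sum(dim=(2, 3, 4), keepdim=True) + eps)
        w = w * d

    # Grouped-convolution trick: fold the batch into the channel dimension
    # so each sample is convolved with its own modulated weights.
    x = x.reshape(1, N * C_in, H, W)
    w = w.reshape(N * C_out, C_in, k, k)
    out = F.conv2d(x, w, padding=k // 2, groups=N)
    return out.reshape(N, C_out, H, W)
```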

4. Progressive growing revisited

  • progressive growing causes a strong location preference for details (e.g. features like teeth and eyes stick to fixed pixel locations)
  • excessively high frequencies in the intermediate layers compromise shift invariance

4.1 Alternative network architectures

  • drop progressive growing; instead, adapt ideas from MSG-GAN and LAPGAN
  • the MSG-GAN generator outputs a mipmap instead of a single image
  • a simplified skip-connection variant upsamples (bilinear filtering) and sums the contributions of the RGB outputs at each resolution (see the sketch below)
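
A minimal sketch of such a skip-connection output head, in a PyTorch style; the class name, channel counts, and layer layout are illustrative assumptions, not the official architecture code:

```python
import torch.nn as nn
import torch.nn.functional as F

class SkipGeneratorHead(nn.Module):
    def __init__(self, channels_per_res):
        super().__init__()
        # one 1x1 "toRGB" conv per resolution; channel counts are illustrative
        self.to_rgb = nn.ModuleList([nn.Conv2d(c, 3, kernel_size=1) for c in channels_per_res])

    def forward(self, features):
        # features: list of feature maps, lowest resolution first, each 2x the previous
        rgb = None
        for feat, to_rgb in zip(features, self.to_rgb):
            y = to_rgb(feat)
            if rgb is None:
                rgb = y
            else:
                # upsample the running RGB sum to the current resolution and add this level's contribution
                rgb = F.interpolate(rgb, scale_factor=2, mode="bilinear", align_corners=False) + y
        return rgb
```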