Create a gist now

Instantly share code, notes, and snippets.

Embed
Notes on Fractional Max-Pooling

Fractional Max-Pooling (FMP)

Introduction

  • Link to Paper
  • Spatial pooling layers are building blocks for Convolutional Neural Networks (CNNs).
  • Input to pooling operation is a Nin x Nin matrix and output is a smaller matrix Nout x Nout.
  • Pooling operation divides Nin x Nin square into N2out pooling regions Pi, j.
  • Pi, j ⊂ {1, 2, . . . , Nin} ∀ (i, j) ∈ {1, . . . , Nout}2

MP2

  • Refers to 2x2 max-pooling layer.
  • Popular choice for max-pooling operation.

Advantages of MP2

  • Fast.
  • Quickly reduces the size of the hidden layer.
  • Encodes a degree of invariance with respect to translations and elastic distortions.

Issues with MP2

  • Disjoint nature of pooling regions.
  • Since size decreases rapidly, stacks of back-to-back CNNs are needed to build deep networks.

FMP

  • Reduces the spatial size of the image by a factor of α, where α ∈ (1, 2).
  • Introduces randomness in terms of choice of pooling region.
  • Pooling regions can be chosen in a random or pseudorandom manner.
  • Pooling regions can be disjoint or overlapping.

Generating Pooling Regions

  • Let ai and bi be 2 increasing sequences of integers, starting at 1 and ending at Nin.
  • Increments are either 1 or 2.
  • For disjoint regions, P = [ai−1, ai − 1] × [bj−1, bj − 1]
  • For overlapping regions, P = [ai−1, ai] × [bj−1, bj1]
  • Pooling regions can be generated randomly by choosing the increment randomly at each step.
  • To generate pooling regions in a peusdorandom manner, choose ai = ceil(α*(i+u)), where α ∈ (1, 2) with some u ∈ (0, 1).
  • Each FMP layer uses a different pair of sequence.
  • An FMP network can be thought of as an ensemble of similar networks, with each different pooling-region configuration defining a different member of the ensemble.

Observations

  • Random FMP is good on its own but may underfit when combined with dropout or training data augmentation.
  • Pseudorandom approach generates more stable pooling regions.
  • Overlapping FMP performs better than disjoint FMP.

Weakness

  • No justification is provided for the observations mentioned above.
  • It needs to be seen how performance is affected if the pooling layer in architectures like GoogLeNet.
@Ashu9808

This comment has been minimized.

Show comment
Hide comment
@Ashu9808

Ashu9808 Jun 8, 2018

Can u explain with an example how it generates different pooling regions in a pseudorandom manner?

Ashu9808 commented Jun 8, 2018

Can u explain with an example how it generates different pooling regions in a pseudorandom manner?

@shagunsodhani

This comment has been minimized.

Show comment
Hide comment
@shagunsodhani

shagunsodhani Jun 9, 2018

To take an example, suppose you have a 100x100 grid. You start at the coordinates (0, 0) and maintain two arrays, one for each dimension of the original grid. The values in these arrays would define our pooling regions. Since the increments can either be 1 or 2, the problem is how to generate each of these arrays (basically how to select between the two possible values of increment for each array).

One approach is where we toss a coin every time and use the result of the coin to generate a random sequence of 1's and 2's. This sequence becomes is used to set the values of the two arrays. Since the sequence itself was generated randomly, we call it the random case.

In the Pseduo-random case, we generate the elements (for the arrays mentioned in the first paragraph) in a pseudo-random manner by using the equation:
ith element of array = ceiling(α(i + u)) where α ∈ (1, 2), with some u ∈ (0, 1). Note that α is same as the pooling factor. Also, in this case, we directly generate the elements for the array and not for the sequence.

Hope that helps.

Owner

shagunsodhani commented Jun 9, 2018

To take an example, suppose you have a 100x100 grid. You start at the coordinates (0, 0) and maintain two arrays, one for each dimension of the original grid. The values in these arrays would define our pooling regions. Since the increments can either be 1 or 2, the problem is how to generate each of these arrays (basically how to select between the two possible values of increment for each array).

One approach is where we toss a coin every time and use the result of the coin to generate a random sequence of 1's and 2's. This sequence becomes is used to set the values of the two arrays. Since the sequence itself was generated randomly, we call it the random case.

In the Pseduo-random case, we generate the elements (for the arrays mentioned in the first paragraph) in a pseudo-random manner by using the equation:
ith element of array = ceiling(α(i + u)) where α ∈ (1, 2), with some u ∈ (0, 1). Note that α is same as the pooling factor. Also, in this case, we directly generate the elements for the array and not for the sequence.

Hope that helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment