@YimianDai
Last active April 8, 2021 08:47
Enhancement of SSD by concatenating feature maps for object detection

Drawbacks of SSD:

  1. Each layer in the feature pyramid is used independently, so the same object can be detected at multiple scales.
    • In detail: suppose a certain position of a feature map in a lower layer (say, Conv4-3) is activated. This information can propagate through all scales up to the last layer (Conv11-2), which means that the corresponding positions in the higher layers have a good chance of being activated as well. SSD currently has no constraint that accounts for this.
  2. Small objects are not detected well.

How to improve SSD accordingly:

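  1. For drawback 1: consider the relationship between layers in the feature pyramid.
  2. For drawback 2: improve the abstract representation power of the features. The authors do this by increasing the number of channels in the feature pyramid (the feature pyramid, not the base network; see the SSD architecture figure below for the difference between the two).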

Jeong17EnhancementSSD-Fig-1

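  1. Via pooling: feature maps in the lower layers are concatenated to those of the upper layers through pooling. This is the simplest option to understand: the lower-layer features are pooled and then stacked directly with the upper-layer features.
  2. Via deconvolution: the paper describes this as concatenating the feature maps of the upper layers to the lower-layer features through deconvolution or upsampling. I think this is roughly the same structure as DSSD, except that DSSD uses an element-wise sum or element-wise product whereas here it should be concatenation, which is essentially what U-Net does.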
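The two concatenation directions above can be sketched on toy data. This is a minimal illustration in plain Python, not the paper's implementation: the helper names (`max_pool_2x2`, `upsample_2x`, `concat_channels`) are hypothetical, a feature map is represented as a list of channels (each a list of rows), and nearest-neighbor upsampling is used as a stand-in for the learned deconvolution.

```python
def max_pool_2x2(channel):
    """2x2 max pooling with stride 2 on a single channel (list of rows)."""
    h, w = len(channel), len(channel[0])
    return [
        [
            max(channel[i][j], channel[i][j + 1],
                channel[i + 1][j], channel[i + 1][j + 1])
            for j in range(0, w - 1, 2)
        ]
        for i in range(0, h - 1, 2)
    ]


def upsample_2x(channel):
    """Nearest-neighbor 2x upsampling (assumed stand-in for deconvolution)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def concat_channels(a, b):
    """Channel-wise concatenation of two feature maps with equal spatial size."""
    assert len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0])
    return a + b


# Toy feature maps: one 4x4 lower-layer channel, one 2x2 upper-layer channel.
lower = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
upper = [[[100, 200], [300, 400]]]

# 1. Pooling: bring the lower layer down to 2x2, then stack it on the upper layer.
pooled_lower = [max_pool_2x2(c) for c in lower]
upper_concat = concat_channels(upper, pooled_lower)

# 2. Deconvolution/upsampling: bring the upper layer up to 4x4, then stack it.
upsampled_upper = [upsample_2x(c) for c in upper]
lower_concat = concat_channels(lower, upsampled_upper)
```

In both cases the spatial size of the target layer is unchanged and only its channel count grows, which is what distinguishes concatenation from the element-wise sum or product used in DSSD.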

Jeong17EnhancementSSD-Fig-3-2

Question: how do the authors guarantee that, with this method, different layers will not still end up predicting the same object?

Jeong17EnhancementSSD-Fig-3-1

The Rainbow concatenation proposed in this paper performs pooling and deconvolution simultaneously to create feature maps with an explicit relationship between different layers.
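As a rough sketch of the bookkeeping this implies, the snippet below lists, for each pyramid layer, which other layers would be pooled down or deconvolved up into it and how many channels result. It only tracks shapes and does no real convolution. The layer names, spatial sizes, and channel counts are an assumption (the standard SSD300 configuration, not stated in these notes), and `rainbow_shapes` is a hypothetical helper.

```python
# Assumed standard SSD300 feature pyramid: (layer name, spatial size, channels).
SSD300_PYRAMID = [
    ("conv4_3", 38, 512),
    ("conv7", 19, 1024),
    ("conv8_2", 10, 512),
    ("conv9_2", 5, 256),
    ("conv10_2", 3, 256),
    ("conv11_2", 1, 256),
]


def rainbow_shapes(pyramid):
    """Describe each layer after pooling and deconvolution are applied together.

    For target layer i, every lower layer (larger spatial size) is pooled down
    to its size and every upper layer (smaller spatial size) is deconvolved up
    to its size; all of them are then concatenated along the channel axis.
    """
    total_channels = sum(ch for _, _, ch in pyramid)
    result = []
    for i, (name, size, _) in enumerate(pyramid):
        sources = []
        for j, (src_name, _, src_ch) in enumerate(pyramid):
            if j < i:
                how = "pool"      # lower layer, shrink to the target size
            elif j > i:
                how = "deconv"    # upper layer, enlarge to the target size
            else:
                how = "identity"  # the layer's own features
            sources.append((src_name, how, src_ch))
        result.append({
            "layer": name,
            "size": size,
            "channels": total_channels,
            "sources": sources,
        })
    return result


shapes = rainbow_shapes(SSD300_PYRAMID)
for s in shapes:
    print(s["layer"], f'{s["size"]}x{s["size"]}', s["channels"])
```

Under this assumed configuration every layer ends up with the same channel count (512 + 1024 + 512 + 256 + 256 + 256 = 2816), so each scale sees features from all the others, and the channel count of the feature pyramid increases as described for drawback 2.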
