YimianDai/SSDAnchorGenerator.md

## SSDAnchorGenerator.md

      
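As the name suggests, this code is a bounding box anchor generator. What is an anchor? It is a box tiled across the feature map, regardless of whether it contains an object. A box is represented as [cx, cy, w, h]. Once the anchor sizes and ratios are known (together with the step and offsets), the anchor at the top-left pixel is fully determined, something like [11, 11, 21, 21]. The only unknown is where to stop (the cx, cy of the bottom-right anchor), and that depends on the size of the incoming feature map.

The idea in this code is to generate a sufficiently large anchor map up front (128 x 128) and assume that no feature map is larger than 128 x 128. Then it only has to crop a feature-map-sized anchor map, starting from the top-left corner of the preset one. That is the whole idea behind SSDAnchorGenerator.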
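
The "preallocate once, crop later" idea can be sketched in plain NumPy. This is only an illustration, not the library's implementation: `STEP`, `OFFSET`, and the helper names `make_center_grid` and `crop_to_feature` are assumptions made up for the example, and only the anchor centers are generated to keep it short.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
ALLOC_SIZE = (128, 128)  # preset anchor map, assumed at least as large as any feature map
STEP = 8                 # assumed stride of the feature map, in input pixels
OFFSET = 0.5             # assumed offset that puts each center in the middle of its cell


def make_center_grid(alloc_size, step, offset):
    """Return an (H, W, 2) array holding the (cx, cy) of every cell."""
    ys = (np.arange(alloc_size[0]) + offset) * step
    xs = (np.arange(alloc_size[1]) + offset) * step
    cx, cy = np.meshgrid(xs, ys)  # both have shape (H, W)
    return np.stack([cx, cy], axis=-1)


def crop_to_feature(grid, feat_h, feat_w):
    """Crop the preset grid from its top-left corner to the feature map size."""
    return grid[:feat_h, :feat_w]


full_grid = make_center_grid(ALLOC_SIZE, STEP, OFFSET)  # built once
small = crop_to_feature(full_grid, 38, 38)              # cropped per feature map
direct = make_center_grid((38, 38), STEP, OFFSET)       # built from scratch, for comparison

print(small.shape)                    # (38, 38, 2)
print(np.array_equal(small, direct))  # True
```

Cropping gives the same result as generating a 38 x 38 grid directly, because each center depends only on its own (i, j) index and not on the total grid size.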
### `__init__`

The following line in `__init__` generates the preset 128 x 128 anchor map:

        anchors = self._generate_anchors(self._sizes, self._ratios, step, alloc_size, offsets)
### `_generate_anchors`

    def _generate_anchors(self, sizes, ratios, step, alloc_size, offsets):
        """Generate anchors for once. Anchors are stored with (center_x, center_y, w, h) format."""
        assert len(sizes) == 2, "SSD requires sizes to be (size_min, size_max)"
        anchors = []
        for i in range(alloc_size[0]):
            for j in range(alloc_size[1]):
                cy = (i + offsets[0]) * step
                cx = (j + offsets[1]) * step
                # ratio = ratios[0], size = size_min or sqrt(size_min * size_max)
                r = ratios[0]
                anchors.append([cx, cy, sizes[0], sizes[0]])
                anchors.append([cx, cy, sizes[1], sizes[1]])
                # size = sizes[0], ratio = ...
                for r in ratios[1:]:
                    sr = np.sqrt(r)
                    w = sizes[0] * sr
                    h = sizes[0] / sr
                    anchors.append([cx, cy, w, h])
        return np.array(anchors).reshape(1, 1, alloc_size[0], alloc_size[1], -1)

`anchors` is a list of lists: it holds alloc_size[0] * alloc_size[1] * num_depth lists of the form [cx, cy, w, h].
`np.array(anchors)` is therefore a NumPy ndarray of shape (alloc_size[0] * alloc_size[1] * num_depth, 4).
After `np.array(anchors).reshape(1, 1, alloc_size[0], alloc_size[1], -1)`, the result is a NumPy ndarray of shape (1, 1, alloc_size[0], alloc_size[1], num_depth x 4).
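
The three shapes above can be checked on a tiny example. This is a sketch with made-up numbers: `alloc_size = (3, 5)` and `num_depth = 4` are hypothetical, and each fake anchor stores its pixel column, row, and per-pixel index instead of real coordinates so that the layout after the reshape is easy to read.

```python
import numpy as np

alloc_size = (3, 5)  # tiny hypothetical anchor map (rows, cols)
num_depth = 4        # hypothetical number of anchors per pixel

# Stand-in for the nested loop: one 4-element list per anchor.
# The values [j, i, k, k] are placeholders, not real [cx, cy, w, h].
anchors = []
for i in range(alloc_size[0]):
    for j in range(alloc_size[1]):
        for k in range(num_depth):
            anchors.append([j, i, k, k])

flat = np.array(anchors)
reshaped = flat.reshape(1, 1, alloc_size[0], alloc_size[1], -1)

print(len(anchors))    # 60
print(flat.shape)      # (60, 4)
print(reshaped.shape)  # (1, 1, 3, 5, 16)

# All num_depth anchors of pixel (row 2, col 3) sit side by side in the last axis.
print(reshaped[0, 0, 2, 3].tolist())
# [3, 2, 0, 0, 3, 2, 1, 1, 3, 2, 2, 2, 3, 2, 3, 3]
```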

### `num_depth`

    @property
    def num_depth(self):
        """Number of anchors at each pixel."""
        return len(self._sizes) + len(self._ratios) - 1

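This is the number of anchors attached to each point of the feature map. It matches the loop in `_generate_anchors`: two square anchors (one per size) plus one anchor for each ratio after the first.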
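
To make the count concrete, the sketch below lists the (w, h) pairs produced for a single pixel, mirroring the loop body quoted earlier, and compares their number with the `num_depth` formula. The function `per_pixel_shapes` and the values of `sizes` and `ratios` are assumptions chosen for illustration.

```python
import math


def per_pixel_shapes(sizes, ratios):
    """(w, h) of every anchor at one pixel, following the quoted loop body."""
    shapes = [(sizes[0], sizes[0]), (sizes[1], sizes[1])]  # the two square anchors
    for r in ratios[1:]:                                   # remaining aspect ratios
        sr = math.sqrt(r)
        shapes.append((sizes[0] * sr, sizes[0] / sr))
    return shapes


def num_depth(sizes, ratios):
    """Same formula as the num_depth property."""
    return len(sizes) + len(ratios) - 1


sizes, ratios = (30, 60), (1, 2, 0.5)  # hypothetical values
shapes = per_pixel_shapes(sizes, ratios)

for w, h in shapes:
    print(f"w={w:.2f}, h={h:.2f}")

print(len(shapes) == num_depth(sizes, ratios))  # True
```

Each non-square anchor keeps the area of the `sizes[0]` square, since w * h = sizes[0] * sr * sizes[0] / sr.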

### `hybrid_forward`

    def hybrid_forward(self, F, x, anchors):

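The input `x` is an MXNet NDArray of shape (B, C, H, W).
`anchors` has shape (1, 1, alloc_size[0], alloc_size[1], num_depth x 4). It was built as a NumPy ndarray in `_generate_anchors`, but because it reaches `hybrid_forward` as a block parameter, here it is an MXNet NDArray rather than a NumPy array.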

The following line in `hybrid_forward`

        a = F.slice_like(anchors, x * 0, axes=(2, 3))

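crops, from the preset (1, 1, alloc_size[0], alloc_size[1], num_depth x 4) anchor map, an anchor map whose spatial size matches the feature map. That is why `axes` is (2, 3): those are the height and width axes of both tensors.
The resulting `a` is an MXNet NDArray of shape (1, 1, H, W, num_depth x 4).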
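
The cropping step can be mimicked in NumPy to see the shapes without installing MXNet. `slice_like_np` below is a hypothetical stand-in written for this note, not an MXNet function. It only reproduces the behavior used here: along each listed axis, the first array is cut down to the length of the second array along the same axis.

```python
import numpy as np


def slice_like_np(data, shape_like, axes):
    """Crop `data` to the size of `shape_like` along `axes`, starting at index 0."""
    index = [slice(None)] * data.ndim
    for ax in axes:
        index[ax] = slice(0, shape_like.shape[ax])
    return data[tuple(index)]


num_depth = 4                                      # hypothetical
anchors = np.zeros((1, 1, 128, 128, num_depth * 4), dtype=np.float32)
x = np.zeros((2, 256, 38, 38), dtype=np.float32)   # feature map, (B, C, H, W)

a = slice_like_np(anchors, x, axes=(2, 3))
print(a.shape)  # (1, 1, 38, 38, 16)
```

Only axes 2 and 3 are cropped, so the batch and channel sizes of `x` (2 and 256 here) never affect the anchor map.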

        a = a.reshape((1, -1, 4))

`a` is now an MXNet NDArray of shape (1, H x W x num_depth, 4), with one [cx, cy, w, h] box per row.
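
A small NumPy sketch shows where each anchor ends up after this reshape. It assumes row-major ordering and uses made-up sizes (`H`, `W`, `num_depth`) with `np.arange` values as placeholders for real boxes.

```python
import numpy as np

H, W, num_depth = 2, 3, 4  # hypothetical sizes

# Placeholder anchor map with distinct values so rows can be traced.
a = np.arange(H * W * num_depth * 4, dtype=np.float32)
a = a.reshape(1, 1, H, W, num_depth * 4)

boxes = a.reshape((1, -1, 4))
print(boxes.shape)  # (1, 24, 4)

# Anchor k of pixel (i, j) lands in row (i * W + j) * num_depth + k.
i, j, k = 1, 2, 3
row = (i * W + j) * num_depth + k
print(row)  # 23
print(np.array_equal(boxes[0, row], a[0, 0, i, j, 4 * k:4 * k + 4]))  # True
```

The boxes are ordered pixel by pixel, row first, with all anchors of one pixel stored consecutively.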

        if self._clip:
            cx, cy, cw, ch = a.split(axis=-1, num_outputs=4)
            H, W = self._im_size
            a = F.concat(*[cx.clip(0, W), cy.clip(0, H), cw.clip(0, W), ch.clip(0, H)], dim=-1)

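Since `clip` defaults to False, the block above never runs. The `im_size` passed in, which is the data-shape from the train / eval script and the `base_size` in `get_ssd`, therefore has no effect here: the size of the returned anchor map depends only on the size of the incoming feature map. So as far as anchor generation is concerned, the `base_size` argument of `get_ssd` (the function that builds the SSD network) can be ignored unless `clip` is turned on.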
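
The effect of the `clip` flag can be reproduced with a NumPy sketch of the same split / clip / concat steps. The helper names `clip_anchors` and `finalize` and the sample boxes are hypothetical. The point is only that `im_size` is read inside the clipping branch and nowhere else.

```python
import numpy as np


def clip_anchors(a, im_size):
    """a: (1, N, 4) array of [cx, cy, w, h]; im_size: (H, W)."""
    cx, cy, cw, ch = np.split(a, 4, axis=-1)
    H, W = im_size
    return np.concatenate(
        [cx.clip(0, W), cy.clip(0, H), cw.clip(0, W), ch.clip(0, H)], axis=-1)


def finalize(a, clip, im_size):
    """Mirror of the `if self._clip:` branch: im_size matters only when clip is True."""
    return clip_anchors(a, im_size) if clip else a


a = np.array([[[-5.0, 10.0, 400.0, 20.0],
               [150.0, 350.0, 30.0, 500.0]]])

clipped = finalize(a, clip=True, im_size=(300, 300))
print(clipped[0].tolist())
# [[0.0, 10.0, 300.0, 20.0], [150.0, 300.0, 30.0, 300.0]]

# With clip=False the result is identical for any im_size.
same = np.array_equal(finalize(a, False, (300, 300)), finalize(a, False, (512, 512)))
print(same)  # True
```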