PaddleDetection Algorithm Analysis (8)
2021SC@SDUSC
Continuing from the previous post, this part carries on the analysis of torchvision's Faster R-CNN ResNet-50 FPN.
FPN
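FPN, short for Feature Pyramid Networks, is a multi-scale deep network with a pyramid structure. At the time it was published, Faster R-CNN with FPN outperformed most single-model entries, including the winning models of the COCO 2016 challenge. Its main strength is detecting small objects.
FPN code walkthrough
torchvision ships the complete source code of ResNet-50 FPN (the code referenced here is from torchvision 0.7.0). The walkthrough below follows that implementation and, to keep the explanation smooth, uses the ResNet-50 layer names and the corresponding parameters wherever possible:
FPN structure: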
(fpn): FeaturePyramidNetwork(
  (inner_blocks): ModuleList(
    (0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (2): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (3): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
  )
  (layer_blocks): ModuleList(
    (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  )
  (extra_blocks): LastLevelMaxPool()
)
The forward method below shows how FPN processes its input:
class FeaturePyramidNetwork(nn.Module):
    ......

    def forward(self, x):
        # type: (Dict[str, Tensor]) -> Dict[str, Tensor]
        """
        Computes the FPN for a set of feature maps.

        Arguments:
            x (OrderedDict[Tensor]): feature maps for each feature level.

        Returns:
            results (OrderedDict[Tensor]): feature maps after FPN layers.
                They are ordered from highest resolution first.
        """
        # unpack OrderedDict into two lists for easier handling
        names = list(x.keys())
        x = list(x.values())

        last_inner = self.get_result_from_inner_blocks(x[-1], -1)
        results = []
        results.append(self.get_result_from_layer_blocks(last_inner, -1))

        for idx in range(len(x) - 2, -1, -1):
            inner_lateral = self.get_result_from_inner_blocks(x[idx], idx)
            feat_shape = inner_lateral.shape[-2:]
            inner_top_down = F.interpolate(last_inner, size=feat_shape, mode="nearest")
            last_inner = inner_lateral + inner_top_down
            results.insert(0, self.get_result_from_layer_blocks(last_inner, idx))

        if self.extra_blocks is not None:
            results, names = self.extra_blocks(results, x, names)

        # make it back an OrderedDict
        out = OrderedDict([(k, v) for k, v in zip(names, results)])

        return out
One detail worth pointing out in forward is how 2x upsampling is implemented in PyTorch:
F.interpolate(last_inner, size=feat_shape, mode="nearest")
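Here feat_shape is the spatial size (height, width) of the lateral feature map at the current level, which is the shape the coarser map has after 2x upsampling.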
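To make the behaviour of this call concrete, here is a minimal sketch on a tiny tensor. The tensor values and sizes are made up for illustration. It also shows why passing `size=` is convenient: when the finer map has an odd side length, a fixed scale factor of 2 would not match it, while `size=` always does.

```python
import torch
import torch.nn.functional as F

# A tiny 1x1x2x2 "feature map" so the copied values are easy to follow.
coarse = torch.arange(4.0).reshape(1, 1, 2, 2)

# Target size taken from the finer level, as feat_shape is in forward().
feat_shape = (4, 4)
up = F.interpolate(coarse, size=feat_shape, mode="nearest")
print(up[0, 0])
# tensor([[0., 0., 1., 1.],
#         [0., 0., 1., 1.],
#         [2., 2., 3., 3.],
#         [2., 2., 3., 3.]])

# Odd sizes: a 25x25 map downsampled with stride 2 becomes 13x13, and
# doubling 13 gives 26, not 25. size= restores the exact lateral shape.
odd = F.interpolate(torch.rand(1, 256, 13, 13), size=(25, 25), mode="nearest")
print(odd.shape)  # torch.Size([1, 256, 25, 25])
```

Nearest-neighbour mode simply copies each coarse value into the matching block of the finer grid, so it adds no learnable parameters.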
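Another thing worth pointing out is results: it stores the output of each layer_blocks convolution, and these feature maps are then fed to the RPN for foreground/background binary classification and bounding-box regression. The top (coarsest) level handles large objects, and the lower the level, the smaller the objects it detects.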
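The relation between pyramid level and object size can be sketched with a little arithmetic. The values below are assumptions for illustration: the strides follow from ResNet-50 plus the extra max-pool level, the anchor sizes are the per-level defaults used by torchvision's `fasterrcnn_resnet50_fpn`, and the level names P2 to P5 are only labels. The feature map side is approximated with a ceiling division rather than the exact convolution arithmetic.

```python
import math

# Assumed values: strides of the five FPN outputs and the default per-level
# anchor sizes of torchvision's fasterrcnn_resnet50_fpn.
LEVELS = ["P2", "P3", "P4", "P5", "pool"]  # illustrative level names
STRIDES = [4, 8, 16, 32, 64]
ANCHOR_SIZES = [32, 64, 128, 256, 512]


def pyramid_summary(image_size):
    """Return (level, stride, approximate feature map side, anchor size) rows."""
    rows = []
    for level, stride, anchor in zip(LEVELS, STRIDES, ANCHOR_SIZES):
        side = math.ceil(image_size / stride)  # approximation, see lead-in
        rows.append((level, stride, side, anchor))
    return rows


summary = pyramid_summary(800)
for row in summary:
    print(row)
```

For an 800-pixel side, the finest level has a 200x200 grid with 32-pixel anchors, while the coarsest has a 13x13 grid with 512-pixel anchors, which matches the statement that higher levels target larger objects.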
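The overall diagram below gives a good picture of the whole structure.
The labels on the left of the diagram are layer names such as conv5_x, which correspond to the layer names in the ResNet architecture table. The left part is called the bottom-up pathway and the right part the top-down pathway. For conv2_x through conv5_x, each ResNet stage also sends a copy of its output to the right-hand pathway; these are the lateral connections. Overall, FPN can be summarized by the following formula: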
FPN = Top-down pathway + lateral connections
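A single merge step of this formula can be written out by hand. The sketch below is a simplified illustration, not torchvision's implementation: the variable names (c4, c5, lateral4, and so on) and the tensor sizes are hypothetical, and the weights are randomly initialized.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Hypothetical backbone outputs for conv4_x and conv5_x.
c4 = torch.rand(1, 1024, 14, 14)
c5 = torch.rand(1, 2048, 7, 7)

lateral4 = nn.Conv2d(1024, 256, kernel_size=1)  # lateral connection for conv4_x
lateral5 = nn.Conv2d(2048, 256, kernel_size=1)  # lateral connection for conv5_x
smooth4 = nn.Conv2d(256, 256, kernel_size=3, padding=1)  # 3x3 conv on the merged map

with torch.no_grad():
    top = lateral5(c5)  # coarsest inner map, start of the top-down pathway
    top_down = F.interpolate(top, size=c4.shape[-2:], mode="nearest")
    merged = lateral4(c4) + top_down  # lateral connection + top-down signal
    p4 = smooth4(merged)

print(p4.shape)
```

The same step repeats for each finer level, with `merged` taking the role of `top` on the next iteration.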
The next part covers another component.