[Retentive R-CNN] Generalized Few-Shot Object Detection without Forgetting (CVPR 2021)

1. Motivation

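This paper addresses the problem that FSOD models lose performance on base classes after fine-tuning. It builds Retentive R-CNN, whose novelty lies in the Bias-Balanced RPN and the Re-detector, which let the model recognize novel classes without degrading accuracy on the original base classes.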

However, the majority focus merely on the performance of few-shot categories and ignore the catastrophic forgetting of base classes, which is not realistic.

Drawbacks of meta-learning methods:

Because these methods rely on support images, the time complexity of the network grows as the number of support categories increases.

As their computational complexity is proportional to the number of categories, these methods become rather slow or even unavailable when tackling both sets of classes of a dataset

The paper also divides current mainstream FSOD methods into two groups: meta-learning based and transfer-learning based.

Meta Learning Based

FSRW, Meta R-CNN, Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector, RepMet, MetaDet

Transfer Learning Based

LSTD, TFA, MPSR, Context-Transformer

Our contributions can be concluded as follows:
- We find properties of base class detectors neglected in few-shot detection literature, which can be utilized to improve both base and novel class performance for transfer learning based methods with little overhead.
- We propose a few-shot detector without forgetting, Retentive R-CNN, with Bias-Balanced RPN and Re-detector to assist novel class adaptation with base class knowledge and ensemble base and novel class detectors.
- Our method achieves state-of-the-art overall performance on the few-shot detection benchmark[41, 17] across all settings, with leading base class metrics and competitive novel class metrics.

Building on TFA, the authors ran the following three experiments.

As Figure 2(a) shows, the L2 norms clearly separate novel classes from base classes, and novel classes that are more closely related to base classes also have higher L2 norms.

The results are shown in Figure2(a). A massive variation of norms between base classes and unseen novel classes can be easily observed.
Also, the norms of unseen classes with closer relationship with seen classes are relatively higher (blue names annotated in Figure2(a))

The answer is no, as Figure 2(c) shows:

the detector is still able to recognize it as background.

The paper argues that the RPN is not truly class-agnostic; it is biased toward the classes it was trained on.

The Re-detector has two detection heads, one for base classes and one for all classes. $det_b$ is kept fixed and uses an FC layer, while $det_n$ has fine-tuned weights and uses cosine similarity scores.

Similar to TFA, we finetune merely the last layers of classification and box regression head of $det_n$
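To make the difference between the two head types concrete, here is a minimal sketch in plain Python, not the authors' implementation. It contrasts an FC-style dot-product score with a scaled cosine similarity score. The function names, the toy vectors, and the scale factor `scale=20.0` are assumptions chosen for illustration.

```python
import math

def fc_scores(feature, weights, biases):
    """Plain FC layer: one dot-product logit per class."""
    return [sum(f * w for f, w in zip(feature, row)) + b
            for row, b in zip(weights, biases)]

def cosine_scores(feature, weights, scale=20.0, eps=1e-12):
    """Scaled cosine similarity between the feature and each class weight vector.

    `scale` is an assumed temperature, not a value taken from these notes.
    """
    f_norm = math.sqrt(sum(f * f for f in feature)) + eps
    scores = []
    for row in weights:
        w_norm = math.sqrt(sum(w * w for w in row)) + eps
        dot = sum(f * w for f, w in zip(feature, row))
        scores.append(scale * dot / (f_norm * w_norm))
    return scores

# Toy RoI feature and two class weight vectors with very different norms.
feature = [3.0, 4.0]
weights = [[1.0, 0.0], [0.0, 10.0]]

fc = fc_scores(feature, weights, [0.0, 0.0])
cos = cosine_scores(feature, weights)
```

In this toy case the FC logits differ by more than a factor of ten because the second weight vector has a much larger norm, while the cosine scores depend only on direction and stay on a comparable scale.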

The Bias-Balanced RPN adds an unfrozen objectness branch that is fine-tuned.

We try to unfreeze different layers of RPN for finetuning and empirically, unfreeze the final layer that predicts objectness is sufficient to produce a noticeable improvement.

Finally, the two objectness maps are merged by taking their element-wise maximum: $O^{H \times W} = \max(O^{H \times W}_b, O^{H \times W}_n)$.
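The element-wise max over the two objectness maps can be sketched as follows. This is a small illustration using nested lists, where `o_b` stands for the frozen base objectness map and `o_n` for the fine-tuned one; the function name and the toy values are assumptions.

```python
def merge_objectness(o_base, o_novel):
    """Element-wise max of two H x W objectness maps given as nested lists."""
    same_rows = len(o_base) == len(o_novel)
    same_cols = all(len(a) == len(b) for a, b in zip(o_base, o_novel))
    if not (same_rows and same_cols):
        raise ValueError("objectness maps must share the same H x W shape")
    return [[max(b, n) for b, n in zip(row_b, row_n)]
            for row_b, row_n in zip(o_base, o_novel)]

# Toy 2 x 2 maps: the fine-tuned branch fires where the base branch is weak.
o_b = [[0.9, 0.1], [0.2, 0.3]]
o_n = [[0.8, 0.7], [0.2, 0.6]]
merged = merge_objectness(o_b, o_n)
```

Taking the max means a location is kept as a proposal candidate if either branch considers it an object, so the base branch's responses are never suppressed by the fine-tuned one.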

The paper unfreezes three layers: the RPN objectness layer (the RPN box-regression layer stays frozen) and the classification and box-regression layers of the RoI head.

As aforementioned, we only unfreeze three layers: objectness of the finetuned RPN, the last linear layers of classification and box regression of $det_n$
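One way to picture this freezing scheme is as a filter over parameter names. The sketch below is only an illustration: every parameter name and prefix in it is hypothetical, and a real detector's names will differ. In a framework such as PyTorch, the same decision would typically be applied by setting each parameter's `requires_grad` flag.

```python
# Hypothetical parameter names for illustration only.
PARAM_NAMES = [
    "backbone.conv1.weight",
    "rpn.conv.weight",
    "rpn.objectness_base.weight",
    "rpn.objectness_finetuned.weight",
    "rpn.box_regression.weight",
    "det_b.cls_score.weight",
    "det_b.bbox_pred.weight",
    "det_n.fc1.weight",
    "det_n.cls_score.weight",
    "det_n.bbox_pred.weight",
]

# The three unfrozen layers: fine-tuned RPN objectness, plus the last
# classification and box-regression layers of det_n.
UNFROZEN_PREFIXES = (
    "rpn.objectness_finetuned.",
    "det_n.cls_score.",
    "det_n.bbox_pred.",
)

def is_trainable(name):
    """Return True if the parameter belongs to one of the unfrozen layers."""
    return name.startswith(UNFROZEN_PREFIXES)

trainable = [n for n in PARAM_NAMES if is_trainable(n)]
frozen = [n for n in PARAM_NAMES if not is_trainable(n)]
```

Everything else, including the backbone, the RPN box regression, and the whole of $det_b$, stays frozen in this sketch.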

Moreover, compared with what TFA trains in the fine-tuning stage