Less is More: Focus Attention for Efficient DETR

July 24, 2023
Authors: Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, Yunhe Wang
cs.AI

Abstract

DETR-like models have significantly boosted detector performance and even outperformed classical convolutional models. However, in the traditional encoder structure all tokens are treated equally without discrimination, which imposes a redundant computational burden. Recent sparsification strategies exploit a subset of informative tokens to reduce attention complexity while maintaining performance through a sparse encoder, but these methods tend to rely on unreliable model statistics. Moreover, simply reducing the token population hinders detection performance to a large extent, limiting the application of these sparse models. We propose Focus-DETR, which focuses attention on more informative tokens for a better trade-off between computational efficiency and model accuracy. Specifically, we reconstruct the encoder with dual attention, which includes a token scoring mechanism that considers both localization and category semantic information of the objects from multi-scale feature maps. We efficiently abandon the background queries and enhance the semantic interaction of the fine-grained object queries based on these scores. Compared with state-of-the-art sparse DETR-like detectors under the same setting, our Focus-DETR achieves comparable complexity while reaching 50.4 AP (+2.2) on COCO. The code is available at https://github.com/huawei-noah/noah-research/tree/master/Focus-DETR and https://gitee.com/mindspore/models/tree/master/research/cv/Focus-DETR.
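To make the token-selection idea concrete, below is a minimal PyTorch-style sketch of scoring tokens from flattened multi-scale features and keeping only the top-scoring (foreground) ones for the expensive attention computation. This is not the authors' implementation: the module name `TokenScorer`, the two linear heads, the product used to combine the localization and category cues, and the `keep_ratio` parameter are all illustrative assumptions; the actual Focus-DETR scoring mechanism is defined in the linked repositories.

```python
# Hypothetical sketch of foreground token scoring + selection, loosely
# following the abstract. Names and design choices here are assumptions,
# not the authors' API.
import torch
import torch.nn as nn


class TokenScorer(nn.Module):
    """Scores each token of a flattened multi-scale feature map by
    combining a localization (objectness) cue with a category-semantic
    cue, as the abstract describes."""

    def __init__(self, dim: int, num_classes: int = 91):
        super().__init__()
        self.foreground_head = nn.Linear(dim, 1)           # localization cue
        self.semantic_head = nn.Linear(dim, num_classes)   # category cue

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim), flattened multi-scale features
        fg = self.foreground_head(tokens).squeeze(-1).sigmoid()        # (B, N)
        sem = self.semantic_head(tokens).sigmoid().max(dim=-1).values  # (B, N)
        return fg * sem  # combined per-token informativeness score


def select_informative_tokens(tokens, scores, keep_ratio: float = 0.3):
    """Keep the top-scoring tokens; the remaining (background) tokens are
    dropped from the attention computation."""
    num_keep = max(1, int(tokens.shape[1] * keep_ratio))
    idx = scores.topk(num_keep, dim=1).indices  # (B, K)
    kept = torch.gather(
        tokens, 1, idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1]))
    return kept, idx


if __name__ == "__main__":
    # Usage: score 1000 flattened tokens per image and keep 30% of them.
    B, N, D = 2, 1000, 256
    feats = torch.randn(B, N, D)
    scorer = TokenScorer(D)
    scores = scorer(feats)
    kept, kept_idx = select_informative_tokens(feats, scores)
    print(kept.shape)  # torch.Size([2, 300, 256])
```

Under this reading, attention in the encoder only operates on the kept subset, which is where the computational savings come from; the scores would also serve to pick the fine-grained object queries whose semantic interaction the abstract says is enhanced.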