Less is More: Focus Attention for Efficient DETR
July 24, 2023
Authors: Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, Yunhe Wang
cs.AI
Abstract
DETR-like models have significantly boosted the performance of detectors and
even outperformed classical convolutional models. However, treating all tokens
equally without discrimination brings a redundant computational burden in the
traditional encoder structure. Recent sparsification strategies exploit a
subset of informative tokens to reduce attention complexity while maintaining
performance through a sparse encoder. But these methods tend to rely on
unreliable model statistics. Moreover, simply reducing the token population
severely hinders detection performance, limiting the
application of these sparse models. We propose Focus-DETR, which focuses
attention on more informative tokens for a better trade-off between computation
efficiency and model accuracy. Specifically, we reconstruct the encoder with
dual attention, which includes a token scoring mechanism that considers both
localization and category semantic information of the objects from multi-scale
feature maps. We efficiently abandon the background queries and enhance the
semantic interaction of the fine-grained object queries based on the scores.
Compared with the state-of-the-art sparse DETR-like detectors under the same
setting, our Focus-DETR attains comparable complexity while achieving 50.4 AP
(+2.2) on COCO. The code is available at
https://github.com/huawei-noah/noah-research/tree/master/Focus-DETR and
https://gitee.com/mindspore/models/tree/master/research/cv/Focus-DETR.
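The core idea the abstract describes, scoring every multi-scale token and keeping only the most informative ones for attention, can be illustrated with a minimal sketch. This is an assumption-laden toy, not the authors' Focus-DETR implementation: the function name `score_and_focus`, the linear scoring weights `w`, and the `keep_ratio` parameter are all hypothetical stand-ins for the paper's learned scoring mechanism.

```python
import numpy as np

def score_and_focus(tokens, w, keep_ratio=0.3):
    # tokens: (num_tokens, dim) flattened multi-scale feature tokens
    # w: (dim,) hypothetical learned scoring weights (stand-in for the
    # paper's localization/category-aware scoring head)
    scores = tokens @ w                        # one "foreground" score per token
    k = max(1, int(len(scores) * keep_ratio))  # how many tokens to keep
    keep = np.argsort(scores)[::-1][:k]        # indices of the top-k tokens
    # only these focused tokens would enter the expensive attention layers;
    # the rest (mostly background) are discarded
    return tokens[keep], keep, scores

# usage: 1000 tokens of dimension 256, keep the top 30%
rng = np.random.default_rng(0)
tokens = rng.standard_normal((1000, 256))
w = rng.standard_normal(256)
focused, keep, scores = score_and_focus(tokens, w, keep_ratio=0.3)
print(focused.shape)  # (300, 256)
```

Since self-attention cost grows quadratically with token count, keeping 30% of tokens cuts the attention FLOPs to roughly 9% of the dense baseline, which is the efficiency lever the abstract refers to.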