Task-Specific Zero-shot Quantization-Aware Training for Object Detection

Abstract

Quantization is a key technique for reducing network size and computational complexity by representing network parameters at lower precision. Traditional quantization methods rely on access to the original training data, which is often restricted due to privacy or security concerns. Zero-shot Quantization (ZSQ) addresses this by using synthetic data generated from pre-trained models, eliminating the need for real training data. Recently, ZSQ has been extended to object detection. However, existing methods use unlabeled, task-agnostic synthetic images that lack the specific information required for object detection, leading to suboptimal performance. In this paper, we propose a novel task-specific ZSQ framework for object detection networks, which consists of two main stages. First, we introduce a bounding box and category sampling strategy to synthesize a task-specific calibration set from the pre-trained network, reconstructing object locations, sizes, and category distributions without any prior knowledge. Second, we integrate task-specific training into the knowledge distillation process to restore the performance of quantized detection networks. Extensive experiments on the MS-COCO and Pascal VOC datasets demonstrate the efficiency and state-of-the-art performance of our method. Our code is publicly available at https://github.com/DFQ-Dojo/dfq-toolkit.
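To make the idea of representing parameters at lower precision concrete, the following is a minimal sketch of symmetric uniform quantization in plain Python. It is a generic textbook scheme and is not claimed to be the quantizer used in this work; the function names `quantize_symmetric` and `dequantize`, the example weights, and the 4-bit setting are assumptions chosen purely for illustration.

```python
from typing import List, Tuple


def quantize_symmetric(weights: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integer codes using one shared scale factor."""
    qmax = 2 ** (num_bits - 1) - 1  # largest code, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    # Round to the nearest code and clamp into the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, quantized to 4 bits (illustrative values only).
weights = [-0.82, -0.10, 0.0, 0.37, 1.24]
codes, scale = quantize_symmetric(weights, num_bits=4)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

In this scheme the rounding error per weight is bounded by half the scale, so fewer bits mean a coarser grid and a larger error. That accuracy loss is what calibration or quantization-aware training tries to recover.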

