PCoreSet: Effective Active Learning through Knowledge Distillation from Vision-Language Models

Abstract

Knowledge distillation (KD) is a widely used framework for training compact, task-specific models by leveraging the knowledge of teacher models. However, its application to active learning (AL), which aims to minimize annotation costs through iterative sample selection, remains underexplored. This gap arises because KD typically assumes access to sufficient labeled data, whereas AL operates in data-scarce scenarios where task-specific teacher models are often unavailable. In this paper, we introduce ActiveKD, a framework that integrates AL with KD by leveraging the zero- and few-shot capabilities of large vision-language models (VLMs). A key aspect of ActiveKD is the structured prediction bias of VLMs: their predictions form clusters in the probability space. We regard this structure as an inductive bias of the teacher model, one that captures generalizable output patterns beneficial to student learning. To exploit this bias, we propose Probabilistic CoreSet (PCoreSet), a selection strategy that maximizes coverage in the probability space rather than the feature space. PCoreSet strategically selects categorically diverse unlabeled samples, enabling more efficient transfer of teacher knowledge under limited annotation budgets. Evaluations on 11 datasets show that PCoreSet consistently outperforms existing selection methods within the ActiveKD framework, advancing research at the intersection of AL and KD.
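To make the idea of maximizing coverage in the probability space concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the selection procedure or the distance measure, so this assumes a greedy k-center rule (as in standard coreset selection) with Euclidean distance between predicted class-probability vectors. The function name `pcoreset_select` and its parameters are hypothetical.

```python
import math


def pcoreset_select(probs, labeled_idx, budget):
    """Greedy k-center selection over probability vectors (illustrative sketch).

    probs       : list of per-sample class-probability vectors (e.g. softmax outputs)
    labeled_idx : indices of samples that are already labeled (initial centers)
    budget      : number of new samples to pick for annotation

    Assumption: Euclidean distance in probability space; the source does not
    state which distance the method actually uses.
    """
    n = len(probs)
    chosen = set(labeled_idx)
    # Distance from every sample to its nearest already-chosen center.
    min_dist = [math.inf] * n
    for c in chosen:
        for i in range(n):
            min_dist[i] = min(min_dist[i], math.dist(probs[i], probs[c]))

    selected = []
    for _ in range(min(budget, n - len(chosen))):
        candidates = [i for i in range(n) if i not in chosen]
        # Pick the sample that is farthest from all current centers,
        # i.e. the least-covered region of the probability space.
        pick = max(candidates, key=lambda i: min_dist[i])
        selected.append(pick)
        chosen.add(pick)
        # The new center may now be the nearest one for some samples.
        for i in range(n):
            min_dist[i] = min(min_dist[i], math.dist(probs[i], probs[pick]))
    return selected


if __name__ == "__main__":
    # Toy predictions over three classes; sample 0 is already labeled.
    toy_probs = [
        [0.90, 0.05, 0.05],
        [0.85, 0.10, 0.05],
        [0.10, 0.80, 0.10],
        [0.05, 0.05, 0.90],
        [0.30, 0.40, 0.30],
    ]
    print(pcoreset_select(toy_probs, labeled_idx=[0], budget=2))
```

In this toy run the two picks are the samples whose predictions concentrate on the classes not yet covered by the labeled sample, which is the "categorically diverse" behavior the abstract describes. Running the same rule on feature embeddings instead of probability vectors would give the feature-space coverage that the abstract contrasts against.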
