PCoreSet: Effective Active Learning through Knowledge Distillation from Vision-Language Models

Abstract

Knowledge distillation (KD) is a widely used framework for training compact, task-specific models by leveraging the knowledge of teacher models. However, its application to active learning (AL), which aims to minimize annotation costs through iterative sample selection, remains underexplored. This gap stems from the fact that KD typically assumes access to sufficient labeled data, whereas AL operates in data-scarce scenarios where task-specific teacher models are often unavailable. In this paper, we introduce ActiveKD, a framework that integrates AL with KD by leveraging the zero- and few-shot capabilities of large vision-language models (VLMs). A key aspect of ActiveKD is the structured prediction bias of VLMs, i.e., their predictions form clusters in the probability space. We regard this structure as an inductive bias of the teacher model, capturing generalizable output patterns beneficial to student learning. To exploit this bias, we propose Probabilistic CoreSet (PCoreSet), a selection strategy that maximizes coverage in the probability space rather than the feature space. PCoreSet strategically selects categorically diverse unlabeled samples, facilitating more efficient transfer of teacher knowledge under limited annotation budgets. Evaluations on 11 datasets show that PCoreSet consistently outperforms existing selection methods within the ActiveKD framework, advancing research at the intersection of AL and KD.
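The abstract does not spell out the selection procedure, so the following is only a minimal sketch of what "maximizing coverage in the probability space" could look like. It assumes a CoreSet-style greedy k-center (farthest-first) rule applied to softmax probability vectors with Euclidean distance. The function name `pcoreset_select`, its parameters, and the choice of distance are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def pcoreset_select(probs_unlabeled, probs_labeled, budget):
    """Hypothetical sketch: greedy k-center selection over probability vectors.

    probs_unlabeled: (N, C) array of predicted class probabilities for unlabeled samples.
    probs_labeled:   (M, C) array of predicted class probabilities for labeled samples
                     (may be empty).
    budget:          number of unlabeled samples to pick.
    Returns a list of indices into probs_unlabeled.
    """
    probs_unlabeled = np.asarray(probs_unlabeled, dtype=float)
    probs_labeled = np.asarray(probs_labeled, dtype=float)

    # Distance from every unlabeled sample to its nearest labeled sample.
    if len(probs_labeled) > 0:
        diffs = probs_unlabeled[:, None, :] - probs_labeled[None, :, :]
        min_dist = np.linalg.norm(diffs, axis=2).min(axis=1)
    else:
        min_dist = np.full(len(probs_unlabeled), np.inf)

    selected = []
    for _ in range(min(budget, len(probs_unlabeled))):
        # Pick the sample that is farthest from everything covered so far.
        idx = int(np.argmax(min_dist))
        selected.append(idx)
        # Update coverage: each sample is now at most this far from a center.
        new_dist = np.linalg.norm(probs_unlabeled - probs_unlabeled[idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
        # Exclude the chosen sample from future rounds.
        min_dist[idx] = -np.inf
    return selected


# Toy example: two samples near class 0, one near class 1, one near class 2.
unlabeled = [
    [0.90, 0.05, 0.05],
    [0.88, 0.06, 0.06],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]
labeled = [[0.90, 0.05, 0.05]]  # class 0 is already covered by a labeled sample
picked = pcoreset_select(unlabeled, labeled, budget=2)
```

In this toy setup the two picks fall on the samples predicted as classes 1 and 2, because the class 0 region of the probability simplex is already covered by the labeled sample. This is the sense in which selection in probability space favors categorically diverse samples.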
