Specificity-aware reinforcement learning for fine-grained open-world classification

Abstract

Classifying fine-grained visual concepts in an open-world setting, i.e., without a predefined label set, requires models to be both accurate and specific. Recent reasoning Large Multimodal Models (LMMs) exhibit strong visual understanding but tend to produce overly generic predictions when performing fine-grained image classification. Our preliminary analysis reveals that these models do possess intrinsic fine-grained domain knowledge. However, promoting more specific predictions (specificity) without compromising correct ones (correctness) remains a non-trivial and understudied challenge. In this work, we investigate how to steer reasoning LMMs toward predictions that are both correct and specific. We propose SpeciaRL, a novel specificity-aware reinforcement learning framework for fine-tuning reasoning LMMs on fine-grained image classification in the open-world setting. SpeciaRL introduces a dynamic, verifier-based reward signal anchored to the best predictions within online rollouts, promoting specificity while respecting the model's capabilities so as to prevent incorrect predictions. Our out-of-domain experiments show that SpeciaRL delivers the best trade-off between correctness and specificity across a wide range of fine-grained benchmarks, surpassing existing methods and advancing open-world fine-grained image classification. The code and model are publicly available at https://github.com/s-angheben/SpeciaRL.
