Can OOD Object Detectors Learn from Foundation Models?

Abstract

Out-of-distribution (OOD) object detection is a challenging task due to the absence of open-set OOD data. Inspired by recent advancements in text-to-image generative models, such as Stable Diffusion, we study the potential of generative models trained on large-scale open-set data to synthesize OOD samples, thereby enhancing OOD object detection. We introduce SyncOOD, a simple data curation method that capitalizes on the capabilities of large foundation models to automatically extract meaningful OOD data from text-to-image generative models. This offers the model access to open-world knowledge encapsulated within off-the-shelf foundation models. The synthetic OOD samples are then employed to augment the training of a lightweight, plug-and-play OOD detector, thus effectively optimizing the in-distribution (ID)/OOD decision boundaries. Extensive experiments across multiple benchmarks demonstrate that SyncOOD significantly outperforms existing methods, establishing new state-of-the-art performance with minimal synthetic data usage.
