Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

Abstract

The evolution of machine learning has increasingly prioritized the development of powerful models and more scalable supervision signals. However, the emergence of foundation models makes it significantly harder to provide the effective supervision signals needed to further enhance their capabilities. Consequently, there is an urgent need to explore novel supervision signals and technical approaches. In this paper, we propose verifier engineering, a novel post-training paradigm specifically designed for the era of foundation models. The core of verifier engineering is to leverage a suite of automated verifiers that perform verification tasks and deliver meaningful feedback to foundation models. We systematically categorize the verifier engineering process into three essential stages: search, verify, and feedback, and provide a comprehensive review of state-of-the-art research developments within each stage. We believe that verifier engineering constitutes a fundamental pathway toward achieving Artificial General Intelligence.
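To make the three stages concrete, the following is a minimal toy sketch of a search, verify, and feedback loop. It is not the paper's implementation: the names `toy_model`, `is_integer`, `matches_sum`, and `run` are hypothetical, the "model" is a stand-in that just enumerates numbers, and the feedback stage is reduced to best-of-n selection purely for brevity.

```python
from typing import Callable, List

# Hypothetical verifier type: maps (prompt, candidate) to a score in [0, 1].
Verifier = Callable[[str, str], float]


def toy_model(prompt: str, i: int) -> str:
    """Stand-in for a foundation model: the i-th sample is simply str(i + 3)."""
    return str(i + 3)


def search(model: Callable[[str, int], str], prompt: str, n: int) -> List[str]:
    """Search stage: collect n candidate responses from the model."""
    return [model(prompt, i) for i in range(n)]


def is_integer(prompt: str, candidate: str) -> float:
    """Toy format verifier: 1.0 if the candidate consists only of digits."""
    return 1.0 if candidate.isdigit() else 0.0


def matches_sum(prompt: str, candidate: str) -> float:
    """Toy correctness verifier for prompts of the form 'a + b'."""
    a, b = prompt.split(" + ")
    return 1.0 if candidate.isdigit() and int(candidate) == int(a) + int(b) else 0.0


def verify(prompt: str, candidates: List[str], verifiers: List[Verifier]) -> List[float]:
    """Verify stage: average the scores of all verifiers for each candidate."""
    return [sum(v(prompt, c) for v in verifiers) / len(verifiers) for c in candidates]


def feedback(candidates: List[str], scores: List[float]) -> str:
    """Feedback stage (assumed here to be best-of-n): return the top-scoring candidate."""
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


def run(prompt: str, n: int) -> str:
    """Chain the three stages for a single prompt."""
    candidates = search(toy_model, prompt, n)
    scores = verify(prompt, candidates, [is_integer, matches_sum])
    return feedback(candidates, scores)


if __name__ == "__main__":
    print(run("2 + 3", n=4))
```

In this sketch the verifier scores are used only to pick an output at inference time; the same scores could instead serve as a training signal, which the sketch does not attempt to show.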
