Baichuan Alignment Technical Report

Abstract

We introduce Baichuan Alignment, a detailed analysis of the alignment techniques employed in the Baichuan series of models. This represents the industry's first comprehensive account of alignment methodologies, offering valuable insights for advancing AI research. We investigate the critical components that enhance model performance during the alignment process, including optimization methods, data strategies, capability enhancements, and evaluation processes. The process spans three key stages: Prompt Augmentation System (PAS), Supervised Fine-Tuning (SFT), and Preference Alignment. The problems encountered, the solutions applied, and the improvements made are documented in detail. Through comparisons on well-established benchmarks, we highlight the technological advancements enabled by Baichuan Alignment. Baichuan-Instruct is an internal model, while Qwen2-Nova-72B and Llama3-PBM-Nova-70B are instruct versions of the Qwen2-72B and Llama-3-70B base models, optimized through Baichuan Alignment. Baichuan-Instruct demonstrates significant improvements in core capabilities, with user experience gains ranging from 17% to 28%, and performs exceptionally well on specialized benchmarks. In open-source benchmark evaluations, both Qwen2-Nova-72B and Llama3-PBM-Nova-70B consistently outperform their respective official instruct versions across nearly all datasets. This report aims to clarify the key technologies behind the alignment process and to foster a deeper understanding within the community. The Llama3-PBM-Nova-70B model is available at https://huggingface.co/PKU-Baichuan-MLSystemLab/Llama3-PBM-Nova-70B.
