A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

Abstract

While data synthesis and distillation are promising strategies to enhance small language models, current approaches rely heavily on Large Language Models (LLMs), which suffer from high computational costs, environmental inefficiency, and potential biases inherited from monolithic architectures. In contrast, smaller LLMs are more accessible and sustainable, but their individual capabilities often fall short in generating high-quality, diverse, and reliable data. Inspired by collaborative human processes (e.g., peer review), we propose GRA, a framework involving multiple small LLMs that aggregates specialized roles across them to achieve the iterative refinement and quality control typically provided by a single large LLM. In this collaborative framework, multiple small LLMs assume distinct roles (Generator, Reviewer, and Adjudicator) to simulate a peer-review-inspired data synthesis pipeline. The Generator proposes initial data samples, the Reviewer critiques their quality and diversity, and the Adjudicator resolves conflicts to finalize the output. By decomposing the synthesis process into specialized sub-tasks, collaborative small LLMs can achieve data-level parity with large LLM-based distillation. Through experiments across multiple benchmarks, we demonstrate that GRA-produced data matches or exceeds the quality of outputs from a single large LLM, e.g., Qwen-2.5-72B-Instruct. Our results challenge the necessity of monolithic large models for high-quality data synthesis, advocating instead for strategic coordination of smaller agents. Our datasets, models, and code are publicly available at https://github.com/GX-XinGao/GRA.
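
The division of labor among the three roles can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the repository linked above for that). The abstract does not specify how reviews are scored or how conflicts are detected, so the numeric scores, the `accept_at` and `conflict_gap` thresholds, the median-based adjudication rule, and all function names below are assumptions made for illustration. The toy roles are plain functions standing in for small-LLM calls.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Callable, List

# Hypothetical role signatures; a real system would wrap small-LLM calls here.
GeneratorFn = Callable[[str], str]                  # seed -> candidate sample
ReviewerFn = Callable[[str], float]                 # sample -> score in [0, 1]
AdjudicatorFn = Callable[[str, List[float]], bool]  # sample, scores -> accept?


@dataclass
class Verdict:
    sample: str
    scores: List[float]
    accepted: bool
    adjudicated: bool  # True when reviewers disagreed and the adjudicator decided


def synthesize(
    seed: str,
    generator: GeneratorFn,
    reviewers: List[ReviewerFn],
    adjudicator: AdjudicatorFn,
    accept_at: float = 0.7,      # assumed acceptance threshold
    conflict_gap: float = 0.3,   # assumed score spread that counts as a conflict
) -> Verdict:
    """Run one generate -> review -> (adjudicate) pass for a single seed."""
    sample = generator(seed)
    scores = [review(sample) for review in reviewers]
    if max(scores) - min(scores) > conflict_gap:
        # Reviewers conflict: defer the final decision to the adjudicator.
        return Verdict(sample, scores, adjudicator(sample, scores), True)
    # Reviewers roughly agree: accept when the average score is high enough.
    return Verdict(sample, scores, mean(scores) >= accept_at, False)


# Toy stand-ins for the three roles (no model calls).
def toy_generator(seed: str) -> str:
    return f"Q: What is {seed}? A: {seed} is ..."


def lenient(sample: str) -> float:
    return 0.9


def strict(sample: str) -> float:
    # Penalizes very short samples as a crude proxy for low quality.
    return 0.9 if len(sample) > 40 else 0.4


def toy_adjudicator(sample: str, scores: List[float]) -> bool:
    # Assumed rule: side with the median reviewer score.
    return median(scores) >= 0.7


if __name__ == "__main__":
    for seed in ["peer review", "x"]:
        print(synthesize(seed, toy_generator, [lenient, strict], toy_adjudicator))
```

In this sketch the adjudicator is consulted only when reviewers disagree, which mirrors the "resolves conflicts" wording of the abstract. Whether the actual pipeline adjudicates every sample or only contested ones is not stated here.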
