A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
April 11, 2025
Authors: Xin Gao, Qizhi Pei, Zinan Tang, Yu Li, Honglin Lin, Jiang Wu, Conghui He, Lijun Wu
cs.AI
Abstract
While data synthesis and distillation are promising strategies to enhance
small language models, current approaches heavily rely on Large Language Models
(LLMs), which suffer from high computational costs, environmental inefficiency,
and potential biases inherited from monolithic architectures. In contrast,
smaller LLMs are more accessible and sustainable, but their individual
capabilities often fall short in generating high-quality, diverse, and reliable
data. Inspired by collaborative human processes (e.g., peer review), we propose
GRA, a framework involving multiple small LLMs that aggregates specialized
roles to achieve the iterative refinement and quality control typically
performed by a single large LLM. In this collaborative framework, multiple small
LLMs assume distinct roles (Generator, Reviewer, and Adjudicator) to simulate a
peer-review-inspired data synthesis pipeline. The Generator proposes initial
data samples, the Reviewer critiques their quality and diversity, and the
Adjudicator resolves conflicts to finalize the output. By decomposing the
synthesis process into specialized sub-tasks, collaborative small LLMs can
achieve data-level parity with large LLM-based distillation. Through
experiments across multiple benchmarks, we demonstrate that GRA-produced data
matches or exceeds the quality of output from a single large LLM such as
Qwen-2.5-72B-Instruct. Our results challenge the necessity of monolithic large
models for high-quality data synthesis, advocating instead for strategic
coordination of smaller agents. Our datasets, models, and code are publicly
available at https://github.com/GX-XinGao/GRA.
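To make the peer-review-style pipeline described above concrete, the sketch below shows one synthesis round in which a Generator proposes a sample, several Reviewers critique it, and an Adjudicator resolves disagreement. This is a minimal illustration under stated assumptions: the role names follow the abstract, but all function signatures, the scoring scheme, and the stubbed "small LLMs" are hypothetical and are not taken from the GRA codebase.

```python
# Illustrative sketch of a Generator / Reviewer / Adjudicator synthesis round.
# All model calls are stubbed; signatures and decision logic are assumptions,
# not the authors' implementation.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Review:
    reviewer: str
    score: float   # quality judgment in [0, 1]
    accept: bool   # reviewer's accept/reject vote


def synthesize_sample(
    generate: Callable[[str], str],
    reviewers: List[Callable[[str], Review]],
    adjudicate: Callable[[str, List[Review]], bool],
    seed_prompt: str,
) -> Optional[str]:
    """One round of peer-review-inspired synthesis: the Generator proposes a
    sample, Reviewers critique it, and the Adjudicator settles conflicts."""
    sample = generate(seed_prompt)
    reviews = [review(sample) for review in reviewers]

    if all(r.accept for r in reviews):
        return sample        # unanimous acceptance: keep the sample
    if not any(r.accept for r in reviews):
        return None          # unanimous rejection: discard the sample
    # Reviewers disagree, so the Adjudicator makes the final call.
    return sample if adjudicate(sample, reviews) else None


if __name__ == "__main__":
    # Stand-ins for small LLMs so the sketch runs without any model backend.
    gen = lambda prompt: f"Q: {prompt}\nA: 4"
    rev_a = lambda s: Review("llm-a", 0.9, True)
    rev_b = lambda s: Review("llm-b", 0.4, False)
    adj = lambda s, reviews: sum(r.score for r in reviews) / len(reviews) > 0.5

    print(synthesize_sample(gen, [rev_a, rev_b], adj, "What is 2 + 2?"))
```

In this toy run the two reviewers disagree, so the adjudicator averages their scores and accepts the sample; a real deployment would instead route these roles to different small instruction-tuned models and iterate over many seed prompts.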