A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
April 11, 2025
Authors: Xin Gao, Qizhi Pei, Zinan Tang, Yu Li, Honglin Lin, Jiang Wu, Conghui He, Lijun Wu
cs.AI
Abstract
While data synthesis and distillation are promising strategies to enhance
small language models, current approaches heavily rely on Large Language Models
(LLMs), which suffer from high computational costs, environmental inefficiency,
and potential biases inherited from monolithic architectures. In contrast,
smaller LLMs are more accessible and sustainable, but their individual
capabilities often fall short in generating high-quality, diverse, and reliable
data. Inspired by collaborative human processes (e.g., peer review), we propose
GRA, a framework involving multiple small LLMs that aggregates specialized
roles to achieve the iterative refinement and quality control typically
performed by a single large LLM. In this collaborative framework, multiple small
LLMs assume distinct roles (Generator, Reviewer, and Adjudicator) to simulate a
peer-review-inspired data synthesis pipeline. The Generator proposes initial
data samples, the Reviewer critiques their quality and diversity, and the
Adjudicator resolves conflicts to finalize the output. By decomposing the
synthesis process into specialized sub-tasks, collaborative small LLMs can
achieve data-level parity with large LLM-based distillation. Through
experiments across multiple benchmarks, we demonstrate that GRA-produced data
matches or exceeds the quality of output from a single large LLM such as
Qwen-2.5-72B-Instruct. Our results challenge the necessity of monolithic large
models for high-quality data synthesis, advocating instead for strategic
coordination of smaller agents. Our datasets, models, and code are publicly
available at https://github.com/GX-XinGao/GRA.
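To make the peer-review-style pipeline described above concrete, the sketch below shows one synthesis round in which a Generator proposes a sample, several Reviewers critique it, and an Adjudicator resolves disagreement. This is a minimal illustration under stated assumptions: the role names follow the abstract, but all function signatures, the scoring scheme, and the stubbed "small LLMs" are hypothetical and are not taken from the GRA codebase.

```python
# Illustrative sketch of a Generator / Reviewer / Adjudicator synthesis round.
# All model calls are stubbed; signatures and decision logic are assumptions,
# not the authors' implementation.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Review:
    reviewer: str
    score: float   # quality judgment in [0, 1]
    accept: bool   # reviewer's accept/reject vote


def synthesize_sample(
    generate: Callable[[str], str],
    reviewers: List[Callable[[str], Review]],
    adjudicate: Callable[[str, List[Review]], bool],
    seed_prompt: str,
) -> Optional[str]:
    """One round of peer-review-inspired synthesis: the Generator proposes a
    sample, Reviewers critique it, and the Adjudicator settles conflicts."""
    sample = generate(seed_prompt)
    reviews = [review(sample) for review in reviewers]

    if all(r.accept for r in reviews):
        return sample        # unanimous acceptance: keep the sample
    if not any(r.accept for r in reviews):
        return None          # unanimous rejection: discard the sample
    # Reviewers disagree, so the Adjudicator makes the final call.
    return sample if adjudicate(sample, reviews) else None


if __name__ == "__main__":
    # Stand-ins for small LLMs so the sketch runs without any model backend.
    gen = lambda prompt: f"Q: {prompt}\nA: 4"
    rev_a = lambda s: Review("llm-a", 0.9, True)
    rev_b = lambda s: Review("llm-b", 0.4, False)
    adj = lambda s, reviews: sum(r.score for r in reviews) / len(reviews) > 0.5

    print(synthesize_sample(gen, [rev_a, rev_b], adj, "What is 2 + 2?"))
```

In this toy run the two reviewers disagree, so the adjudicator averages their scores and accepts the sample; a real deployment would instead route these roles to different small instruction-tuned models and iterate over many seed prompts.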