Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key
October 14, 2024
Authors: Yingda Chen, Xingjun Wang, Jintao Huang, Yunlin Mao, Daoze Zhang, Yuze Zhao
cs.AI
Abstract
As large language models rapidly evolve to support longer context, there is a
notable disparity in their capability to generate output at greater lengths.
A recent study suggests that this imbalance may stem primarily from
the lack of long-output data during alignment training. In light of this
observation, attempts have been made to re-align foundation models with data that
fills the gap, which results in models capable of generating lengthy output when
instructed. In this paper, we explore the impact of data quality in tuning a
model for long output, and the possibility of doing so from the starting points
of human-aligned (instruct or chat) models. With careful data curation, we show
that it is possible to achieve similar performance improvement in our tuned
models, with only a small fraction of training data instances and compute. In
addition, we assess the generalizability of such approaches by applying our
tuning recipes to several models. Our findings suggest that, while capacities
for generating long output vary across different models out-of-the-box, our
approach of tuning them with high-quality data using light compute consistently
yields notable improvement across all models we experimented on. We have made
public our curated dataset for tuning long-writing capability, the
implementations of model tuning and evaluation, as well as the fine-tuned
models, all of which are openly accessible.