Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key

Abstract

As large language models rapidly evolve to support longer contexts, there is a notable disparity in their capability to generate output at greater lengths. Recent work suggests that the primary cause of this imbalance may be the lack of long-output data during alignment training. In light of this observation, attempts have been made to re-align foundation models with data that fills the gap, resulting in models capable of generating lengthy output when instructed. In this paper, we explore the impact of data quality in tuning a model for long output, and the possibility of doing so starting from human-aligned (instruct or chat) models. With careful data curation, we show that it is possible to achieve similar performance improvements in our tuned models with only a small fraction of the training data instances and compute. In addition, we assess the generalizability of such approaches by applying our tuning recipes to several models. Our findings suggest that, while the capacity for generating long output varies across models out of the box, our approach of tuning them with high-quality data using light compute consistently yields notable improvement across all the models we experimented on. We have made public our curated dataset for tuning long-writing capability, the implementations of model tuning and evaluation, and the fine-tuned models.
