Reformatted Alignment
February 19, 2024
Authors: Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, Pengfei Liu
cs.AI
Abstract
The quality of finetuning data is crucial for aligning large language models
(LLMs) with human values. Current methods to improve data quality are either
labor-intensive or prone to factual errors caused by LLM hallucinations. This
paper explores elevating the quality of existing instruction data to better
align with human values, introducing a simple and effective approach named
ReAlign, which reformats the responses of instruction data into a format that
better aligns with pre-established criteria and the collated evidence. This
approach minimizes human annotation, hallucination, and the difficulty in
scaling, remaining orthogonal to existing alignment techniques. Experimentally,
ReAlign significantly boosts the general alignment ability, math reasoning,
factuality, and readability of the LLMs.
Encouragingly, without introducing any additional data or advanced training
techniques, and merely by reformatting the response, LLaMA-2-13B's mathematical
reasoning ability on GSM8K can be improved from 46.77% to 56.63% in accuracy.
Additionally, a mere 5% of ReAlign data yields a 67% boost in general alignment
ability measured by the Alpaca dataset. This work highlights the need for
further research into the science and mechanistic interpretability of LLMs. We
have made the associated code and data publicly accessible to support future
studies at https://github.com/GAIR-NLP/ReAlign.
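
To put the GSM8K figures quoted above in perspective, the short sketch below computes the absolute and relative gains implied by the two accuracies. The function name `gains` is illustrative only and is not taken from the ReAlign code base; the two input numbers are the ones stated in the abstract.

```python
def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = after - before
    relative = absolute / before * 100.0
    return absolute, relative


# GSM8K accuracies for LLaMA-2-13B as reported in the abstract.
absolute, relative = gains(46.77, 56.63)
print(f"absolute gain: {absolute:.2f} points")  # 9.86 points
print(f"relative gain: {relative:.1f}%")        # 21.1%
```

The abstract does not say whether the 67% boost in general alignment ability is an absolute or a relative figure, so the same calculation is not applied to it here.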