

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

February 12, 2024
Authors: Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao
cs.AI

Abstract

Self-alignment is an effective way to reduce the cost of human annotation while ensuring promising model capability. However, most current methods complete the data collection and training steps in a single round, which may overlook the continuously improving ability of self-aligned models. This gives rise to a key question: What if we perform multiple rounds of bootstrapping self-alignment? Does this strategy enhance model performance or lead to rapid degradation? In this paper, our pioneering exploration delves into the impact of bootstrapping self-alignment on large language models. Our findings reveal that bootstrapping self-alignment markedly surpasses the single-round approach by guaranteeing data diversity through in-context learning. To further exploit the capabilities of bootstrapping, we investigate and adjust the training order of the data, which yields improved model performance. Drawing on these findings, we propose Step-On-Feet Tuning (SOFT), which leverages the model's continuously enhanced few-shot ability to boost zero-shot and one-shot performance. Building on an easy-to-hard training recipe, we propose SOFT+, which further boosts self-alignment performance. Our experiments demonstrate the efficiency of SOFT (and SOFT+) across various classification and generation tasks, highlighting the potential of bootstrapping self-alignment to continually enhance model alignment performance.
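The loop below is a minimal sketch of the multi-round bootstrapping procedure the abstract describes, not the authors' released implementation. The helper callables (`generate`, `difficulty`, `finetune`) and the exemplar-pool bookkeeping are hypothetical placeholders standing in for few-shot in-context labeling, easy-to-hard ordering (the SOFT+ ingredient), and one round of fine-tuning.

```python
import random
from typing import Callable, List, Optional, Tuple

Example = Tuple[str, str]  # (prompt, response) pair

def bootstrap_self_alignment(
    model,
    prompts: List[str],
    generate: Callable[[object, str, List[Example]], str],   # hypothetical: few-shot ICL labeling
    difficulty: Callable[[object, str, str], float],          # hypothetical: easy-to-hard scoring
    finetune: Callable[[object, List[Example]], object],      # hypothetical: one SFT round
    num_rounds: int = 3,
    exemplar_pool: Optional[List[Example]] = None,
    k_shots: int = 5,
):
    """Sketch of multi-round bootstrapping self-alignment (SOFT-style)."""
    exemplar_pool = list(exemplar_pool or [])
    for _ in range(num_rounds):
        # 1. Label prompts with the *current* model via few-shot in-context learning;
        #    resampling exemplars each time helps keep the synthetic data diverse.
        dataset: List[Example] = []
        for prompt in prompts:
            shots = random.sample(exemplar_pool, min(k_shots, len(exemplar_pool)))
            dataset.append((prompt, generate(model, prompt, shots)))

        # 2. SOFT+ ingredient: order the self-generated data from easy to hard.
        dataset.sort(key=lambda ex: difficulty(model, *ex))

        # 3. Fine-tune on the ordered data; the improved model seeds the next round.
        model = finetune(model, dataset)

        # 4. Fold the newest self-aligned outputs back into the exemplar pool.
        exemplar_pool.extend(dataset)
    return model
```

In this sketch, each round reuses the just-updated model to label the next batch, which is the "step on your own feet" intuition: the few-shot ability improved in round t supplies better supervision for round t+1.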