
Exploring Format Consistency for Instruction Tuning

July 28, 2023
Authors: Shihao Liang, Kunlun Zhu, Runchu Tian, Yujia Qin, Huadong Wang, Xin Cong, Zhiyuan Liu, Xiaojiang Liu, Maosong Sun
cs.AI

Abstract

Instruction tuning has emerged as a promising approach to enhancing large language models in following human instructions. It has been shown that increasing the diversity and number of instructions in the training data can consistently enhance generalization performance, which has motivated a recent endeavor to collect various instructions and integrate existing instruction tuning datasets into larger collections. However, different users have their unique ways of expressing instructions, and instruction styles and formats often vary across datasets, i.e., format inconsistency. In this work, we study how format inconsistency may impact the performance of instruction tuning. We propose a framework called "Unified Instruction Tuning" (UIT), which calls OpenAI APIs for automatic format transfer among different instruction tuning datasets. We show that UIT successfully improves the generalization performance on unseen instructions, which highlights the importance of format consistency for instruction tuning. To make the UIT framework more practical, we further propose a novel perplexity-based denoising method to reduce the noise introduced by automatic format transfer. We also train a smaller offline model with format transfer capability comparable to the OpenAI APIs, reducing costs in practice.
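
To make the format-transfer idea concrete, here is a minimal sketch of what calling an OpenAI API to rewrite an instruction into a unified target format could look like, using the openai>=1.0 Python client. The prompt wording, the `TARGET_STYLE` template, and the `transfer_format` helper are illustrative assumptions; the abstract only states that OpenAI APIs are used, not the authors' actual prompts.

```python
# A hypothetical sketch of LLM-based instruction format transfer.
# Assumes the openai>=1.0 Python client; prompt and target format are
# illustrative, not the paper's actual implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Example unified format, loosely in the style of task-definition datasets.
TARGET_STYLE = (
    "Definition: <one-sentence task definition>\n"
    "Positive Example -- input: <input> output: <output>"
)

def transfer_format(instruction: str, model: str = "gpt-3.5-turbo") -> str:
    """Rewrite one instruction into the unified target format."""
    prompt = (
        "Rewrite the following task instruction into this format, "
        f"keeping its meaning unchanged:\n{TARGET_STYLE}\n\n"
        f"Instruction: {instruction}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,  # deterministic rewrites for reproducibility
    )
    return response.choices[0].message.content.strip()
```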
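
The perplexity-based denoising step can likewise be sketched: score each automatically transferred instruction with a small causal language model and discard the highest-perplexity (likely garbled) rewrites. The choice of GPT-2 as the scoring model and the `keep_ratio` threshold below are assumptions for illustration; the abstract does not specify them.

```python
# A sketch of perplexity-based denoising for transferred instructions.
# Scoring model (GPT-2) and keep ratio are assumed, not from the paper.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    """Perplexity of `text` under the scoring LM (lower = more fluent)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean token negative log-likelihood
    return torch.exp(loss).item()

def denoise(candidates: list[str], keep_ratio: float = 0.8) -> list[str]:
    """Keep the fraction of rewrites with the lowest perplexity."""
    ranked = sorted(candidates, key=perplexity)
    return ranked[: max(1, int(len(ranked) * keep_ratio))]
```

The intuition is that a botched automatic rewrite tends to be less fluent than a faithful one, so a language model assigns it higher perplexity and a simple rank-and-cut filter removes much of the transfer noise.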