Instruction-following Evaluation through Verbalizer Manipulation

Abstract

While instruction-tuned models have shown remarkable success in various natural language processing tasks, accurately evaluating their ability to follow instructions remains challenging. Existing benchmarks primarily focus on common instructions that align well with what the model learned during training. However, proficiency in responding to these instructions does not necessarily imply strong instruction-following ability. In this paper, we propose a novel instruction-following evaluation protocol called verbalizer manipulation. It instructs the model to verbalize the task label with words that align with model priors to different extents, ranging from highly aligned (e.g., outputting "positive" for positive sentiment) to minimally aligned (e.g., outputting "negative" for positive sentiment). Verbalizer manipulation can be seamlessly integrated with any classification benchmark to examine the model's reliance on priors and its ability to override them to accurately follow the instructions. We conduct a comprehensive evaluation of four major model families across nine datasets, employing twelve sets of verbalizers for each of them. We observe that the instruction-following abilities of models, across different families and scales, are significantly distinguished by their performance on less natural verbalizers. Even the strongest GPT-4 model struggles to perform better than random guessing on the most challenging verbalizer, emphasizing the need for continued advancements to improve their instruction-following abilities.
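To make the protocol concrete, the following is a minimal sketch of how verbalizer manipulation might be applied to a binary sentiment task. It is an illustration under stated assumptions, not the paper's implementation: the prompt wording, the function names, the toy model, and the intermediate "neutral" verbalizer set ("foo"/"bar") are all hypothetical. Only the two extreme mappings (outputting "positive" versus "negative" for positive sentiment) come from the abstract.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical verbalizer sets for binary sentiment classification.
# "natural" and "flipped" mirror the two examples given in the abstract;
# "neutral" is an assumed intermediate level added only for illustration.
VERBALIZERS: Dict[str, Dict[str, str]] = {
    "natural": {"positive": "positive", "negative": "negative"},
    "neutral": {"positive": "foo", "negative": "bar"},
    "flipped": {"positive": "negative", "negative": "positive"},
}


def build_prompt(text: str, verbalizer: Dict[str, str]) -> str:
    """Build an instruction telling the model which word to emit per label.

    The prompt wording is an assumption, not the paper's template.
    """
    rules = " ".join(
        f'If the sentiment is {label}, answer "{word}".'
        for label, word in verbalizer.items()
    )
    return f"{rules}\nReview: {text}\nAnswer:"


def accuracy(
    model: Callable[[str], str],
    examples: List[Tuple[str, str]],
    verbalizer: Dict[str, str],
) -> float:
    """Fraction of examples where the reply equals the instructed word."""
    correct = 0
    for text, gold_label in examples:
        reply = model(build_prompt(text, verbalizer)).strip().lower()
        correct += reply == verbalizer[gold_label]
    return correct / len(examples)


def prior_only_model(prompt: str) -> str:
    """Toy stand-in for a model that ignores the instruction entirely
    and answers from its prior about the review text."""
    review = prompt.split("Review:", 1)[1]
    return "positive" if "great" in review else "negative"


# Tiny made-up dataset of (text, gold label) pairs, for illustration only.
EXAMPLES: List[Tuple[str, str]] = [
    ("A great film.", "positive"),
    ("Dull and slow.", "negative"),
]

if __name__ == "__main__":
    for name, verbalizer in VERBALIZERS.items():
        print(name, accuracy(prior_only_model, EXAMPLES, verbalizer))
```

In this sketch, a model that answers purely from its priors scores perfectly under the natural verbalizer and fails under the flipped one. That gap is the kind of signal the protocol is designed to expose.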
