Symbol tuning improves in-context learning in language models
May 15, 2023
Authors: Jerry Wei, Le Hou, Andrew Lampinen, Xiangning Chen, Da Huang, Yi Tay, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma, Quoc V. Le
cs.AI
Abstract
We present symbol tuning - finetuning language models on in-context
input-label pairs where natural language labels (e.g., "positive/negative
sentiment") are replaced with arbitrary symbols (e.g., "foo/bar"). Symbol
tuning leverages the intuition that when a model cannot use instructions or
natural language labels to figure out a task, it must instead do so by learning
the input-label mappings.
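To make the transformation concrete, here is a minimal Python sketch of the label remapping, assuming a simple "Input:/Output:" prompt format; the SYMBOLS pool and the symbol_tune_examples helper are illustrative names for this sketch, and the paper's actual symbol set and prompt formatting may differ.

```python
import random

# Hypothetical pool of arbitrary symbols to substitute for natural labels.
SYMBOLS = ["foo", "bar", "baz", "qux"]

def symbol_tune_examples(examples, labels):
    """Replace natural language labels with randomly chosen arbitrary symbols.

    examples: list of (input_text, label) pairs for one task.
    labels:   the task's natural language label set, e.g. ["positive", "negative"].
    """
    # Draw one distinct symbol per natural label for this task instance.
    mapping = dict(zip(labels, random.sample(SYMBOLS, len(labels))))
    # Rewrite each in-context example so the model can only infer the task
    # from the input-label mapping, not from the label names themselves.
    return [(text, mapping[label]) for text, label in examples]

examples = [
    ("This movie was fantastic!", "positive"),
    ("I fell asleep halfway through.", "negative"),
]
remapped = symbol_tune_examples(examples, ["positive", "negative"])
prompt = "\n".join(f"Input: {text}\nOutput: {label}" for text, label in remapped)
print(prompt)
```

Because the symbols carry no semantic hint about the task, a model finetuned on such prompts is pushed to infer the task from the in-context input-label mappings alone.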
We experiment with symbol tuning across Flan-PaLM models up to 540B
parameters and observe benefits across various settings. First, symbol tuning
boosts performance on unseen in-context learning tasks and is much more robust
to underspecified prompts, such as those without instructions or without
natural language labels. Second, symbol-tuned models are much stronger at
algorithmic reasoning tasks, with up to 18.2% better performance on the List
Functions benchmark and up to 15.3% better performance on the Simple Turing
Concepts benchmark. Finally, symbol-tuned models show large improvements in
following flipped labels presented in-context, meaning that they are more
capable of using in-context information to override prior semantic knowledge.