Symbol tuning improves in-context learning in language models

Abstract

We present symbol tuning: finetuning language models on in-context input-label pairs in which natural language labels (e.g., "positive/negative sentiment") are replaced with arbitrary symbols (e.g., "foo/bar"). Symbol tuning leverages the intuition that when a model cannot use instructions or natural language labels to figure out a task, it must instead do so by learning the input-label mappings. We experiment with symbol tuning across Flan-PaLM models of up to 540B parameters and observe benefits in several settings. First, symbol tuning boosts performance on unseen in-context learning tasks and is much more robust to underspecified prompts, such as those without instructions or without natural language labels. Second, symbol-tuned models are much stronger at algorithmic reasoning tasks, with up to 18.2% better performance on the List Functions benchmark and up to 15.3% better performance on the Simple Turing Concepts benchmark. Finally, symbol-tuned models show large improvements in following flipped labels presented in-context, meaning that they are more capable of using in-context information to override prior semantic knowledge.
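To make the label-replacement step concrete, here is a minimal sketch of how one symbol-tuning style prompt could be assembled. The helper name `build_symbol_prompt`, the "Input:/Output:" template, and the symbol pool are assumptions made for illustration; the abstract does not specify the actual prompt format or data pipeline.

```python
import random


def build_symbol_prompt(exemplars, query, symbols, seed=0):
    """Format in-context exemplars with each natural language label
    swapped for an arbitrary symbol (hypothetical helper, not from the paper).

    exemplars: list of (input_text, label) pairs
    query:     input text whose label the model should predict
    symbols:   pool of arbitrary strings such as "foo" or "bar"
    """
    rng = random.Random(seed)
    labels = sorted({label for _, label in exemplars})
    if len(symbols) < len(labels):
        raise ValueError("need at least one symbol per distinct label")

    # Assign a distinct arbitrary symbol to every original label.
    mapping = dict(zip(labels, rng.sample(symbols, len(labels))))

    # No instruction is included: the task is conveyed only through
    # the input-to-symbol pairs shown in context.
    blocks = [f"Input: {text}\nOutput: {mapping[label]}" for text, label in exemplars]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks), mapping


exemplars = [
    ("A wonderful film.", "positive"),
    ("Dull and slow.", "negative"),
    ("I loved it.", "positive"),
]
prompt, mapping = build_symbol_prompt(exemplars, "Not worth watching.", ["foo", "bar", "baz"])
print(prompt)
```

Because the prompt carries neither an instruction nor a meaningful label, the only way to answer the final query is to infer the input-label mapping from the exemplars, which is the intuition the abstract describes.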
