
Attention Heads of Large Language Models: A Survey

September 5, 2024
Authors: Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Bo Tang, Feiyu Xiong, Zhiyu Li
cs.AI

Abstract

Since the advent of ChatGPT, Large Language Models (LLMs) have excelled in various tasks but remain largely black-box systems. Consequently, their development relies heavily on data-driven approaches, which limits performance improvements achievable through changes to internal architecture and reasoning pathways. As a result, many researchers have begun exploring the internal mechanisms of LLMs, aiming to identify the essence of their reasoning bottlenecks, with most studies focusing on attention heads. Our survey aims to shed light on the internal reasoning processes of LLMs by concentrating on the interpretability and underlying mechanisms of attention heads. We first distill the human thought process into a four-stage framework: Knowledge Recalling, In-Context Identification, Latent Reasoning, and Expression Preparation. Using this framework, we systematically review existing research to identify and categorize the functions of specific attention heads. We then summarize the experimental methodologies used to discover these special heads, dividing them into two categories: Modeling-Free methods and Modeling-Required methods, and we outline the relevant evaluation methods and benchmarks. Finally, we discuss the limitations of current research and propose several potential future directions. Our reference list is open-sourced at https://github.com/IAAR-Shanghai/Awesome-Attention-Heads.
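As background (not part of the survey itself), an "attention head" is one of the $h$ parallel attention computations inside each Transformer layer; the standard formulation from Vaswani et al. (2017) is:

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
\]
\[
\mathrm{head}_i = \mathrm{Attention}\!\left(QW_i^{Q},\, KW_i^{K},\, VW_i^{V}\right), \qquad
\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O}
\]

The work surveyed here asks what functional role an individual $\mathrm{head}_i$ plays within this computation, e.g., recalling knowledge or identifying in-context patterns, in terms of the four-stage framework described above.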