Uncovering Algorithmic Deductive Circuits for Logical Reasoning

Abstract

Recent studies have shown that large language models (LLMs) can achieve strong reasoning performance in few-shot settings when prompts incorporate functional symbolic representations that abstractly describe graph traversal algorithms and step-by-step reasoning. However, it remains unclear how LLMs come to understand the abstract meaning of each reasoning step, and of the overall algorithm, from only a limited number of demonstrations. This work aims to localize the attention heads responsible for individual reasoning steps and to characterize the types of information transferred among them. We first align the constituent reasoning steps with their corresponding token logits under a symbolic-aided Chain-of-Thought (CoT) prompting framework. Our analysis shows that the token positions that steer the reasoning process are associated with low confidence scores, which arise from the constraint of following the reasoning behavior patterns shown in the demonstrations. We then apply causal mediation analysis to identify the attention heads responsible for these patterns. Our findings further indicate that LLMs retrieve factual and rule-based information for individual sub-reasoning tasks through specialized attention heads (approximately 3% of all heads), whereas higher layers predominantly facilitate information integration and the emergence of global reasoning strategies (e.g., graph traversal algorithms) that coordinate multiple intermediate reasoning steps to solve the overall task.
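The two analysis steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the confidence score is the maximum softmax probability at each token position, and it stands in for a transformer with a toy linear model in which each "head" is a single weight matrix and head outputs are summed before unembedding. All function names and inputs below are hypothetical. The indirect effect of a head is estimated by patching its clean-run output into a corrupted run and measuring how much the logit difference between the correct and wrong answer tokens changes.

```python
import numpy as np

def token_confidence(logits):
    """Max softmax probability per position; logits has shape (positions, vocab).
    Assumption: this is only one plausible definition of a confidence score."""
    z = logits - logits.max(axis=-1, keepdims=True)  # stabilize the exponentials
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)

def head_outputs(head_weights, x):
    """Output of each toy head: one linear map per head applied to the input."""
    return [W @ x for W in head_weights]

def answer_logit_diff(head_outs, unembed, correct, wrong):
    """Correct-token logit minus wrong-token logit, from the summed head outputs."""
    logits = unembed @ np.sum(head_outs, axis=0)
    return logits[correct] - logits[wrong]

def head_indirect_effects(head_weights, unembed, x_clean, x_corrupt, correct, wrong):
    """Patch each head's clean output into the corrupted run, one head at a time,
    and return the resulting change in the answer logit difference per head."""
    clean = head_outputs(head_weights, x_clean)
    corrupt = head_outputs(head_weights, x_corrupt)
    baseline = answer_logit_diff(corrupt, unembed, correct, wrong)
    effects = []
    for h in range(len(head_weights)):
        patched = list(corrupt)      # copy so each head is patched in isolation
        patched[h] = clean[h]
        effects.append(answer_logit_diff(patched, unembed, correct, wrong) - baseline)
    return np.array(effects)

# Toy setup (hypothetical): 3 heads, hidden size 4, vocabulary of 3 tokens.
W0 = np.zeros((4, 4)); W0[2, :] = 1.0   # writes to a direction unrelated to the answer
W1 = np.eye(4)                          # copies the input, so it carries the answer
W2 = np.zeros((4, 4)); W2[3, 3] = 0.5   # writes to a direction the unembedding ignores
heads = [W0, W1, W2]
unembed = np.eye(3, 4)                  # token i reads hidden dimension i
x_clean = np.array([1.0, 0.0, 0.0, 0.0])
x_corrupt = np.array([0.0, 1.0, 0.0, 0.0])

effects = head_indirect_effects(heads, unembed, x_clean, x_corrupt, correct=0, wrong=1)
print("indirect effect per head:", effects)
print("most responsible head:", int(np.argmax(effects)))
```

In a real model the patched quantity would be an attention head's output at a chosen layer and token position, and heads would be ranked by this effect averaged over many prompts. The toy model only shows the mechanics of the intervention.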