The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think
May 15, 2025
Authors: Seongyun Lee, Seungone Kim, Minju Seo, Yongrae Jo, Dongyoung Go, Hyeonbin Hwang, Jinho Park, Xiang Yue, Sean Welleck, Graham Neubig, Moontae Lee, Minjoon Seo
cs.AI
Abstract
Long chain-of-thought (CoT) is an essential ingredient in effective usage of
modern large language models, but our understanding of the reasoning strategies
underlying these capabilities remains limited. While some prior works have
attempted to categorize CoTs using predefined strategy types, such approaches
are constrained by human intuition and fail to capture the full diversity of
model behaviors. In this work, we introduce the CoT Encyclopedia, a bottom-up
framework for analyzing and steering model reasoning. Our method automatically
extracts diverse reasoning criteria from model-generated CoTs, embeds them into
a semantic space, clusters them into representative categories, and derives
contrastive rubrics to interpret reasoning behavior. Human evaluations show
that this framework produces more interpretable and comprehensive analyses than
existing methods. Moreover, we demonstrate that this understanding enables
performance gains: we can predict which strategy a model is likely to use and
guide it toward more effective alternatives. Finally, we provide practical
insights, such as that training data format (e.g., free-form vs.
multiple-choice) has a far greater impact on reasoning behavior than data
domain, underscoring the importance of format-aware model design.
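The core of the pipeline the abstract describes (extract reasoning criteria from CoTs, embed them in a semantic space, cluster them into representative categories) can be illustrated with a minimal, dependency-free sketch. This is not the paper's implementation: bag-of-words vectors and a tiny two-cluster k-means stand in for the learned semantic embeddings and clustering method, and the criterion strings are hypothetical examples.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Bag-of-words vector over a shared vocabulary, a crude stand-in
    for a learned semantic embedding of a reasoning criterion."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster(vectors, iters=20):
    """Tiny two-cluster k-means: seed centers with the most dissimilar
    pair of vectors, then alternate assignment and mean updates."""
    # Farthest-pair initialization keeps the sketch deterministic.
    best, (i0, j0) = 2.0, (0, 1)
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            s = cosine(vectors[i], vectors[j])
            if s < best:
                best, (i0, j0) = s, (i, j)
    centers = [list(vectors[i0]), list(vectors[j0])]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its most similar center.
        labels = [max(range(2), key=lambda c: cosine(v, centers[c]))
                  for v in vectors]
        # Recompute each center as the mean of its members.
        for c in range(2):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels

# Hypothetical reasoning criteria extracted from model-generated CoTs.
criteria = [
    "breaks the problem into smaller subproblems",
    "decomposes the task into sequential subproblems",
    "verifies each intermediate step before continuing",
    "double checks intermediate results for errors",
]
vocab = sorted({w for c in criteria for w in c.lower().split()})
vectors = [embed(c, vocab) for c in criteria]
labels = cluster(vectors)
print(labels)  # → [0, 0, 1, 1]
```

The two decomposition-style criteria land in one cluster and the two verification-style criteria in the other; in the paper, each such cluster would then be summarized into a contrastive rubric for interpreting and steering reasoning behavior.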