When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
October 8, 2025
Authors: Soyeong Jeong, Taehee Jung, Sung Ju Hwang, Joo-Kyung Kim, Dongyeop Kang
cs.AI
Abstract
Recent Long-Context Language Models (LCLMs) can process hundreds of thousands
of tokens in a single prompt, enabling new opportunities for
knowledge-intensive multi-hop reasoning by integrating large sets of retrieved
documents or, in some cases, all of the necessary information directly. However,
simply feeding more documents into the context window fails to capture how
evidence should be connected. We address this gap with thought templates, which
recast reasoning as reusable thought caches: derived from prior problem-solving
traces, they structure how evidence is combined and guide multi-hop inference
with factual documents. To keep these templates effective, we propose an update
strategy that iteratively refines templates derived from training data through
natural-language feedback. Across diverse benchmarks and LCLM families, our
approach delivers consistent gains over strong baselines in both
retrieval-based and retrieval-free settings. Furthermore, we show that
optimized templates can be distilled into smaller open-source models,
demonstrating their broad applicability and transparent reasoning reuse. We refer
to our framework as Thought Template Augmented LCLMs (ToTAL).
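The two mechanisms the abstract describes, prepending reusable thought templates to the retrieved documents and refining those templates with natural-language feedback, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`build_prompt`, `refine_templates`), the plain-text prompt layout, and the `feedback_fn` critic callable are all hypothetical stand-ins for an LLM-driven pipeline.

```python
def build_prompt(question, documents, templates):
    """Prepend reusable thought templates (cached reasoning patterns
    from prior problem-solving traces) to the retrieved documents and
    the question, so the LCLM sees how evidence should be connected."""
    parts = ["# Thought templates (how to combine evidence):"]
    parts += [f"- {t}" for t in templates]
    parts.append("# Retrieved documents:")
    parts += [f"[{i}] {d}" for i, d in enumerate(documents)]
    parts.append(f"# Question: {question}")
    return "\n".join(parts)


def refine_templates(templates, feedback_fn, max_rounds=3):
    """Iteratively revise each template using natural-language feedback.
    `feedback_fn` stands in for a critic model: it returns a critique
    string for a weak template, or None when no revision is needed."""
    for _ in range(max_rounds):
        updated, changed = [], False
        for t in templates:
            critique = feedback_fn(t)
            if critique is None:
                updated.append(t)
            else:
                # In the real pipeline an LLM would rewrite the template;
                # here we just append the critique to mark the revision.
                updated.append(f"{t} (revised: {critique})")
                changed = True
        templates = updated
        if not changed:  # converged: no template drew further feedback
            break
    return templates
```

A toy run: a critic that asks templates to cite supporting documents revises the template once, then the loop converges, and the refined template is injected ahead of the evidence in the prompt.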