When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
October 8, 2025
作者: Soyeong Jeong, Taehee Jung, Sung Ju Hwang, Joo-Kyung Kim, Dongyeop Kang
cs.AI
Abstract
Recent Long-Context Language Models (LCLMs) can process hundreds of thousands of tokens in a single prompt, enabling new opportunities for knowledge-intensive multi-hop reasoning by integrating large sets of retrieved documents or, in some cases, all necessary information directly. However, simply feeding more documents into the context window fails to capture how evidence should be connected. We address this gap with thought templates, which recast reasoning as reusable thought caches, derived from prior problem-solving traces, that structure how evidence is combined and guide multi-hop inference over factual documents. To keep these templates effective, we propose an update strategy that iteratively refines templates derived from training data through natural-language feedback. Across diverse benchmarks and LCLM families, our approach delivers consistent gains over strong baselines in both retrieval-based and retrieval-free settings. Furthermore, we show that optimized templates can be distilled into smaller open-source models, demonstrating their broad applicability and transparent reasoning reuse. We refer to our framework as Thought Template Augmented LCLMs (ToTAL).
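The high-level idea can be sketched as follows. This is a minimal, hypothetical illustration of template-augmented prompting, not the paper's implementation: the function names, prompt layout, and the feedback-driven refinement step are all assumptions, and the actual LCLM call is omitted.

```python
# Illustrative sketch of template-augmented long-context prompting:
# reusable "thought templates" are prepended to retrieved documents
# before the question is posed to the model. All names are hypothetical.

def build_prompt(templates, documents, question):
    """Compose a long-context prompt: thought templates first, then evidence."""
    template_block = "\n".join(
        f"[Template {i + 1}] {t}" for i, t in enumerate(templates)
    )
    doc_block = "\n".join(
        f"[Doc {i + 1}] {d}" for i, d in enumerate(documents)
    )
    return f"{template_block}\n\n{doc_block}\n\nQuestion: {question}\nAnswer:"


def refine_templates(templates, feedback):
    """Stand-in for the update strategy: revise each template that
    received natural-language feedback, keep the rest unchanged."""
    revised = []
    for t in templates:
        note = feedback.get(t)
        revised.append(f"{t} (revised: {note})" if note else t)
    return revised


templates = [
    "Identify the bridge entity that links the two hops.",
    "Combine per-document facts before producing the final answer.",
]
docs = [
    "Paris is the capital of France.",
    "France borders Spain.",
]
prompt = build_prompt(
    templates, docs,
    "Which country bordering Spain has Paris as its capital?",
)
```

In this sketch, refinement would be driven by feedback collected on training examples (e.g. `{templates[0]: "name the entity explicitly"}`), with the revised templates reused across questions.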