

Thinking Makes LLM Agents Introverted: How Mandatory Thinking Can Backfire in User-Engaged Agents

February 8, 2026
Authors: Jiatong Li, Changdae Oh, Hyeong Kyu Choi, Jindong Wang, Sharon Li
cs.AI

Abstract

Eliciting reasoning has emerged as a powerful technique for improving the performance of large language models (LLMs) on complex tasks by inducing thinking. However, its effectiveness in realistic user-engaged agent scenarios remains unclear. In this paper, we conduct a comprehensive study on the effect of explicit thinking in user-engaged LLM agents. Our experiments span seven models, three benchmarks, and two thinking instantiations, and we evaluate them through both a quantitative response taxonomy analysis and qualitative failure propagation case studies. Contrary to expectations, we find that mandatory thinking often backfires on agents in user-engaged settings, causing anomalous performance degradation across various LLMs. Our key finding reveals that thinking makes agents more "introverted" by shortening responses and reducing information disclosure to users, which weakens agent-user information exchange and leads to downstream task failures. Furthermore, we demonstrate that explicitly prompting for information disclosure reliably improves performance across diverse model families, suggesting that proactive transparency is a vital lever for agent optimization. Overall, our study suggests that information transparency awareness is a crucial yet underexplored perspective for the future design of reasoning agents in real-world scenarios. Our code is available at https://github.com/deeplearning-wisc/Thinking-Agent.
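
As a rough illustration of the intervention described above, the sketch below contrasts a mandatory-thinking system prompt with a variant that adds an explicit information-disclosure instruction. The prompt wording, the `<think>` tag convention, and the `build_messages` helper are assumptions made for illustration only, not the authors' actual prompts or implementation (see the linked repository for those).

```python
# Hypothetical sketch (not the authors' prompts): comparing a mandatory-thinking
# agent prompt with and without an explicit information-disclosure addendum.

BASE_SYSTEM_PROMPT = (
    "You are an assistant agent that must complete the user's task. "
    "Before each reply, reason step by step inside <think>...</think> tags, "
    "then give your answer to the user."
)

# Assumed "proactive transparency" addendum, paraphrasing the paper's idea of
# explicitly prompting the agent to disclose information to the user.
DISCLOSURE_ADDENDUM = (
    " In your visible reply, explicitly share the key facts, assumptions, and "
    "intermediate results you relied on, and ask the user for any missing information."
)


def build_messages(user_turn: str, with_disclosure: bool) -> list[dict]:
    """Assemble a chat-completions-style message list for one agent turn."""
    system = BASE_SYSTEM_PROMPT + (DISCLOSURE_ADDENDUM if with_disclosure else "")
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_turn},
    ]


if __name__ == "__main__":
    turn = "Help me book a table for four on Friday."
    for flag in (False, True):
        msgs = build_messages(turn, with_disclosure=flag)
        print(f"--- with_disclosure={flag} ---")
        print(msgs[0]["content"])
```

In an actual evaluation, the two prompt variants would be run against the same user-engaged benchmark episodes and compared on task success, which is the kind of ablation the abstract's disclosure finding implies.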