Agents Explore but Agents Ignore: LLMs Lack Environmental Curiosity
April 19, 2026
Authors: Leon Engländer, Sophia Althammer, Ahmet Üstün, Matthias Gallé, Tom Sherborne
cs.AI
Abstract
LLM-based agents are assumed to integrate environmental observations into their reasoning: discovering highly relevant but unexpected information should naturally lead to a model exploiting its own discoveries. We show that this assumption is false for current LLM-based agents, which struggle to reflect on or react to unexpected information. Across three benchmarks (Terminal-Bench, SWE-Bench, AppWorld), we inject complete task solutions into the agent environments to deliberately expose a task's solution to a model. While agents discover these solutions on Terminal-Bench in 79-81% of runs, they interact with or exploit them in only 37-50% of cases. This gap is starkest in AppWorld: agents see documentation stating that a command "returns the complete solution to this task" in over 90% of attempts but exploit this in fewer than 7% of trials. We show that agents lack what we call environmental curiosity: the capability to recognize and investigate unexpected but relevant observations in response to environmental stimuli. We identify three main factors influencing environmental curiosity: available tools in the agent scaffold, test-time compute, and training data distribution. We find that configurations maximizing curiosity also achieve the best performance on the unmodified benchmarks. Yet even jointly optimized agents still ignore discovered solutions in the majority of trials: current agents use the environment to fetch expected information, but not to revise their strategy or maximally exploit useful stimuli.
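To make the reported discovery/exploitation gap concrete, here is a minimal sketch of how such rates could be tallied from run transcripts. All field names and the toy data are invented for illustration; they are not the paper's actual evaluation code or results.

```python
# Hypothetical sketch: tallying discovery vs. exploitation rates over agent
# runs. Each run records whether the agent saw the injected solution
# ("discovered") and whether it acted on it ("exploited"). These field
# names are assumptions for this example, not the paper's schema.

def curiosity_rates(runs):
    """Return (discovery rate, exploitation rate) over all runs.

    Exploitation is only counted when the solution was also discovered,
    mirroring the paper's framing of the gap.
    """
    n = len(runs)
    discovered = sum(r["discovered"] for r in runs)
    exploited = sum(r["discovered"] and r["exploited"] for r in runs)
    return discovered / n, exploited / n

# Toy data echoing the Terminal-Bench pattern: most runs find the
# injected solution, far fewer actually use it.
runs = [
    {"discovered": True, "exploited": True},
    {"discovered": True, "exploited": False},
    {"discovered": True, "exploited": False},
    {"discovered": True, "exploited": True},
    {"discovered": False, "exploited": False},
]
disc, expl = curiosity_rates(runs)
print(f"discovered: {disc:.0%}, exploited: {expl:.0%}")
# discovered: 80%, exploited: 40%
```

The gap between the two numbers is the quantity the paper attributes to missing environmental curiosity: agents that see a solution but do not act on it.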