Agents Explore but Agents Ignore: LLMs Lack Environmental Curiosity

Abstract

LLM-based agents are assumed to integrate environmental observations into their reasoning: discovering highly relevant but unexpected information should naturally lead a model to exploit its own discoveries. We show that this assumption is false for current LLM-based agents, which struggle to reflect on or react to unexpected information. Across three benchmarks (Terminal-Bench, SWE-Bench, AppWorld), we inject complete task solutions into the agent environments to deliberately expose a task's solution to a model. While agents discover these solutions on Terminal-Bench in 79-81% of runs, they interact with, or exploit, them in only 37-50% of cases. This gap is starkest in AppWorld: agents see documentation stating that a command "returns the complete solution to this task" in over 90% of attempts but exploit it in fewer than 7% of trials. We show that agents lack what we call environmental curiosity: the capability to recognize and investigate unexpected but relevant observations in response to environmental stimuli. We identify three main factors influencing environmental curiosity: the tools available in the agent scaffold, test-time compute, and the training data distribution. We find that configurations that maximize curiosity also achieve the best performance on the unmodified benchmarks. Yet even jointly optimized agents still ignore discovered solutions in the majority of trials: current agents use the environment to fetch expected information, but not to revise their strategy or maximally exploit useful stimuli.
