Large Language Models Think Too Fast To Explore Effectively

January 29, 2025
Authors: Lan Pan, Hanbo Xie, Robert C. Wilson
cs.AI

Abstract

Large Language Models (LLMs) have exhibited many intellectual capacities. While numerous benchmarks assess their intelligence, limited attention has been given to their ability to explore, an essential capacity for discovering new information and adapting to novel environments in both natural and artificial systems. The extent to which LLMs can explore effectively, particularly in open-ended tasks, remains unclear. This study investigates whether LLMs can surpass humans in exploration during an open-ended task, using Little Alchemy 2 as a paradigm, in which agents combine elements to discover new ones. Results show that most LLMs underperform humans, with the exception of the o1 model. The traditional LLMs rely primarily on uncertainty-driven strategies, unlike humans, who balance uncertainty and empowerment. Representational analysis of the models with Sparse Autoencoders revealed that uncertainty and choices are represented in earlier transformer blocks, while empowerment values are processed later, causing LLMs to think too fast and make premature decisions, which hinders effective exploration. These findings shed light on the limitations of LLM exploration and suggest directions for improving their adaptability.
