K-Level Reasoning with Large Language Models
February 2, 2024
Authors: Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Yan Xia, Man Lan, Furu Wei
cs.AI
Abstract
While Large Language Models (LLMs) have demonstrated their proficiency in
complex reasoning tasks, their performance in dynamic, interactive, and
competitive scenarios - such as business strategy and stock market analysis -
remains underexplored. To bridge this gap, we formally explore the dynamic
reasoning capabilities of LLMs for decision-making in rapidly evolving
environments. We introduce two game theory-based pilot challenges that mirror
the complexities of real-world dynamic decision-making. These challenges are
well-defined, enabling clear, controllable, and precise evaluation of LLMs'
dynamic reasoning abilities. Through extensive experiments, we find that
existing reasoning methods tend to falter in dynamic settings that require
k-level thinking - a key concept not tackled by previous works. To address
this, we propose a novel reasoning approach for LLMs, named "K-Level
Reasoning". This approach adopts the perspective of rivals to recursively
employ k-level thinking based on available historical information, which
significantly improves the prediction accuracy of rivals' subsequent moves and
informs more strategic decision-making. This research not only sets a robust
quantitative benchmark for the assessment of dynamic reasoning but also
markedly enhances the proficiency of LLMs in dynamic contexts.
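To make the idea of recursive k-level thinking concrete, here is a minimal numeric sketch using the classic "guess a fraction of the average" game from game theory. The game, the 0.8 factor, and the level-0 baseline of 50 are illustrative assumptions for this sketch, not necessarily the pilot challenges or parameters used in the paper: a level-0 player guesses naively, and a level-k player assumes rivals reason at level k-1 and best-responds to their predicted move.

```python
def k_level_guess(k: int, factor: float = 0.8, level0: float = 50.0) -> float:
    """Predict a player's guess in the 'guess factor-times-the-average' game.

    A level-0 player guesses naively (here, the midpoint 50). A level-k
    player assumes all rivals reason at level k-1, predicts their guess
    recursively, and best-responds by choosing factor times that prediction.
    """
    if k == 0:
        return level0
    # Best response to the predicted level-(k-1) behavior of rivals.
    return factor * k_level_guess(k - 1, factor, level0)


# Deeper recursion drives the guess toward the Nash equilibrium of 0.
for k in range(4):
    print(k, k_level_guess(k))
```

Each extra level of recursion shrinks the guess by the factor 0.8, so as k grows the prediction converges to the game's Nash equilibrium at 0; the paper's "K-Level Reasoning" similarly conditions each level's prediction on the rivals' historical behavior rather than a fixed naive baseline.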