DynaAct: Large Language Model Reasoning with Dynamic Action Spaces

Abstract

In modern sequential decision-making systems, constructing an optimal candidate action space is critical to efficient inference. Existing approaches, however, either rely on manually defined action spaces that do not scale or use unstructured spaces that make exhaustive search computationally prohibitive. In this paper, we propose DynaAct, a framework that automatically constructs a compact action space to enhance sequential reasoning in complex problem-solving scenarios. Our method first estimates a proxy for the complete action space by using large language models to extract general sketches from a corpus covering diverse complex reasoning problems. We then formulate a submodular function that jointly evaluates candidate actions on their utility to the current state and on their diversity, and employ a greedy algorithm to select an optimal candidate set. Extensive experiments on six diverse standard benchmarks demonstrate that our approach significantly improves overall performance while maintaining efficient inference without introducing substantial latency. The implementation is available at https://github.com/zhaoxlpku/DynaAct.
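The selection step described in the abstract (a submodular score combining utility and diversity, maximized greedily) can be illustrated with a minimal sketch. The abstract does not give the objective, so everything below is an assumption made for illustration: the function names `objective` and `greedy_select`, the weight `lam`, and the particular score used, which is the sum of per-action utilities plus a facility-location coverage term. It is a stand-in, not the DynaAct implementation; see the repository for the authors' actual formulation.

```python
from typing import List, Sequence


def objective(selected: Sequence[int],
              utility: Sequence[float],
              similarity: Sequence[Sequence[float]],
              lam: float = 1.0) -> float:
    """Hypothetical score: summed utility plus facility-location coverage.

    similarity[j][i] says how well action i "covers" candidate j. With
    non-negative utilities and similarities this score is monotone submodular.
    """
    if not selected:
        return 0.0
    util = sum(utility[i] for i in selected)
    # Each candidate j is credited with its best match inside the selected set,
    # so adding a near-duplicate of a chosen action contributes little.
    cover = sum(max(similarity[j][i] for i in selected)
                for j in range(len(utility)))
    return util + lam * cover


def greedy_select(utility: Sequence[float],
                  similarity: Sequence[Sequence[float]],
                  k: int,
                  lam: float = 1.0) -> List[int]:
    """Pick up to k action indices, each time adding the best-scoring one."""
    selected: List[int] = []
    remaining = set(range(len(utility)))
    while remaining and len(selected) < k:
        # Ties are broken in favor of the lowest index.
        best = max(sorted(remaining),
                   key=lambda i: objective(selected + [i], utility, similarity, lam))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data (made up): actions 0 and 1 are near-duplicates with high utility.
    utility = [0.9, 0.85, 0.5, 0.1]
    similarity = [
        [1.0, 0.95, 0.1, 0.1],
        [0.95, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 0.2],
        [0.1, 0.1, 0.2, 1.0],
    ]
    print(greedy_select(utility, similarity, k=2))  # prints [0, 2]
```

In this toy example the greedy pass skips action 1, even though its utility is the second highest, because it is almost identical to the already chosen action 0. Setting `lam=0` removes the diversity term and the selection falls back to the top-k actions by utility.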
