Two-Fidelity Best-Action Identification for Stochastic Minimax Tree

June 1, 2026
Authors: Peter Chen, Xi Chen
cs.AI

Abstract

We study fixed-confidence best-action identification (BAI) in stochastic minimax trees. This problem is increasingly relevant in modern AI planning, where deep minimax search and Monte Carlo Tree Search (MCTS) with long language-model rollouts face a fundamental tradeoff: heuristic evaluations are cheap but biased, while accurate rollouts are reliable but prohibitively expensive. We propose 2FFS, a two-fidelity tree-search algorithm that brings ideas from multi-fidelity flat bandits into tree search. The algorithm combines minimax-style fast expansion with MCTS-style stochastic sampling, adaptively deciding when to exploit cheap, biased evaluations and when to invoke expensive, accurate evaluations for local certification. We prove fixed-confidence correctness, establish finite stopping for exact identification, and give a polynomial-depth cost upper bound for general-depth trees. Across numerical stochastic-tree experiments, 2FFS uses substantially fewer samples and computational operations than existing BAI-MCTS baselines.
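To make the cheap-versus-accurate tradeoff concrete, the following is a minimal sketch and not the authors' 2FFS algorithm. It assumes leaf rewards in [0, 1], a known upper bound `bias` on the cheap evaluator's bias, and Hoeffding confidence intervals; all function names are hypothetical. It shows how a biased estimate widens a leaf's interval, how intervals propagate through max and min nodes, and how a root action is certified only when its lower bound exceeds every competitor's upper bound.

```python
import math


def hoeffding_radius(n, delta):
    """Half-width of a two-sided Hoeffding interval for n samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def leaf_interval(mean, n, delta, bias=0.0):
    """Interval for a leaf's true value from n samples with empirical mean `mean`.

    `bias` is an assumed known bound on the evaluator's bias
    (0.0 for the accurate, expensive evaluator).
    """
    radius = hoeffding_radius(n, delta) + bias
    return (mean - radius, mean + radius)


def backup(node, is_max):
    """Propagate (lower, upper) intervals up a minimax tree.

    A leaf is a (lower, upper) tuple; an internal node is a list of children.
    """
    if isinstance(node, tuple):
        return node
    children = [backup(child, not is_max) for child in node]
    pick = max if is_max else min
    return (pick(lo for lo, _ in children), pick(hi for _, hi in children))


def certified_best(root_children):
    """Index of the root (max-player) action whose lower bound beats every
    other action's upper bound, or None if no action is certified yet."""
    intervals = [backup(child, is_max=False) for child in root_children]
    for i, (lo, _) in enumerate(intervals):
        if all(lo > hi for j, (_, hi) in enumerate(intervals) if j != i):
            return i
    return None


# Cheap evaluations: the bias term keeps the intervals overlapping.
cheap = [leaf_interval(0.7, 200, 0.05, bias=0.2),
         leaf_interval(0.4, 200, 0.05, bias=0.2)]
# Accurate evaluations: same sample count, no bias term.
accurate = [leaf_interval(0.7, 200, 0.05),
            leaf_interval(0.4, 200, 0.05)]
print(certified_best(cheap), certified_best(accurate))
```

In this toy setting the cheap intervals cannot separate the two actions however many samples are drawn once the bias term dominates, which is the situation where an accurate evaluation becomes worth its cost. A real fixed-confidence procedure would also need to split `delta` across leaves and rounds, which this sketch omits.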