Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners
July 4, 2023
Authors: Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar
cs.AI
Abstract
Large language models (LLMs) exhibit a wide range of promising capabilities
-- from step-by-step planning to commonsense reasoning -- that may provide
utility for robots, but remain prone to confidently hallucinated predictions.
In this work, we present KnowNo, a framework for measuring and
aligning the uncertainty of LLM-based planners such that they know when they
don't know and ask for help when needed. KnowNo builds on the theory of
conformal prediction to provide statistical guarantees on task completion while
minimizing human help in complex multi-step planning settings. Experiments
across a variety of simulated and real robot setups that involve tasks with
different modes of ambiguity (e.g., from spatial to numeric uncertainties, from
human preferences to Winograd schemas) show that KnowNo performs favorably over
modern baselines (which may involve ensembles or extensive prompt tuning) in
terms of improving efficiency and autonomy, while providing formal assurances.
KnowNo can be used with LLMs out of the box without model fine-tuning, and
suggests a promising lightweight approach to modeling uncertainty that can
complement and scale with the growing capabilities of foundation models.
Website: https://robot-help.github.io
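
The abstract says KnowNo builds on conformal prediction but gives no algorithmic detail, so the following is only a minimal sketch of generic split conformal prediction applied to a multiple-choice planner, not the authors' implementation. It assumes the planner assigns a confidence in [0, 1] to each candidate action and that a small calibration set records the confidence given to the correct action. The function names, the calibration numbers, and the error level `epsilon` are all hypothetical and chosen for illustration.

```python
import math


def conformal_quantile(scores, epsilon):
    """Return the ceil((n + 1) * (1 - epsilon))-th smallest calibration score.

    `scores` are nonconformity scores (here: 1 - confidence in the true option).
    If the calibration set is too small for the requested level, return
    infinity so that every option ends up in the prediction set.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - epsilon))
    if rank > n:
        return float("inf")
    return sorted(scores)[rank - 1]


def prediction_set(option_confidences, qhat):
    """Keep every option whose nonconformity score 1 - confidence is <= qhat."""
    return [opt for opt, p in option_confidences.items() if 1.0 - p <= qhat]


def should_ask_for_help(pred_set):
    """Assumed policy: act autonomously only when exactly one option remains."""
    return len(pred_set) != 1


# Hypothetical calibration data: confidence the planner gave to the correct option.
calibration_true_conf = [0.9, 0.5, 0.7, 0.3, 0.6, 0.8, 0.4, 0.45, 0.75]
calibration_scores = [1.0 - p for p in calibration_true_conf]

# Target: the correct option is missing from the set at most ~25% of the time.
qhat = conformal_quantile(calibration_scores, epsilon=0.25)

# Hypothetical test-time confidences for an ambiguous and a clear instruction.
ambiguous = {"A": 0.45, "B": 0.45, "C": 0.10}
clear = {"A": 0.90, "B": 0.05, "C": 0.05}

for name, confs in [("ambiguous", ambiguous), ("clear", clear)]:
    options = prediction_set(confs, qhat)
    print(name, options, "ask for help:", should_ask_for_help(options))
```

Under the usual exchangeability assumption between calibration and test scenarios, a set built this way contains the correct option with probability at least 1 - epsilon. That coverage property is the kind of statistical guarantee the abstract refers to, while smaller sets correspond to fewer requests for human help.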