Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners

Abstract

Large language models (LLMs) exhibit a wide range of promising capabilities, from step-by-step planning to commonsense reasoning, that may provide utility for robots, but they remain prone to confidently hallucinated predictions. In this work, we present KnowNo, a framework for measuring and aligning the uncertainty of LLM-based planners such that they know when they don't know and ask for help when needed. KnowNo builds on the theory of conformal prediction to provide statistical guarantees on task completion while minimizing human help in complex multi-step planning settings. Experiments across a variety of simulated and real robot setups that involve tasks with different modes of ambiguity (e.g., from spatial to numeric uncertainties, from human preferences to Winograd schemas) show that KnowNo performs favorably over modern baselines (which may involve ensembles or extensive prompt tuning) in terms of improving efficiency and autonomy, while providing formal assurances. KnowNo can be used with LLMs out of the box without model fine-tuning, and suggests a promising lightweight approach to modeling uncertainty that can complement and scale with the growing capabilities of foundation models. Website: https://robot-help.github.io
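
To make the idea of asking for help only when uncertain more concrete, the following is a minimal sketch of generic split conformal prediction over multiple-choice plan options. It is not taken from the paper: the abstract does not specify the scoring rule or calibration procedure, so the nonconformity score (one minus the probability assigned to the correct option), the function names `calibrate_threshold`, `prediction_set`, and `plan_or_ask`, and all numbers below are illustrative assumptions.

```python
import math


def calibrate_threshold(true_option_probs, epsilon):
    """Compute a conformal threshold from calibration data (assumed setup).

    true_option_probs: probability the planner gave to the correct option
    in each calibration scenario. epsilon: target error rate.
    """
    n = len(true_option_probs)
    # Assumed nonconformity score: 1 - probability of the correct option.
    scores = sorted(1.0 - p for p in true_option_probs)
    # Finite-sample-corrected rank of the (1 - epsilon) quantile.
    rank = math.ceil((n + 1) * (1.0 - epsilon))
    if rank > n:
        # Too few calibration points: fall back to including every option.
        return 1.0
    return scores[rank - 1]


def prediction_set(option_probs, q_hat):
    """Keep every option whose nonconformity score is within the threshold."""
    return [opt for opt, p in option_probs.items() if 1.0 - p <= q_hat]


def plan_or_ask(option_probs, q_hat):
    """Act autonomously on a singleton set; otherwise ask a human."""
    options = prediction_set(option_probs, q_hat)
    if len(options) == 1:
        return ("act", options[0])
    # Several plausible options (or none): defer to a human.
    return ("ask", options)


# Hypothetical calibration data and test-time option probabilities.
q_hat = calibrate_threshold([0.875, 0.5, 0.75, 0.25, 0.9375, 0.625, 0.375], 0.25)
print(plan_or_ask({"A": 0.75, "B": 0.125, "C": 0.125}, q_hat))
print(plan_or_ask({"A": 0.5, "B": 0.4375, "C": 0.0625}, q_hat))
```

Under the usual exchangeability assumption between calibration and test scenarios, a set built this way contains the correct option with probability at least 1 - epsilon, which is the kind of statistical guarantee the abstract refers to. A smaller set means less human help, so a better-calibrated planner asks less often.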
