Interactive Task Planning with Language Models

Abstract

An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. However, most traditional methods require a predefined module design, which makes it hard to generalize to different goals. Recent approaches based on large language models allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. To address this, we propose a simple framework that achieves interactive task planning with language models. Our system incorporates both high-level planning and low-level function execution via language. We verify the robustness of our system in generating novel high-level instructions for unseen objectives, and its ease of adaptation to different tasks by merely substituting the task guidelines, without additional complex prompt engineering. Furthermore, when the user sends a new request, our system can replan precisely based on the new request, the task guidelines, and the previously executed steps. More details are available at https://wuphilipp.github.io/itp_site and https://youtu.be/TrKLuyv26_g.
