Interactive Task Planning with Language Models

Abstract

An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. However, most traditional methods require predefined module designs, which makes it hard to generalize to different goals. Recent approaches based on large language models allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. To tackle this, we propose a simple framework that achieves interactive task planning with language models. Our system incorporates both high-level planning and low-level function execution via language. We verify the robustness of our system in generating novel high-level instructions for unseen objectives, as well as its ease of adaptation to different tasks by merely substituting the task guidelines, without the need for additional complex prompt engineering. Furthermore, when the user sends a new request, our system is able to replan precisely based on the new request, the task guidelines, and the previously executed steps. More details are available on the project website at https://wuphilipp.github.io/itp_site and in the video at https://youtu.be/TrKLuyv26_g.
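To make the replanning idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the planner builds one text prompt from the task guidelines, the latest user request, and the steps already executed, and then asks a language model for the remaining steps. The names `build_prompt`, `plan`, and `fake_llm` are hypothetical, and `fake_llm` is a hard-coded stand-in for a real model call.

```python
from typing import Callable, List


def build_prompt(guidelines: str, request: str, executed: List[str]) -> str:
    """Combine task guidelines, the latest request, and executed steps into one prompt."""
    done = "\n".join(f"- {s}" for s in executed) or "- (none)"
    return (
        f"Task guidelines:\n{guidelines}\n\n"
        f"User request:\n{request}\n\n"
        f"Steps already executed:\n{done}\n\n"
        "List the remaining steps, one per line."
    )


def plan(llm: Callable[[str], str], guidelines: str, request: str,
         executed: List[str]) -> List[str]:
    """Ask the language model for the remaining high-level steps."""
    reply = llm(build_prompt(guidelines, request, executed))
    return [line.strip() for line in reply.splitlines() if line.strip()]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real language model (illustration only)."""
    if "iced" in prompt:
        return "add ice\npour tea\nserve"
    return "pour tea\nserve"


# Initial plan for the first request.
steps = plan(fake_llm, "Make drinks.", "hot tea", [])
# The user changes the request mid-execution; replan with the history so far.
executed = [steps[0]]
new_steps = plan(fake_llm, "Make drinks.", "make it iced instead", executed)
```

Under these assumptions, adapting to a different task would only require a different `guidelines` string, which mirrors the claim that task guidelines can be substituted without further prompt engineering.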
