ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Abstract

We present a framework for intuitive robot programming by non-experts, leveraging natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements through a chat interface. Key features of the framework include: integration of ROS with an AI agent connected to a wide range of open-source and commercial LLMs; automatic extraction of a behavior from the LLM output and execution of the corresponding ROS actions/services; support for three behavior modes (sequence, behavior tree, state machine); imitation learning for adding new robot actions to the library of available actions; and LLM reflection via human and environment feedback. Extensive experiments validate the framework, demonstrating robustness, scalability, and versatility in diverse scenarios, including long-horizon tasks, tabletop rearrangements, and remote supervisory control. To facilitate adoption of the framework and reproduction of our results, we have released our code as open source. It is available at: https://github.com/huawei-noah/HEBO/tree/master/ROSLLM.
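
The step of extracting a behavior from the LLM output and executing it against an action library can be illustrated with a minimal sketch of the simplest mode, a sequence. This is not the released ROS-LLM code. The `<behavior>` tag, the JSON step format, and the names `make_library`, `extract_sequence`, and `run_sequence` are all assumptions made for illustration, and plain Python callables stand in for ROS action/service clients so the example runs without a ROS installation. The string returned by `run_sequence` is meant to suggest the kind of environment feedback that could be passed back to the LLM for reflection.

```python
import json
import re
from typing import Callable, Dict, List

# Hypothetical action library: action name -> Python callable returning success.
# In a real deployment each entry would presumably wrap a ROS action or service
# client; here the callables only record what was requested.
ActionFn = Callable[..., bool]


def make_library(log: List[str]) -> Dict[str, ActionFn]:
    def pick(obj: str) -> bool:
        log.append(f"pick({obj})")
        return True

    def place(obj: str, location: str) -> bool:
        log.append(f"place({obj}, {location})")
        return True

    return {"pick": pick, "place": place}


# Assumed convention: the LLM wraps its plan in <behavior>...</behavior> tags.
BEHAVIOR_TAG = re.compile(r"<behavior>(.*?)</behavior>", re.DOTALL)


def extract_sequence(llm_output: str) -> List[dict]:
    """Pull a JSON list of steps out of free-form LLM text."""
    match = BEHAVIOR_TAG.search(llm_output)
    if match is None:
        raise ValueError("no <behavior> block found in LLM output")
    steps = json.loads(match.group(1))
    if not isinstance(steps, list):
        raise ValueError("behavior must be a JSON list of steps")
    return steps


def run_sequence(steps: List[dict], library: Dict[str, ActionFn]) -> str:
    """Run steps in order; return a feedback string for the LLM."""
    for i, step in enumerate(steps):
        name = step.get("action")
        fn = library.get(name)
        if fn is None:
            return f"step {i}: unknown action '{name}'"
        if not fn(**step.get("args", {})):
            return f"step {i}: action '{name}' failed"
    return "success"
```

A reply such as `Sure. <behavior>[{"action": "pick", "args": {"obj": "cup"}}]</behavior>` would be parsed into one step and dispatched to `pick`. A behavior tree or state machine would need a richer step format than the flat list assumed here.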
