ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Abstract

We present a framework that lets non-experts program robots intuitively, using natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), so that non-experts can describe task requirements to the system through a chat interface. Key features of the framework include: integration of ROS with an AI agent connected to a wide range of open-source and commercial LLMs, automatic extraction of a behavior from the LLM output and execution of ROS actions/services, support for three behavior modes (sequence, behavior tree, state machine), imitation learning for adding new robot actions to the library of possible actions, and LLM reflection via human and environment feedback. Extensive experiments validate the framework, showing robustness, scalability, and versatility in diverse scenarios, including long-horizon tasks, tabletop rearrangements, and remote supervisory control. To facilitate adoption of the framework and support reproduction of our results, we have made our code open-source. It is available at: https://github.com/huawei-noah/HEBO/tree/master/ROSLLM.
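As a rough illustration of the "sequence" behavior mode mentioned above, the sketch below extracts an ordered list of steps from free-form LLM text and dispatches each step to an entry in an action library. It is a minimal sketch under stated assumptions, not the framework's actual implementation: the JSON step format, the names ACTION_LIBRARY, extract_sequence, and run_sequence, and the three example actions are all hypothetical, and plain Python functions stand in for real ROS actions/services.

```python
import json
from typing import Callable, Dict, List

# Hypothetical stand-ins for ROS actions/services; a real system would
# call the robot through ROS clients instead of returning strings.
def move_to(target: str) -> str:
    return f"moved to {target}"

def pick(obj: str) -> str:
    return f"picked {obj}"

def place(obj: str, location: str) -> str:
    return f"placed {obj} on {location}"

# Hypothetical action library: maps action names to callables.
ACTION_LIBRARY: Dict[str, Callable[..., str]] = {
    "move_to": move_to,
    "pick": pick,
    "place": place,
}

def extract_sequence(llm_output: str) -> List[dict]:
    """Parse the text between the first '[' and the last ']' as a JSON list of steps."""
    start = llm_output.find("[")
    end = llm_output.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("no JSON array found in LLM output")
    steps = json.loads(llm_output[start:end + 1])
    # Reject any step whose action is not in the library before executing anything.
    for step in steps:
        if step.get("action") not in ACTION_LIBRARY:
            raise ValueError(f"unknown action: {step.get('action')!r}")
    return steps

def run_sequence(steps: List[dict]) -> List[str]:
    """Execute the steps in order and collect each action's result."""
    results = []
    for step in steps:
        action = ACTION_LIBRARY[step["action"]]
        results.append(action(**step.get("args", {})))
    return results

if __name__ == "__main__":
    reply = (
        "Sure, here is the plan:\n"
        '[{"action": "move_to", "args": {"target": "table"}},'
        ' {"action": "pick", "args": {"obj": "cup"}}]\n'
        "Let me know if you want changes."
    )
    for line in run_sequence(extract_sequence(reply)):
        print(line)
```

Validating every step against the library before running any of them means a malformed plan fails before the robot moves. The collected results are the kind of signal that could be fed back to the LLM as environment feedback.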
