ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications

Abstract

Agentic AI systems and Physical or Embodied AI systems have been two key research verticals at the forefront of Artificial Intelligence and Robotics, with the Model Context Protocol (MCP) increasingly becoming a key component and enabler of agentic applications. However, the literature at the intersection of these verticals, i.e., Agentic Embodied AI, remains scarce. This paper introduces an MCP server for analyzing ROS and ROS 2 bags, which allows robot data to be analyzed, visualized, and processed with natural language through LLMs and VLMs. We describe specific tooling built with robotics domain knowledge. Our initial release focuses on mobile robotics and natively supports the analysis of trajectories, laser scan data, transforms, and time series data. The server also provides an interface to standard ROS 2 CLI tools ("ros2 bag list" or "ros2 bag info"), as well as the ability to filter bags down to a subset of topics or trim them in time. Coupled with the MCP server, we provide a lightweight UI for benchmarking the tooling with different LLMs, both proprietary (Anthropic, OpenAI) and open-source (through Groq). Our experimental results include an analysis of the tool calling capabilities of eight state-of-the-art LLMs/VLMs, both proprietary and open-source, large and small. Our experiments indicate a large divide in tool calling capabilities, with Kimi K2 and Claude Sonnet 4 demonstrating clearly superior performance. We also conclude that multiple factors affect the success rates, including the tool description schema, the number of arguments, and the number of tools available to the models. The code is available under a permissive license at https://github.com/binabik-ai/mcp-rosbags.
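
To make the bag filtering described above concrete, the following is a minimal sketch of selecting a subset of topics and trimming in time. It is an illustration under stated assumptions, not the server's implementation: `BagRecord` and `filter_records` are hypothetical names introduced here, and messages are modeled as plain in-memory records rather than being read through any ROS API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass(frozen=True)
class BagRecord:
    """Hypothetical stand-in for one recorded message (not a ROS type)."""
    stamp: float    # seconds since the start of the recording
    topic: str
    payload: bytes


def filter_records(
    records: Iterable[BagRecord],
    topics: Optional[Set[str]] = None,
    start: Optional[float] = None,
    end: Optional[float] = None,
) -> Iterator[BagRecord]:
    """Yield records on the requested topics within [start, end], inclusive.

    A criterion left as None is not applied.
    """
    for rec in records:
        if topics is not None and rec.topic not in topics:
            continue
        if start is not None and rec.stamp < start:
            continue
        if end is not None and rec.stamp > end:
            continue
        yield rec


# Illustrative data only; topic names are common ROS conventions.
records = [
    BagRecord(0.0, "/odom", b"a"),
    BagRecord(0.5, "/scan", b"b"),
    BagRecord(1.0, "/odom", b"c"),
    BagRecord(2.5, "/tf", b"d"),
]

trimmed = list(
    filter_records(records, topics={"/odom", "/scan"}, start=0.4, end=2.0)
)
print([(r.stamp, r.topic) for r in trimmed])
```

In this sketch both filters are optional, so the same function covers topic-only selection, time-only trimming, and their combination; a tool exposed to an LLM could map these three parameters directly onto its argument schema.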
