CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments

Abstract

Large Language Models (LLMs) exhibit remarkable capabilities in the hierarchical decomposition of complex tasks through semantic reasoning. However, their application in embodied systems faces challenges in ensuring reliable execution of subtask sequences and in achieving one-shot success in long-term task completion. To address these limitations in dynamic environments, we propose the Closed-Loop Embodied Agent (CLEA), a novel architecture that incorporates four specialized open-source LLMs with functional decoupling for closed-loop task management. The framework features two core innovations: (1) an interactive task planner that dynamically generates executable subtasks based on the environmental memory, and (2) a multimodal execution critic that uses an evaluation framework to conduct a probabilistic assessment of action feasibility and triggers hierarchical re-planning mechanisms when environmental perturbations exceed preset thresholds. To validate CLEA's effectiveness, we conduct experiments in a real environment with manipulable objects, using two heterogeneous robots for object search, manipulation, and search-manipulation integration tasks. Across 12 task trials, CLEA outperforms the baseline model, achieving a 67.3% improvement in success rate and a 52.8% increase in task completion rate. These results demonstrate that CLEA significantly enhances the robustness of task planning and execution in dynamic environments.
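The closed-loop structure described in the abstract (a planner that proposes subtasks from memory, and a critic that scores feasibility before execution and forces re-planning) can be illustrated with a minimal Python sketch. This is not CLEA's implementation: the names `Memory`, `run_closed_loop`, `plan`, `critic`, `execute`, and `observe` are hypothetical, plain functions stand in for the four LLMs, and the rule "re-plan when the feasibility score falls below `threshold`" is an assumed simplification of the thresholding the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Memory:
    """Hypothetical stand-in for the environmental memory: a list of text notes."""
    notes: List[str] = field(default_factory=list)


def run_closed_loop(
    plan: Callable[[Memory], Optional[str]],
    critic: Callable[[str, str], float],
    execute: Callable[[str], None],
    observe: Callable[[], str],
    memory: Memory,
    threshold: float = 0.5,  # assumed feasibility cutoff, not from the paper
    max_steps: int = 20,
) -> Tuple[bool, List[Tuple[str, str]]]:
    """Plan, check feasibility, then either execute or record the failure and re-plan."""
    log: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        subtask = plan(memory)
        if subtask is None:  # planner reports that the task is complete
            return True, log
        observation = observe()
        if critic(subtask, observation) < threshold:
            # Judged infeasible: store the reason so the next plan can adapt.
            memory.notes.append(f"infeasible: {subtask} ({observation})")
            log.append(("replan", subtask))
            continue
        execute(subtask)
        memory.notes.append(f"done: {subtask}")
        log.append(("execute", subtask))
    return False, log


# Toy world (purely illustrative): an apple sits inside a closed drawer.
world = {"drawer_open": False}


def observe() -> str:
    return "drawer open" if world["drawer_open"] else "drawer closed"


def plan(memory: Memory) -> Optional[str]:
    if "done: pick apple" in memory.notes:
        return None
    if memory.notes and memory.notes[-1].startswith("infeasible: pick apple"):
        return "open drawer"
    return "pick apple"


def critic(subtask: str, observation: str) -> float:
    # Picking is judged unlikely to succeed while the drawer is closed.
    if subtask == "pick apple" and observation == "drawer closed":
        return 0.1
    return 0.9


def execute(subtask: str) -> None:
    if subtask == "open drawer":
        world["drawer_open"] = True


success, log = run_closed_loop(plan, critic, execute, observe, Memory())
print(success, log)
```

In this toy run the first "pick apple" proposal is rejected by the critic, the planner reads the failure note from memory and proposes "open drawer", and the pick then succeeds. An open-loop executor would have attempted the pick against the closed drawer without any check.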
