Real-is-Sim: Bridging the Sim-to-Real Gap with a Dynamic Digital Twin for Real-World Robot Policy Evaluation

Abstract

Recent advances in behavior cloning have enabled robots to perform complex manipulation tasks. However, accurately assessing training performance remains challenging, particularly for real-world applications, because behavior cloning losses often correlate poorly with actual task success. Consequently, researchers resort to success-rate metrics derived from costly and time-consuming real-world evaluations, which makes identifying the best policy and detecting overfitting or underfitting impractical. To address these issues, we propose real-is-sim, a novel behavior cloning framework that incorporates a dynamic digital twin (based on Embodied Gaussians) throughout the entire policy development pipeline: data collection, training, and deployment. By continuously aligning the simulated world with the physical world, demonstrations can be collected in the real world with states extracted from the simulator. The simulator enables flexible state representations by rendering image inputs from any viewpoint or extracting low-level state information from objects embodied within the scene. During training, policies can be directly evaluated within the simulator in an offline and highly parallelizable manner. Finally, during deployment, policies are run within the simulator where the real robot directly tracks the simulated robot's joints, effectively decoupling policy execution from real hardware and mitigating traditional domain-transfer challenges. We validate real-is-sim on the PushT manipulation task, demonstrating strong correlation between success rates obtained in the simulator and real-world evaluations. Videos of our system can be found at https://realissim.rai-inst.com.
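The deployment scheme described in the abstract (the policy acts on simulator state, and the real robot tracks the simulated robot's joints) can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: `SimRobot`, `RealRobot`, `policy`, `deploy`, the proportional tracking gain, and the step size are all hypothetical names and values chosen for illustration, and the sketch omits the correction step that keeps the digital twin aligned with the physical scene.

```python
from dataclasses import dataclass


@dataclass
class SimRobot:
    """Toy stand-in for the simulated robot inside the digital twin (hypothetical)."""
    joints: list[float]

    def apply_action(self, delta: list[float]) -> None:
        # The policy's action is applied to the simulated joints only.
        self.joints = [q + d for q, d in zip(self.joints, delta)]


@dataclass
class RealRobot:
    """Toy stand-in for real hardware (hypothetical): moves part of the way to a target."""
    joints: list[float]
    gain: float = 0.5  # assumed proportional tracking gain

    def track(self, target: list[float]) -> None:
        # The hardware follows the simulated joint positions; it never sees the policy.
        self.joints = [q + self.gain * (t - q) for q, t in zip(self.joints, target)]


def policy(sim_state: list[float], goal: list[float], step: float = 0.1) -> list[float]:
    """Hypothetical policy: reads simulator state and returns a bounded joint delta."""
    return [max(-step, min(step, g - q)) for q, g in zip(sim_state, goal)]


def deploy(sim: SimRobot, real: RealRobot, goal: list[float], n_steps: int) -> float:
    """Run the loop and return the largest remaining joint error on the real robot."""
    for _ in range(n_steps):
        action = policy(sim.joints, goal)  # policy input comes from the simulator
        sim.apply_action(action)           # the simulated robot executes the action
        real.track(sim.joints)             # the real robot tracks the simulated joints
    return max(abs(r - g) for r, g in zip(real.joints, goal))


if __name__ == "__main__":
    sim = SimRobot([0.0, 0.0])
    real = RealRobot([0.0, 0.0])
    print(deploy(sim, real, goal=[0.5, -0.3], n_steps=50))
```

Because `policy` only ever reads `sim.joints`, the same function could be evaluated against many simulator instances offline, which is the property the abstract relies on for parallel evaluation during training.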


