Can Predicted Dynamics Exist in the Physical World?

Abstract

Predictive physical AI systems output state rollouts, action chunks, and latent plans, yet a low root-mean-square error (RMSE) does not imply that a particular proposal is physically executable. We formulate physical admissibility as a prediction-control interface: before execution, a decoded proposal is treated as candidate dynamics and evaluated against kinematic, dynamic, and direct-to-composed horizon conditions. Passing is not a certificate of task success; rejection identifies a violation of the specified physical envelope and gives a component-level reason. On Hugging Face LeRobot PushT, controlled falsification shows that one-step prediction RMSE and standardized dynamics residuals reach an area under the receiver operating characteristic curve (AUC) of 0.982 and 0.972, respectively, that kinematic-only conditions reach an AUC of 0.592, and that the full gate reaches an AUC of 0.957 with condition-level attribution. In replay-based intervention experiments, residual-based filters and the full physical-admissibility gate prevent 87-89% of invalid proposals while preserving mean progress near 0.998.
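To make the interface concrete, the following is a minimal Python sketch of a gate that checks a decoded proposal against the three condition families and returns component-level reasons on rejection. It is an illustration under stated assumptions, not the implementation evaluated here: the function name `admissibility_gate`, the `GateResult` type, the reference one-step model `step_fn`, the finite-difference speed bound, and all thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

import numpy as np


@dataclass
class GateResult:
    admissible: bool
    reasons: List[str]  # names of the violated conditions, empty if admissible


def admissibility_gate(
    states: np.ndarray,        # proposed states, shape (T, d)
    actions: np.ndarray,       # proposed actions, shape (T - 1, d_a)
    step_fn: Callable[[np.ndarray, np.ndarray], np.ndarray],  # hypothetical reference model
    direct_final: np.ndarray,  # direct horizon prediction of the final state
    dt: float = 0.1,           # assumed time step
    v_max: float = 1.0,        # assumed speed bound (kinematic envelope)
    resid_tol: float = 0.05,   # assumed one-step residual tolerance
    horizon_tol: float = 0.1,  # assumed direct-vs-composed tolerance
) -> GateResult:
    states = np.asarray(states, dtype=float)
    actions = np.asarray(actions, dtype=float)
    reasons: List[str] = []

    # Kinematic condition: finite-difference speed must stay within the bound.
    speeds = np.linalg.norm(np.diff(states, axis=0), axis=1) / dt
    if np.any(speeds > v_max):
        reasons.append("kinematic")

    # Dynamic condition: each proposed transition must agree with the
    # reference one-step model up to a residual tolerance.
    preds = np.array([step_fn(s, a) for s, a in zip(states[:-1], actions)])
    residuals = np.linalg.norm(states[1:] - preds, axis=1)
    if np.any(residuals > resid_tol):
        reasons.append("dynamic")

    # Horizon condition: composing one-step predictions from the initial
    # state must land close to the direct horizon prediction.
    composed = states[0]
    for a in actions:
        composed = step_fn(composed, a)
    if np.linalg.norm(composed - np.asarray(direct_final, dtype=float)) > horizon_tol:
        reasons.append("horizon")

    return GateResult(admissible=not reasons, reasons=reasons)
```

In this sketch a pass only means that none of the assumed envelope conditions was violated; it says nothing about task success, which matches the distinction drawn above.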