RemoteZero: Geospatial Reasoning with Zero Human Annotations

Abstract

Geospatial reasoning requires models to resolve complex spatial semantics and user intent into precise target locations for Earth observation. Recent progress has freed the reasoning path from manual curation, allowing models to generate their own reasoning chains. Yet one dependency remains: training is still supervised by human-annotated ground-truth coordinates. The reasoning process is therefore autonomous, but its spatial endpoint is not, which prevents true self-evolution on abundant unlabeled remote sensing data. To break this bottleneck, we introduce RemoteZero, a box-supervision-free framework for geospatial reasoning. RemoteZero is motivated by a simple asymmetry: a multimodal large language model (MLLM) is typically better at verifying whether a region satisfies a query than at directly generating precise coordinates. Leveraging this stronger discriminative ability, RemoteZero replaces geometric supervision with intrinsic semantic verification and enables GRPO training without box annotations. The resulting framework further supports iterative self-evolution, allowing the model to improve from unlabeled remote sensing imagery through its own verification signal. Experiments show that RemoteZero achieves competitive performance against strong supervised methods, demonstrating the potential of self-verifying training for localization in geospatial reasoning.
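The idea of replacing box supervision with a verification signal can be illustrated with a minimal sketch. It assumes the standard group-relative advantage used in GRPO (Group Relative Policy Optimization), where each sampled output's reward is normalized by the mean and standard deviation of its group. The abstract does not specify the verifier or the reward design, so `toy_verifier`, `QUERY_REGION`, and the binary reward below are hypothetical placeholders and not RemoteZero's actual implementation.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

# Hypothetical stand-in for the image content that the verifier "sees".
# It is never compared to the boxes as a coordinate label in the reward.
QUERY_REGION: Box = (30.0, 30.0, 70.0, 70.0)


def toy_verifier(box: Box) -> float:
    """Placeholder for an MLLM yes/no judgment on a candidate region.

    A real system would presumably crop the region and ask the model
    whether it satisfies the query. Here we return 1.0 when the box
    center falls inside QUERY_REGION, purely for illustration.
    """
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    inside = (QUERY_REGION[0] <= cx <= QUERY_REGION[2]
              and QUERY_REGION[1] <= cy <= QUERY_REGION[3])
    return 1.0 if inside else 0.0


def verification_rewards(boxes: Sequence[Box],
                         verify: Callable[[Box], float]) -> List[float]:
    """Score each sampled box with the verifier instead of a box-IoU label."""
    return [verify(b) for b in boxes]


def group_relative_advantages(rewards: Sequence[float],
                              eps: float = 1e-6) -> List[float]:
    """GRPO-style advantage: (reward - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # One group of candidate boxes sampled for the same image and query.
    group = [
        (40.0, 40.0, 60.0, 60.0),
        (0.0, 0.0, 10.0, 10.0),
        (45.0, 45.0, 70.0, 70.0),
        (80.0, 80.0, 90.0, 90.0),
    ]
    rewards = verification_rewards(group, toy_verifier)
    advantages = group_relative_advantages(rewards)
    for box, r, a in zip(group, rewards, advantages):
        print(box, r, round(a, 3))
```

In this sketch, boxes the verifier accepts receive positive advantages and rejected boxes receive negative ones, so a policy update would favor accepted regions without any annotated coordinates entering the reward.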
