Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image

Abstract

In many robotics and VR/AR applications, fast camera motion causes severe motion blur, which makes existing camera pose estimation methods fail. In this work, we propose a novel framework that leverages motion blur as a rich cue for motion estimation rather than treating it as an unwanted artifact. Our approach predicts a dense motion flow field and a monocular depth map directly from a single motion-blurred image. We then recover the instantaneous camera velocity by solving a linear least squares problem under the small motion assumption. In essence, our method produces an IMU-like measurement that robustly captures fast and aggressive camera movements. To train our model, we construct a large-scale dataset with realistic synthetic motion blur derived from ScanNet++v2, then further refine the model by training end-to-end on real data using our fully differentiable pipeline. Extensive evaluations on real-world benchmarks demonstrate that our method achieves state-of-the-art angular and translational velocity estimates, outperforming current methods such as MASt3R and COLMAP.
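The velocity-recovery step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes the classical instantaneous motion-field model: normalized image coordinates (focal length 1), flow expressed per unit time, and a static scene point moving relative to the camera as dP/dt = -v - ω × P. The function names `motion_field_matrix`, `estimate_velocity`, and `synthetic_example` are hypothetical and chosen for illustration only; they are not the authors' implementation.

```python
import numpy as np


def motion_field_matrix(x, y, depth):
    """Stack the per-pixel 2x6 motion-field rows into a (2N, 6) matrix.

    Assumed model (not taken from the paper): for normalized coordinates
    (x, y) and depth Z, with unknowns [vx, vy, vz, wx, wy, wz],
      u = (-vx + x*vz)/Z + x*y*wx - (1 + x^2)*wy + y*wz
      v = (-vy + y*vz)/Z + (1 + y^2)*wx - x*y*wy - x*wz
    """
    x, y = x.ravel(), y.ravel()
    inv_z = 1.0 / depth.ravel()
    zeros = np.zeros_like(x)
    # Rows for the horizontal flow component u.
    rows_u = np.stack([-inv_z, zeros, x * inv_z, x * y, -(1.0 + x**2), y], axis=1)
    # Rows for the vertical flow component v.
    rows_v = np.stack([zeros, -inv_z, y * inv_z, 1.0 + y**2, -x * y, -x], axis=1)
    return np.concatenate([rows_u, rows_v], axis=0)


def estimate_velocity(flow_u, flow_v, x, y, depth):
    """Solve the linear least squares problem for (v, omega)."""
    A = motion_field_matrix(x, y, depth)
    b = np.concatenate([flow_u.ravel(), flow_v.ravel()])
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return solution[:3], solution[3:]


def synthetic_example(seed=0):
    """Generate noise-free flow from a known velocity and recover it."""
    rng = np.random.default_rng(seed)
    x, y = np.meshgrid(np.linspace(-0.5, 0.5, 16), np.linspace(-0.4, 0.4, 12))
    depth = rng.uniform(1.0, 4.0, size=x.shape)  # arbitrary depth units
    v_true = np.array([0.2, -0.1, 0.3])
    w_true = np.array([0.05, -0.4, 0.1])
    flow = motion_field_matrix(x, y, depth) @ np.concatenate([v_true, w_true])
    n = x.size
    v_est, w_est = estimate_velocity(flow[:n], flow[n:], x, y, depth)
    return v_true, w_true, v_est, w_est


if __name__ == "__main__":
    v_true, w_true, v_est, w_est = synthetic_example()
    print("translational velocity:", v_est)
    print("angular velocity:", w_est)
```

In this sketch the translational estimate is expressed in the same units as the depth map, so scaling the depth by a constant scales the recovered translation by the same factor, while the angular velocity is unaffected. With real predictions, a robust or weighted solver would likely be preferable to plain least squares, but that choice is outside what the abstract specifies.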
