Minimalist Visual-Inertial Odometry

Abstract

Visual-inertial odometry (VIO) is critical to mobile robot navigation and typically relies on cameras with a large number of pixels. Capturing and processing camera images, however, requires significant computational resources. This work presents a minimalist approach to planar odometry, demonstrating that just four visual measurements and an inertial measurement unit (IMU) can provide robust motion estimation for differential-drive robots. Our key insight is that four downward-facing photodiodes, each sensing the world through an optical Gabor mask, produce signals that directly encode speed. Building on this insight, we jointly optimize the mask parameters and a Temporal Convolutional Network (TCN) using a physically grounded simulator. The resulting model decodes speed from only the four measurements produced by the photodiodes. Pairing these speed estimates with the angular speed from the IMU yields a continuous planar trajectory. We validate our approach with a prototype sensor mounted on a differential-drive robot. Across diverse indoor and outdoor terrains, the system closely tracks the reference ground truth without any real-world fine-tuning. Our work shows that minimalist sensing enables efficient and accurate planar odometry.
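The final step described above, combining a decoded forward speed with the IMU's angular speed to obtain a planar trajectory, can be illustrated with a minimal dead-reckoning sketch. It assumes a standard unicycle (differential-drive) motion model and a midpoint integration rule; the abstract does not state which integration scheme the authors use, and the function name `integrate_planar` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def integrate_planar(speeds, yaw_rates, dt, x=0.0, y=0.0, theta=0.0):
    """Dead-reckon planar poses from forward speed and yaw-rate samples.

    Assumption (not from the paper): unicycle kinematics with a fixed
    time step `dt` and midpoint integration of the heading.
    Returns a list of (x, y, theta) poses, starting with the initial pose.
    """
    trajectory = [(x, y, theta)]
    for v, omega in zip(speeds, yaw_rates):
        # Use the heading at the middle of the step to reduce drift on turns.
        theta_mid = theta + 0.5 * omega * dt
        x += v * math.cos(theta_mid) * dt
        y += v * math.sin(theta_mid) * dt
        theta += omega * dt
        trajectory.append((x, y, theta))
    return trajectory


if __name__ == "__main__":
    # Hypothetical inputs: constant 1 m/s speed and 1 rad/s yaw rate,
    # sampled at 1 kHz for roughly one full turn of a unit circle.
    n = round(2 * math.pi / 0.001)
    poses = integrate_planar([1.0] * n, [1.0] * n, dt=0.001)
    print(poses[-1])
```

In this sketch the speed samples stand in for the output of the learned decoder and the yaw-rate samples for the IMU gyroscope; any error in either input accumulates over time, as in all dead-reckoning schemes.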