AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies

Abstract

In this paper, we propose AimBot, a lightweight visual augmentation technique that provides explicit spatial cues to improve visuomotor policy learning in robotic manipulation. AimBot overlays shooting lines and scope reticles onto multi-view RGB images, offering auxiliary visual guidance that encodes the end-effector's state. The overlays are computed from depth images, camera extrinsics, and the current end-effector pose, explicitly conveying spatial relationships between the gripper and objects in the scene. AimBot incurs minimal computational overhead (less than 1 ms) and requires no changes to model architectures, as it simply replaces original RGB images with augmented counterparts. Despite its simplicity, our results show that AimBot consistently improves the performance of various visuomotor policies in both simulation and real-world settings, highlighting the benefits of spatially grounded visual feedback.
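To make the idea of a depth-aware overlay concrete, the following is a minimal sketch and not the authors' implementation. It projects a 3D end-effector position into a camera image and draws a simple crosshair when the point is not occluded according to the depth image. The function names (`project_point`, `draw_reticle`), the pinhole intrinsic matrix `K`, the world-to-camera transform `T_cam_world`, and the occlusion tolerance `tol` are all assumptions made for illustration; the abstract does not specify them, and the actual shooting-line and reticle rendering may differ.

```python
import numpy as np

def project_point(p_world, T_cam_world, K):
    """Project a 3D world point to pixel (u, v) and camera-frame depth z.

    T_cam_world: assumed 4x4 world-to-camera extrinsic matrix.
    K: assumed 3x3 pinhole intrinsic matrix.
    Returns None when the point lies behind the camera.
    """
    p_cam = T_cam_world[:3, :3] @ p_world + T_cam_world[:3, 3]
    z = float(p_cam[2])
    if z <= 0:
        return None
    uvw = K @ p_cam
    return int(round(uvw[0] / z)), int(round(uvw[1] / z)), z

def draw_reticle(rgb, depth, p_world, T_cam_world, K,
                 half_len=5, color=(255, 0, 0), tol=0.01):
    """Return a copy of rgb with a crosshair at the projected point.

    The crosshair is skipped when the point is outside the image or
    farther from the camera than the measured depth (i.e. occluded).
    """
    out = rgb.copy()
    proj = project_point(p_world, T_cam_world, K)
    if proj is None:
        return out
    u, v, z = proj
    h, w = depth.shape
    if not (0 <= u < w and 0 <= v < h):
        return out
    if z > depth[v, u] + tol:
        return out  # something in the scene is in front of the point
    out[v, max(0, u - half_len):min(w, u + half_len + 1)] = color  # horizontal bar
    out[max(0, v - half_len):min(h, v + half_len + 1), u] = color  # vertical bar
    return out

# Hypothetical usage: identity extrinsics, a point 1 m in front of the camera.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 32.0], [0.0, 0.0, 1.0]])
rgb = np.zeros((64, 64, 3), dtype=np.uint8)
depth = np.full((64, 64), 2.0)
aug = draw_reticle(rgb, depth, np.array([0.0, 0.0, 1.0]), np.eye(4), K)
```

Because the function returns an augmented copy with the same shape and dtype as the input, it could be dropped in front of an existing policy without touching the model, which mirrors the abstract's point that only the RGB inputs are replaced.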
