LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

March 3, 2026
作者: Junyi Zhang, Charles Herrmann, Junhwa Hur, Chen Sun, Ming-Hsuan Yang, Forrester Cole, Trevor Darrell, Deqing Sun
cs.AI

Abstract

Feedforward geometric foundation models achieve strong short-window reconstruction, yet scaling them to minutes-long videos is bottlenecked by quadratic attention complexity or limited effective memory in recurrent designs. We present LoGeR (Long-context Geometric Reconstruction), a novel architecture that scales dense 3D reconstruction to extremely long sequences without post-optimization. LoGeR processes video streams in chunks, leveraging strong bidirectional priors for high-fidelity intra-chunk reasoning. To manage the critical challenge of coherence across chunk boundaries, we propose a learning-based hybrid memory module. This dual-component system combines a parametric Test-Time Training (TTT) memory to anchor the global coordinate frame and prevent scale drift, alongside a non-parametric Sliding Window Attention (SWA) mechanism to preserve uncompressed context for high-precision adjacent alignment. Remarkably, this memory architecture enables LoGeR to be trained on sequences of 128 frames, and generalize up to thousands of frames during inference. Evaluated across standard benchmarks and a newly repurposed VBR dataset with sequences of up to 19k frames, LoGeR substantially outperforms prior state-of-the-art feedforward methods--reducing ATE on KITTI by over 74%--and achieves robust, globally consistent reconstruction over unprecedented horizons.
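The hybrid memory described above pairs a parametric component, updated by gradient steps at test time to hold compressed global context, with a non-parametric sliding window that keeps recent frames uncompressed. The following is a minimal, hypothetical sketch of that idea; the class name, the linear-map memory, the reconstruction objective, and the update rule are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

class HybridMemory:
    """Hypothetical sketch of a LoGeR-style hybrid memory.

    - Parametric TTT memory: a linear map W, nudged by gradient steps
      at test time so it compresses all past chunks (assumed objective:
      reconstruction, W x ~ x for seen frame features x).
    - Non-parametric memory: a sliding window holding the last K frame
      features uncompressed, for precise adjacent-chunk alignment.
    """

    def __init__(self, dim=8, window=4, lr=0.1):
        self.W = np.zeros((dim, dim))   # parametric test-time-trained memory
        self.window = []                # non-parametric sliding window
        self.window_size = window
        self.lr = lr

    def write(self, chunk):
        # Test-time training: one gradient step per frame on
        # 0.5 * ||W x - x||^2, whose gradient w.r.t. W is (W x - x) x^T.
        for x in chunk:
            grad = np.outer(self.W @ x - x, x)
            self.W -= self.lr * grad
        # Sliding window keeps only the most recent frames, uncompressed.
        self.window.extend(chunk)
        self.window = self.window[-self.window_size:]

    def read(self, query):
        # Global context from the parametric memory, plus the exact
        # recent frames for high-precision local alignment.
        return self.W @ query, list(self.window)
```

Under this toy objective, repeatedly writing a direction drives `W` toward the identity on that direction, so old chunks remain retrievable from constant-size parameters while the window stays small, which is why training on 128-frame sequences can, in principle, extend to much longer inference horizons.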