MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation
June 29, 2025
Authors: Vladislav Bargatin, Egor Chistov, Alexander Yakovenko, Dmitriy Vatolin
cs.AI
Abstract
Recent advances in optical flow estimation have prioritized accuracy at the
cost of growing GPU memory consumption, particularly for high-resolution
(FullHD) inputs. We introduce MEMFOF, a memory-efficient multi-frame optical
flow method that identifies a favorable trade-off between multi-frame
estimation and GPU memory usage. Notably, MEMFOF requires only 2.09 GB of GPU
memory at runtime for 1080p inputs, and 28.5 GB during training, which uniquely
positions our method to be trained at native 1080p without the need for
cropping or downsampling. We systematically revisit design choices from
RAFT-like architectures, integrating reduced correlation volumes and
high-resolution training protocols alongside multi-frame estimation, to achieve
state-of-the-art performance across multiple benchmarks while substantially
reducing memory overhead. Our method outperforms more resource-intensive
alternatives in both accuracy and runtime efficiency, validating its robustness
for flow estimation at high resolutions. At the time of submission, our method
ranks first on the Spring benchmark with a 1-pixel (1px) outlier rate of 3.289,
leads Sintel (clean) with an endpoint error (EPE) of 0.963, and achieves the
best Fl-all error on KITTI-2015 at 2.94%. The code is available at
https://github.com/msu-video-group/memfof.
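
For intuition about why a reduced correlation volume matters at FullHD, the following back-of-the-envelope sketch (illustrative only; the exact MEMFOF design may differ) estimates the memory of a RAFT-style all-pairs correlation volume built from feature maps at 1/8 versus 1/16 of a 1920x1080 input.

```python
# Rough estimate of RAFT-style all-pairs correlation-volume memory at FullHD.
# Illustrative sketch only; it does not reproduce the actual MEMFOF architecture.

def corr_volume_bytes(height, width, stride, dtype_bytes=4):
    """All-pairs correlation over features downsampled by `stride`:
    one scalar per pair of feature-map locations, i.e. (H/s * W/s)**2 entries."""
    h, w = height // stride, width // stride
    return (h * w) ** 2 * dtype_bytes

for stride in (8, 16):
    gib = corr_volume_bytes(1080, 1920, stride) / 2**30
    print(f"1/{stride}-resolution features: ~{gib:.2f} GiB at the finest pyramid level")

# 1/8  -> (135 * 240)^2 * 4 B ≈ 3.91 GiB
# 1/16 -> (67 * 120)^2  * 4 B ≈ 0.24 GiB
```

Halving the feature resolution shrinks the all-pairs volume by roughly a factor of 16, which is the kind of saving that makes native-1080p training and low runtime memory plausible.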
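The benchmark figures above use standard optical-flow metrics. As a reference, here is a minimal PyTorch sketch of how end-point error (EPE), a Spring-style 1px outlier rate, and KITTI's Fl-all are commonly computed; it follows the usual benchmark definitions and is not code from the MEMFOF repository.

```python
import torch

def flow_metrics(pred, gt):
    """pred, gt: (2, H, W) flow tensors.
    Returns EPE, 1px outlier rate (%), and Fl-all (%)."""
    err = torch.norm(pred - gt, dim=0)   # per-pixel end-point error
    mag = torch.norm(gt, dim=0)          # ground-truth flow magnitude
    epe = err.mean().item()
    px1 = (err > 1.0).float().mean().item() * 100                          # 1px outlier rate
    fl = ((err > 3.0) & (err > 0.05 * mag)).float().mean().item() * 100    # KITTI Fl-all
    return epe, px1, fl

# Example with random tensors in place of real predictions and ground truth:
pred = torch.randn(2, 1080, 1920)
gt = torch.randn(2, 1080, 1920)
print(flow_metrics(pred, gt))
```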