MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation

Abstract

Recent advances in optical flow estimation have prioritized accuracy at the cost of growing GPU memory consumption, particularly for high-resolution (FullHD) inputs. We introduce MEMFOF, a memory-efficient multi-frame optical flow method that identifies a favorable trade-off between multi-frame estimation and GPU memory usage. Notably, MEMFOF requires only 2.09 GB of GPU memory at runtime for 1080p inputs and 28.5 GB during training, which uniquely positions our method to be trained at native 1080p resolution without cropping or downsampling. We systematically revisit design choices from RAFT-like architectures, integrating reduced correlation volumes and high-resolution training protocols with multi-frame estimation, to achieve state-of-the-art performance across multiple benchmarks while substantially reducing memory overhead. Our method outperforms more resource-intensive alternatives in both accuracy and runtime efficiency, validating its robustness for flow estimation at high resolutions. At the time of submission, our method ranks first on the Spring benchmark with a 1-pixel (1px) outlier rate of 3.289, leads Sintel (clean) with an endpoint error (EPE) of 0.963, and achieves the best Fl-all error on KITTI-2015 at 2.94%. The code is available at https://github.com/msu-video-group/memfof.
