PE3R: Perception-Efficient 3D Reconstruction

Abstract

Recent advancements in 2D-to-3D perception have significantly improved the understanding of 3D scenes from 2D images. However, existing methods face critical challenges, including limited generalization across scenes, suboptimal perception accuracy, and slow reconstruction speeds. To address these limitations, we propose Perception-Efficient 3D Reconstruction (PE3R), a novel framework designed to enhance both accuracy and efficiency. PE3R employs a feed-forward architecture to enable rapid 3D semantic field reconstruction. The framework demonstrates robust zero-shot generalization across diverse scenes and objects while significantly improving reconstruction speed. Extensive experiments on 2D-to-3D open-vocabulary segmentation and 3D reconstruction validate the effectiveness and versatility of PE3R. The framework achieves a minimum 9-fold speedup in 3D semantic field reconstruction, along with substantial gains in perception accuracy and reconstruction precision, setting new benchmarks in the field. The code is publicly available at: https://github.com/hujiecpp/PE3R.
