DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision

December 26, 2023
Authors: Lu Ling, Yichen Sheng, Zhi Tu, Wentian Zhao, Cheng Xin, Kun Wan, Lantao Yu, Qianyu Guo, Zixun Yu, Yawen Lu, Xuanmao Li, Xingpeng Sun, Rohan Ashok, Aniruddha Mukherjee, Hao Kang, Xiangrui Kong, Gang Hua, Tianti Zhang, Bedrich Benes, Aniket Bera
cs.AI

Abstract

We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS). However, existing scene-level datasets for deep learning-based 3D vision, limited to either synthetic environments or a narrow selection of real-world scenes, are quite insufficient. This insufficiency not only hinders a comprehensive benchmark of existing methods but also caps what could be explored in deep learning-based 3D analysis. To address this critical gap, we present DL3DV-10K, a large-scale scene dataset, featuring 51.2 million frames from 10,510 videos captured from 65 types of point-of-interest (POI) locations, covering both bounded and unbounded scenes, with different levels of reflection, transparency, and lighting. We conducted a comprehensive benchmark of recent NVS methods on DL3DV-10K, which revealed valuable insights for future research in NVS. In addition, we have obtained encouraging results in a pilot study to learn generalizable NeRF from DL3DV-10K, which manifests the necessity of a large-scale scene-level dataset to forge a path toward a foundation model for learning 3D representation. Our DL3DV-10K dataset, benchmark results, and models will be publicly accessible at https://dl3dv-10k.github.io/DL3DV-10K/.