DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis

Abstract

Advances in radiance fields have enabled photorealistic novel view synthesis. In several domains, large-scale real-world datasets have been developed to support comprehensive benchmarking and to facilitate progress beyond scene-specific reconstruction. For distractor-free radiance fields, however, no large-scale dataset yet provides both clean and cluttered images for each scene, which limits progress in this area. To address this gap, we introduce DF3DV-1K, a large-scale real-world dataset comprising 1,048 scenes, each providing clean and cluttered image sets for benchmarking. In total, the dataset contains 89,924 images captured with consumer cameras to mimic casual capture, spanning 128 distractor types and 161 scene themes across indoor and outdoor environments. A curated subset of 41 scenes, DF3DV-41, is systematically designed to evaluate the robustness of distractor-free radiance field methods under challenging scenarios. Using DF3DV-1K, we benchmark nine recent distractor-free radiance field methods and 3D Gaussian Splatting, identifying the most robust methods and the most challenging scenarios. Beyond benchmarking, we demonstrate an application of DF3DV-1K: fine-tuning a diffusion-based 2D enhancer to improve radiance field methods, which yields average improvements of 0.96 dB PSNR and 0.057 LPIPS on the held-out set (e.g., DF3DV-41) and the On-the-go dataset. We hope DF3DV-1K will facilitate the development of distractor-free vision and promote progress beyond scene-specific approaches. The dataset and leaderboard are available at https://johnnylu305.github.io/df3dv1k_web/.