MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling

March 18, 2025
Authors: Damian Boborzi, Phillip Mueller, Jonas Emrich, Dominik Schmid, Sebastian Mueller, Lars Mikelsons
cs.AI

Abstract

Generative models have recently made remarkable progress on 3D objects. However, their practical use in fields such as engineering remains limited because they do not deliver the accuracy, quality, and controllability that domain-specific tasks require. Fine-tuning large generative models is a promising way to make them usable in these fields. Creating high-quality, domain-specific 3D datasets is crucial for this fine-tuning, yet data filtering and annotation remain a significant bottleneck. We present MeshFleet, a filtered and annotated 3D vehicle dataset extracted from Objaverse-XL, the most extensive publicly available collection of 3D objects. We propose a pipeline for automated data filtering based on a quality classifier. The classifier is trained on a manually labeled subset of Objaverse, incorporates DINOv2 and SigLIP embeddings, and is refined through caption-based analysis and uncertainty estimation. We demonstrate the efficacy of our filtering method through a comparative analysis against caption-based and image-aesthetic-score-based techniques and through fine-tuning experiments with SV3D, highlighting the importance of targeted data selection for domain-specific 3D generative modeling.
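
The filtering idea in the abstract (a quality classifier over combined DINOv2 and SigLIP embeddings, with uncertainty estimation) can be illustrated with a small sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not state the classifier architecture, how the embeddings are combined, or how uncertainty is estimated. The sketch concatenates the two embeddings, uses a bootstrap ensemble of logistic-regression models, and treats the spread of the ensemble's predictions as the uncertainty. The function names, thresholds, and the random vectors standing in for real embeddings are all hypothetical.

```python
import math
import random


def concat_embeddings(dino_vec, siglip_vec):
    """Join two per-object embedding vectors into one feature vector (assumed fusion)."""
    return list(dino_vec) + list(siglip_vec)


def _sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def _predict(model, x):
    weights, bias = model
    return _sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logreg(features, labels, epochs=100, lr=0.5):
    """Fit logistic regression with plain batch gradient descent."""
    dim, n = len(features[0]), len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = _predict((weights, bias), x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def train_ensemble(features, labels, n_models=5, seed=0):
    """Train each ensemble member on a bootstrap resample of the labeled set."""
    rng = random.Random(seed)
    n = len(features)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_logreg([features[i] for i in idx],
                                   [labels[i] for i in idx]))
    return models


def predict_with_uncertainty(models, x):
    """Return (mean quality probability, standard deviation across the ensemble)."""
    probs = [_predict(m, x) for m in models]
    mean = sum(probs) / len(probs)
    std = math.sqrt(sum((p - mean) ** 2 for p in probs) / len(probs))
    return mean, std


def keep_object(models, x, min_quality=0.5, max_std=0.2):
    """Keep an object only if it is predicted high quality with low disagreement."""
    mean, std = predict_with_uncertainty(models, x)
    return mean >= min_quality and std <= max_std


def toy_dataset(n_per_class=40, dim=4, seed=1):
    """Synthetic stand-in for labeled embeddings; NOT real DINOv2/SigLIP outputs."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, center in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            dino = [rng.gauss(center, 0.3) for _ in range(dim)]
            siglip = [rng.gauss(center, 0.3) for _ in range(dim)]
            features.append(concat_embeddings(dino, siglip))
            labels.append(label)
    return features, labels
```

In a real setting the toy vectors would be replaced by embeddings of rendered views of each 3D object, and objects with high ensemble disagreement could be routed to manual review instead of being discarded; both choices are design assumptions of this sketch.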

