3D Arena: An Open Platform for Generative 3D Evaluation
June 23, 2025
Author: Dylan Ebert
cs.AI
Abstract
Evaluating Generative 3D models remains challenging due to misalignment
between automated metrics and human perception of quality. Current benchmarks
rely on image-based metrics that ignore 3D structure or geometric measures that
fail to capture perceptual appeal and real-world utility. To address this gap,
we present 3D Arena, an open platform for evaluating image-to-3D generation
models through large-scale human preference collection using pairwise
comparisons.
Since launching in June 2024, the platform has collected 123,243 votes from
8,096 users across 19 state-of-the-art models, establishing the largest human
preference evaluation for Generative 3D. We contribute the iso3d dataset of 100
evaluation prompts and demonstrate quality control achieving 99.75% user
authenticity through statistical fraud detection. Our Elo-based ranking system
provides reliable model assessment, and the platform has become an established
evaluation resource.
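To illustrate how an Elo-style ranking can be derived from pairwise votes, here is a minimal sketch. It assumes the standard online Elo update with a 400-point logistic scale; the starting rating of 1000, the K-factor of 32, and the names `expected_score`, `update_elo`, `rank_models`, and `model_a`/`model_b`/`model_c` are all hypothetical and are not taken from the paper, whose exact rating procedure may differ.

```python
from collections import defaultdict


def expected_score(r_a, r_b):
    """Probability that a model rated r_a beats one rated r_b (standard Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0):
    """Apply one pairwise vote in place; the update is zero-sum."""
    e_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - e_win)  # larger gain when the win was unexpected
    ratings[winner] += delta
    ratings[loser] -= delta


def rank_models(votes, initial=1000.0, k=32.0):
    """votes: iterable of (winner, loser) pairs. Returns (model, rating) sorted best-first."""
    ratings = defaultdict(lambda: initial)  # assumed starting rating
    for winner, loser in votes:
        update_elo(ratings, winner, loser, k)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical votes, for illustration only.
example_votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
]
leaderboard = rank_models(example_votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Because online Elo depends on the order in which votes arrive, a production leaderboard might shuffle and average over several passes or fit a Bradley-Terry model instead; this sketch only shows the basic mechanism.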
Through analysis of this preference data, we present insights into human
preference patterns. Our findings reveal preferences for visual presentation
features, with Gaussian splat outputs achieving a 16.6 Elo advantage over
meshes and textured models receiving a 144.1 Elo advantage over untextured
models. We provide recommendations for improving evaluation methods, including
multi-criteria assessment, task-oriented evaluation, and format-aware
comparison. The platform's community engagement establishes 3D Arena as a
benchmark for the field while advancing understanding of human-centered
evaluation in Generative 3D.
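To give a rough sense of scale for the reported Elo gaps, the sketch below converts a rating difference into an expected head-to-head win rate. It assumes the conventional logistic Elo curve with a 400-point scale, which the abstract does not state explicitly; `win_probability` is a hypothetical helper name.

```python
def win_probability(elo_gap):
    """Expected win rate of the higher-rated side, given its Elo advantage."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400.0))


# Elo gaps reported in the abstract.
reported_gaps = [
    ("Gaussian splat vs. mesh", 16.6),
    ("textured vs. untextured", 144.1),
]
for label, gap in reported_gaps:
    print(f"{label}: {win_probability(gap):.1%}")
```

Under this assumption, the 16.6-point gap corresponds to winning roughly 52% of comparisons, while the 144.1-point gap corresponds to roughly 70%, so the texture effect is considerably stronger than the format effect.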