Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

August 20, 2025
Authors: Long Le, Ryan Lucas, Chen Wang, Chuhao Chen, Dinesh Jayaraman, Eric Eaton, Lingjie Liu
cs.AI

Abstract

Inferring the physical properties of 3D scenes from visual information is a critical yet challenging task for creating interactive and realistic virtual worlds. While humans intuitively grasp material characteristics such as elasticity or stiffness, existing methods often rely on slow, per-scene optimization, which limits their generalizability and range of application. To address this problem, we introduce PIXIE, a novel method that trains a generalizable neural network to predict physical properties across multiple scenes from 3D visual features, using purely supervised losses. Once trained, our feed-forward network can quickly infer plausible material fields, which, coupled with a learned static scene representation such as Gaussian Splatting, enable realistic physics simulation under external forces. To facilitate this research, we also collected PIXIEVERSE, one of the largest known datasets of paired 3D assets and physical material annotations. Extensive evaluations demonstrate that PIXIE is about 1.46-4.39x better and orders of magnitude faster than test-time optimization methods. By leveraging pretrained visual features such as CLIP, our method can also generalize zero-shot to real-world scenes despite having been trained only on synthetic data. https://pixie-3d.github.io/
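The core idea in the abstract, fitting one predictor on labeled scenes with a supervised loss and then applying it feed-forward to an unseen scene, can be illustrated with a toy sketch. This is not the authors' implementation: a linear model stands in for the neural network, random vectors stand in for the 3D visual features, and a single scalar (a hypothetical log-stiffness) stands in for the material field. Every name below (`make_scene`, `train`, `TRUE_W`, and so on) is an assumption made for illustration.

```python
import random

def make_scene(rng, n_points, true_w, true_b, dim):
    """Build a synthetic scene: per-point feature vectors plus a material label.

    The label is a made-up log-stiffness that depends linearly on the features.
    """
    feats = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_points)]
    labels = [sum(w * x for w, x in zip(true_w, f)) + true_b for f in feats]
    return feats, labels

def predict(weights, bias, feat):
    """Feed-forward prediction for one point (a linear stand-in for a network)."""
    return sum(w * x for w, x in zip(weights, feat)) + bias

def mse(weights, bias, feats, labels):
    """Supervised loss: mean squared error against the material labels."""
    errors = [(predict(weights, bias, f) - y) ** 2 for f, y in zip(feats, labels)]
    return sum(errors) / len(errors)

def train(scenes, dim, lr=0.1, epochs=200):
    """Fit one shared model across all training scenes by gradient descent."""
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for feats, labels in scenes:
            n = len(labels)
            grad_w, grad_b = [0.0] * dim, 0.0
            for f, y in zip(feats, labels):
                err = predict(weights, bias, f) - y
                for i, x in enumerate(f):
                    grad_w[i] += 2.0 * err * x / n
                grad_b += 2.0 * err / n
            weights = [w - lr * g for w, g in zip(weights, grad_w)]
            bias -= lr * grad_b
    return weights, bias

rng = random.Random(0)
DIM = 4
TRUE_W, TRUE_B = [0.5, -1.0, 0.25, 2.0], 3.0  # hypothetical ground-truth mapping

# Train once on several scenes, then evaluate on a scene never seen in training.
train_scenes = [make_scene(rng, 50, TRUE_W, TRUE_B, DIM) for _ in range(5)]
test_feats, test_labels = make_scene(rng, 50, TRUE_W, TRUE_B, DIM)

weights, bias = train(train_scenes, DIM)
test_error = mse(weights, bias, test_feats, test_labels)
print("held-out scene MSE:", test_error)
```

The contrast with per-scene optimization is that `train` runs once, and the held-out scene needs only calls to `predict`, with no further fitting.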