Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor
June 9, 2025
Authors: Rishit Dagli, Yushi Guan, Sankeerth Durvasula, Mohammadreza Mofayezi, Nandita Vijaykumar
cs.AI
Abstract
We propose Squeeze3D, a novel framework that leverages implicit prior
knowledge learnt by existing pre-trained 3D generative models to compress 3D
data at extremely high compression ratios. Our approach bridges the latent
spaces between a pre-trained encoder and a pre-trained generation model through
trainable mapping networks. Any 3D model represented as a mesh, point cloud, or
a radiance field is first encoded by the pre-trained encoder and then
transformed (i.e., compressed) into a highly compact latent code. This latent
code can effectively be used as an extremely compressed representation of the
mesh or point cloud. A mapping network transforms the compressed latent code
into the latent space of a powerful generative model, which is then conditioned
to recreate the original 3D model (i.e., decompression). Squeeze3D is trained
entirely on generated synthetic data and does not require any 3D datasets. The
Squeeze3D architecture can be used flexibly with existing pre-trained 3D
encoders and generative models, and it supports multiple formats, including
meshes, point clouds, and radiance fields. Our experiments
demonstrate that Squeeze3D achieves compression ratios of up to 2187x for
textured meshes, 55x for point clouds, and 619x for radiance fields while
maintaining visual quality comparable to many existing methods. Because it does
not train an object-specific network for each asset, Squeeze3D incurs only a
small compression and decompression latency.
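
The pipeline described in the abstract can be summarized in a few lines of
PyTorch. The sketch below is illustrative only, assuming frozen pre-trained
encoder and generator modules with compatible tensor interfaces; the names
MappingNetwork, compress, and decompress are hypothetical and not taken from
the paper's code.

import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    # Small trainable MLP that bridges two latent spaces (illustrative
    # architecture; the paper's actual mapping networks may differ).
    def __init__(self, in_dim, out_dim, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, z):
        return self.net(z)

def compress(encoder, squeeze, asset):
    # The frozen pre-trained encoder maps the 3D asset (mesh, point cloud,
    # or radiance field features) to its native latent; the trainable
    # mapping network then squeezes it into a compact code, which serves
    # as the stored compressed representation.
    with torch.no_grad():
        z_enc = encoder(asset)
    return squeeze(z_enc)

def decompress(expand, generator, code):
    # A second mapping network lifts the compact code into the latent
    # space of the frozen pre-trained generative model, which is
    # conditioned on it to recreate the original 3D model.
    z_gen = expand(code)
    with torch.no_grad():
        return generator(z_gen)

Because both the encoder and the generator stay frozen, only the small mapping
networks are trained, which is consistent with the abstract's claims that no
3D dataset and no per-object optimization are required.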