GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces

December 3, 2025
Authors: Melis Ocal, Xiaoyan Xing, Yue Li, Ngo Anh Vien, Sezer Karaoglu, Theo Gevers
cs.AI

Abstract

3D stylization is central to game development, virtual reality, and digital arts, where the demand for diverse assets calls for scalable methods that support fast, high-fidelity manipulation. Existing text-driven 3D stylization methods typically distill from 2D image editors. They require time-intensive per-asset optimization and exhibit multi-view inconsistency due to the limitations of current text-to-image models, which makes them impractical for large-scale production. In this paper, we introduce GaussianBlender, a pioneering feed-forward framework for text-driven 3D stylization that performs edits instantly at inference. Our method learns structured, disentangled latent spaces for geometry and appearance, with controlled information sharing between them, from spatially grouped 3D Gaussians. A latent diffusion model then applies text-conditioned edits to these learned representations. Comprehensive evaluations show that GaussianBlender not only delivers instant, high-fidelity, geometry-preserving, multi-view consistent stylization, but also surpasses methods that require per-instance test-time optimization, unlocking practical, democratized 3D stylization at scale.
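
To make the idea of spatially grouped Gaussians with separate geometry and appearance attributes more concrete, here is a toy sketch. It is not the authors' implementation: the abstract does not say how Gaussians are grouped or which attributes belong to which latent space. The uniform voxel grid, the `Gaussian` dataclass, the helper names, and the choice to treat position, scale, rotation, and opacity as geometry and color as appearance are all assumptions made for illustration only.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Gaussian:
    # Hypothetical per-Gaussian attributes; the paper's exact layout is not given.
    position: Vec3
    scale: Vec3
    rotation: Tuple[float, float, float, float]  # quaternion (w, x, y, z)
    opacity: float
    color: Vec3  # RGB only, for simplicity


def voxel_key(position: Vec3, voxel_size: float) -> Tuple[int, int, int]:
    """Map a position to the integer index of the voxel containing it."""
    return tuple(math.floor(c / voxel_size) for c in position)


def group_gaussians(
    gaussians: List[Gaussian], voxel_size: float
) -> Dict[Tuple[int, int, int], List[Gaussian]]:
    """Assumed grouping scheme: bucket Gaussians by a uniform voxel grid."""
    groups: Dict[Tuple[int, int, int], List[Gaussian]] = defaultdict(list)
    for g in gaussians:
        groups[voxel_key(g.position, voxel_size)].append(g)
    return dict(groups)


def split_attributes(group: List[Gaussian]):
    """Split one group into geometry and appearance feature vectors.

    Assumption: geometry = position + scale + rotation + opacity (11 values),
    appearance = color (3 values). Separate encoders could then consume each.
    """
    geometry = [g.position + g.scale + g.rotation + (g.opacity,) for g in group]
    appearance = [g.color for g in group]
    return geometry, appearance


# Three toy Gaussians; the first two fall into the same voxel.
gaussians = [
    Gaussian((0.1, 0.2, 0.3), (0.05, 0.05, 0.05), (1.0, 0.0, 0.0, 0.0), 0.9, (1.0, 0.0, 0.0)),
    Gaussian((0.4, 0.4, 0.4), (0.05, 0.05, 0.05), (1.0, 0.0, 0.0, 0.0), 0.8, (0.0, 1.0, 0.0)),
    Gaussian((1.2, 0.1, 0.1), (0.05, 0.05, 0.05), (1.0, 0.0, 0.0, 0.0), 0.7, (0.0, 0.0, 1.0)),
]
groups = group_gaussians(gaussians, voxel_size=0.5)
geometry, appearance = split_attributes(groups[(0, 0, 0)])
```

In a setup like this, a stylization edit that should preserve shape would modify only the appearance features while leaving the geometry features untouched. How the two latent spaces actually share information in GaussianBlender is described in the paper itself, not here.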