Muses: Designing, Composing, and Generating Nonexistent Fantasy 3D Creatures without Training

Abstract

We present Muses, the first training-free method for generating fantastic 3D creatures in a feed-forward paradigm. Previous methods rely on part-aware optimization, manual assembly, or 2D image generation, and they often produce unrealistic or incoherent 3D assets because fine-grained part-level manipulation is difficult and out-of-domain generation is limited. In contrast, Muses uses the 3D skeleton, a fundamental representation of biological forms, to compose diverse elements explicitly and rationally. This skeletal foundation formalizes 3D content creation as a structure-aware pipeline of design, composition, and generation. Muses first constructs a creatively composed 3D skeleton with coherent layout and scale through graph-constrained reasoning. The skeleton then guides a voxel-based assembly process within a structured latent space, integrating regions from different objects. Finally, image-guided appearance modeling under skeletal conditions generates a style-consistent and harmonious texture for the assembled shape. Extensive experiments show that Muses achieves state-of-the-art performance in visual fidelity and alignment with textual descriptions, and that it has potential for flexible 3D object editing. Project page: https://luhexiao.github.io/Muses.github.io/.
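
To make the idea of skeleton-guided, voxel-based assembly more concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes raw boolean occupancy grids instead of the structured latent space mentioned in the abstract, and it assumes a simple nearest-bone rule for deciding which source object contributes each voxel. All names (`assemble_by_skeleton`, `horse`, `eagle`, the bone coordinates) are hypothetical and chosen only for illustration.

```python
import numpy as np

def point_segment_distance(points, a, b):
    """Distance from each point in an (N, 3) array to the segment a-b."""
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:
        # Degenerate bone: treat it as a single joint.
        return np.linalg.norm(points - a, axis=1)
    # Project each point onto the segment and clamp to its endpoints.
    t = np.clip((points - a) @ ab / denom, 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(points - closest, axis=1)

def assemble_by_skeleton(grids, bones):
    """Toy assembly (assumed rule): each voxel is copied from the source
    object that owns the nearest bone of the composed skeleton.

    grids: dict mapping source name -> boolean occupancy grid (same shape).
    bones: list of (source_name, start_xyz, end_xyz) in voxel coordinates.
    Returns the assembled grid and the per-voxel index of the owning bone.
    """
    shape = next(iter(grids.values())).shape
    axes = [np.arange(s) for s in shape]
    coords = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    coords = coords.reshape(-1, 3).astype(float)

    dists = np.stack(
        [point_segment_distance(coords, np.asarray(a, float), np.asarray(b, float))
         for _, a, b in bones],
        axis=1,
    )
    owner = dists.argmin(axis=1).reshape(shape)

    assembled = np.zeros(shape, dtype=bool)
    for i, (source, _, _) in enumerate(bones):
        assembled |= (owner == i) & grids[source]
    return assembled, owner

# Two hypothetical source shapes on an 8 x 4 x 4 grid.
horse = np.zeros((8, 4, 4), dtype=bool)
horse[:, 1:3, 1:3] = True   # thin bar along x
eagle = np.zeros((8, 4, 4), dtype=bool)
eagle[:, :, 1:3] = True     # wider slab along x

# Composed skeleton: front half from the horse, back half from the eagle.
bones = [
    ("horse", (0, 1.5, 1.5), (3, 1.5, 1.5)),
    ("eagle", (4, 1.5, 1.5), (7, 1.5, 1.5)),
]
combined, owner = assemble_by_skeleton({"horse": horse, "eagle": eagle}, bones)
```

In this sketch the skeleton acts only as a spatial partition: voxels nearest a horse bone come from the horse grid, and voxels nearest an eagle bone come from the eagle grid. The method described in the abstract additionally reasons about layout and scale and works in a latent space, which this toy version does not attempt to reproduce.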