InfiniHuman: Infinite 3D Human Creation with Precise Control
October 13, 2025
Authors: Yuxuan Xue, Xianghui Xie, Margaret Kostyrko, Gerard Pons-Moll
cs.AI
Abstract
Generating realistic and controllable 3D human avatars is a long-standing
challenge, particularly when covering broad attribute ranges such as ethnicity,
age, clothing styles, and detailed body shapes. Capturing and annotating
large-scale human datasets for training generative models is prohibitively
expensive and limited in scale and diversity. The central question we address
in this paper is: Can existing foundation models be distilled to generate
theoretically unbounded, richly annotated 3D human data? We introduce
InfiniHuman, a framework that synergistically distills these models to produce
richly annotated human data at minimal cost and with theoretically unlimited
scalability. We propose InfiniHumanData, a fully automatic pipeline that
leverages vision-language and image generation models to create a large-scale
multi-modal dataset. A user study shows that our automatically generated identities
are indistinguishable from scan renderings. InfiniHumanData contains 111K
identities spanning unprecedented diversity. Each identity is annotated with
multi-granularity text descriptions, multi-view RGB images, detailed clothing
images, and SMPL body-shape parameters. Building on this dataset, we propose
InfiniHumanGen, a diffusion-based generative pipeline conditioned on text, body
shape, and clothing assets. InfiniHumanGen enables fast, realistic, and
precisely controllable avatar generation. Extensive experiments demonstrate
significant improvements over state-of-the-art methods in visual quality,
generation speed, and controllability. Our approach enables high-quality avatar
generation with fine-grained control at effectively unbounded scale through a
practical and affordable solution. We will publicly release the automatic data
generation pipeline, the comprehensive InfiniHumanData dataset, and the
InfiniHumanGen models at https://yuxuan-xue.com/infini-human.
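
As a rough illustration of the per-identity annotations listed above (multi-granularity text, multi-view RGB images, clothing images, and SMPL body-shape parameters), the sketch below shows one way such a record might be organized. The class name, field names, caption keys, and file paths are all hypothetical and are not taken from the released dataset; the 10-coefficient shape vector is an assumption based on the common SMPL convention.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: the widely used 10-coefficient SMPL shape vector (betas).
NUM_SMPL_BETAS = 10


@dataclass
class IdentityRecord:
    """Hypothetical container for one identity; all field names are illustrative."""

    identity_id: str
    captions: Dict[str, str]          # granularity level -> text description
    view_image_paths: List[str]       # multi-view RGB images
    clothing_image_paths: List[str]   # detailed clothing images
    smpl_betas: List[float] = field(
        default_factory=lambda: [0.0] * NUM_SMPL_BETAS
    )

    def validate(self) -> bool:
        """Check that the core annotations are present and well-shaped."""
        return (
            len(self.captions) > 0
            and len(self.view_image_paths) > 0
            and len(self.smpl_betas) == NUM_SMPL_BETAS
        )


# Placeholder values only; no real dataset entries are reproduced here.
record = IdentityRecord(
    identity_id="example_000001",
    captions={
        "short": "an adult in a raincoat",
        "detailed": "an adult wearing a yellow hooded raincoat and dark boots",
    },
    view_image_paths=["views/front.png", "views/back.png"],
    clothing_image_paths=["garments/raincoat.png"],
)
```

Keeping the shape parameters as a fixed-length vector makes it straightforward to validate a record before passing it to a generator conditioned on text, body shape, and clothing.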