
PoseDreamer: Scalable and Photorealistic Human Data Generation Pipeline with Diffusion Models

March 30, 2026
Authors: Lorenza Prospero, Orest Kupyn, Ostap Viniavskyi, João F. Henriques, Christian Rupprecht
cs.AI

Abstract

Acquiring labeled datasets for 3D human mesh estimation is challenging due to depth ambiguities and the inherent difficulty of annotating 3D geometry from monocular images. Existing datasets are either real, with manually annotated 3D geometry and limited scale, or synthetic, rendered from 3D engines that provide precise labels but suffer from limited photorealism, low diversity, and high production costs. In this work, we explore a third path: generated data. We introduce PoseDreamer, a novel pipeline that leverages diffusion models to generate large-scale synthetic datasets with 3D mesh annotations. Our approach combines controllable image generation with Direct Preference Optimization for control alignment, curriculum-based hard sample mining, and multi-stage quality filtering. Together, these components naturally maintain correspondence between 3D labels and generated images, while prioritizing challenging samples to maximize dataset utility. Using PoseDreamer, we generate more than 500,000 high-quality synthetic samples, achieving a 76% improvement in image-quality metrics compared to rendering-based datasets. Models trained on PoseDreamer achieve performance comparable to or superior to those trained on real-world and traditional synthetic datasets. In addition, combining PoseDreamer with synthetic datasets results in better performance than combining real-world and synthetic datasets, demonstrating the complementary nature of our dataset. We will release the full dataset and generation code.
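
The abstract names Direct Preference Optimization (DPO) for control alignment but does not say how it is adapted to the diffusion model or how preference pairs are built. As a rough illustration only, the generic pairwise DPO objective can be sketched as below. The function name, its arguments, and the default `beta` are assumptions made for this example and are not taken from PoseDreamer.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss for one preference pair (illustrative sketch).

    logp_w / logp_l: log-likelihoods of the preferred (w) and rejected (l)
    samples under the model being trained.
    ref_logp_w / ref_logp_l: the same quantities under a frozen reference model.
    beta: strength of the implicit constraint toward the reference (assumed value).
    """
    # How much more the trained model favors w over l than the reference does.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)), computed as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

When the trained model and the reference agree, the margin is zero and the loss equals log 2. The loss falls as the trained model shifts probability toward the preferred sample relative to the reference.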