
UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections

September 29, 2025
Authors: Zeyu Cai, Ziyang Li, Xiaoben Li, Boqian Li, Zeyu Wang, Zhenyu Zhang, Yuliang Xiu
cs.AI

Abstract

We present UP2You, the first tuning-free solution for reconstructing high-fidelity 3D clothed portraits from extremely unconstrained in-the-wild 2D photos. Unlike previous approaches that require "clean" inputs (e.g., full-body images with minimal occlusions, or well-calibrated cross-view captures), UP2You directly processes raw, unstructured photographs that may vary significantly in pose, viewpoint, cropping, and occlusion. Instead of compressing data into tokens for slow online text-to-3D optimization, we introduce a data rectifier paradigm that efficiently converts unconstrained inputs into clean, orthogonal multi-view images in a single forward pass within seconds, simplifying 3D reconstruction. Central to UP2You is a pose-correlated feature aggregation module (PCFA), which selectively fuses information from multiple reference images with respect to the target pose, enabling better identity preservation and a nearly constant memory footprint as the number of observations grows. We also introduce a perceiver-based multi-reference shape predictor that removes the need for pre-captured body templates. Extensive experiments on 4D-Dress, PuzzleIOI, and in-the-wild captures demonstrate that UP2You consistently surpasses previous methods in both geometric accuracy (15% lower Chamfer distance and 18% lower P2S on PuzzleIOI) and texture fidelity (21% higher PSNR and 46% lower LPIPS on 4D-Dress). UP2You is efficient (about 1.5 minutes per person) and versatile (supporting arbitrary pose control and training-free multi-garment 3D virtual try-on), making it practical for real-world scenarios where humans are casually captured. Both models and code will be released to facilitate future research on this underexplored task. Project Page: https://zcai0612.github.io/UP2You
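The abstract describes PCFA as fusing multi-reference features conditioned on the target pose while keeping memory roughly constant as more photos are added. A minimal sketch of how such a block could look is below, assuming a cross-attention formulation in PyTorch: target-pose tokens act as queries over concatenated reference tokens, so the fused output size depends only on the query length, not on the number of references. All class and variable names, dimensions, and the residual structure here are illustrative assumptions, not the authors' released code.

```python
# Illustrative sketch of a pose-correlated feature aggregation (PCFA)
# block as cross-attention. NOT the paper's implementation: names,
# dimensions, and structure are assumptions for exposition.
import torch
import torch.nn as nn


class PCFA(nn.Module):
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)

    def forward(self, pose_feats: torch.Tensor, ref_feats: torch.Tensor) -> torch.Tensor:
        # pose_feats: (B, Lq, dim)     tokens encoding the target pose/view
        # ref_feats:  (B, N * Lr, dim) tokens from all N reference photos,
        #                              concatenated along the sequence axis
        q = self.norm_q(pose_feats)
        kv = self.norm_kv(ref_feats)
        fused, _ = self.attn(q, kv, kv)  # (B, Lq, dim): size set by queries only
        return pose_feats + fused        # residual connection


if __name__ == "__main__":
    pcfa = PCFA()
    pose = torch.randn(1, 256, 512)      # target-pose tokens
    refs = torch.randn(1, 5 * 196, 512)  # e.g., 5 reference images
    out = pcfa(pose, refs)
    print(out.shape)                     # torch.Size([1, 256, 512]), independent of N
```

The output shape is fixed by the query sequence, which is one plausible way to read the "nearly constant memory footprint with more observations" claim.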
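The perceiver-based multi-reference shape predictor is likewise only named, not specified. A sketch under stated assumptions: a small set of learned latent queries cross-attends to tokens from all reference photos, and a head regresses body-shape parameters (here assumed to be SMPL-X-style betas), which is how such a module could avoid any pre-captured body template. The latent count, parameter dimensionality, and head design below are hypothetical.

```python
# Illustrative perceiver-style multi-reference shape predictor.
# NOT the paper's code: latent count, beta dimension, and head are assumptions.
import torch
import torch.nn as nn


class PerceiverShapePredictor(nn.Module):
    def __init__(self, dim: int = 512, n_latents: int = 16, n_betas: int = 10):
        super().__init__()
        # Learned latent queries, shared across inputs
        self.latents = nn.Parameter(torch.randn(n_latents, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, n_betas))

    def forward(self, ref_feats: torch.Tensor) -> torch.Tensor:
        # ref_feats: (B, N * Lr, dim) tokens from N reference images
        B = ref_feats.shape[0]
        q = self.latents.unsqueeze(0).expand(B, -1, -1)  # (B, n_latents, dim)
        z, _ = self.cross_attn(q, ref_feats, ref_feats)  # latents attend to all refs
        return self.head(z.mean(dim=1))                  # (B, n_betas) shape params
```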