SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
April 6, 2026
Authors: Yicheng Xiao, Wenhu Zhang, Lin Song, Yukang Chen, Wenbo Li, Nan Jiang, Tianhe Ren, Haokun Lin, Wei Huang, Haoyang Huang, Xiu Li, Nan Duan, Xiaojuan Qi
cs.AI
Abstract
Image spatial editing performs geometry-driven transformations, enabling precise control over object layout and camera viewpoint. Current models struggle with fine-grained spatial manipulation, motivating a dedicated evaluation suite. Our contributions are as follows: (i) We introduce SpatialEdit-Bench, a comprehensive benchmark that evaluates spatial editing by jointly measuring perceptual plausibility and geometric fidelity via viewpoint reconstruction and framing analysis. (ii) To address the data bottleneck for scalable training, we construct SpatialEdit-500k, a synthetic dataset generated with a controllable Blender pipeline that renders objects across diverse backgrounds and systematic camera trajectories, providing precise ground-truth transformations for both object- and camera-centric operations. (iii) Building on this data, we develop SpatialEdit-16B, a baseline model for fine-grained spatial editing. Our method achieves competitive performance on general editing while substantially outperforming prior methods on spatial manipulation tasks. All resources will be made public at https://github.com/EasonXiao-888/SpatialEdit.
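The "systematic camera trajectories" with precise ground-truth transformations can be illustrated with a minimal sketch. The function below is hypothetical (not from the paper's released pipeline, which is not yet public): it samples camera positions on a circular orbit around an object at a fixed elevation, the simplest trajectory family such a Blender pipeline might render, and records each camera's position and look-at direction as the ground-truth pose.

```python
import math

def orbit_camera_poses(center, radius, elevation_deg, num_views):
    """Sample cameras on a circular orbit around `center` at a fixed
    elevation, returning (position, forward_vector) pairs that serve as
    ground-truth poses. Hypothetical illustration of systematic
    trajectory sampling, not the paper's actual implementation."""
    poses = []
    elev = math.radians(elevation_deg)
    for i in range(num_views):
        azim = 2.0 * math.pi * i / num_views
        # Spherical-to-Cartesian: the camera sits on a ring of the
        # given radius at the given elevation angle.
        x = center[0] + radius * math.cos(elev) * math.cos(azim)
        y = center[1] + radius * math.cos(elev) * math.sin(azim)
        z = center[2] + radius * math.sin(elev)
        # The forward vector points from the camera toward the object
        # center; its length is exactly `radius`, so dividing by
        # `radius` normalizes it to a unit vector.
        fwd = tuple((c - p) / radius for c, p in zip(center, (x, y, z)))
        poses.append(((x, y, z), fwd))
    return poses

# Eight evenly spaced views around the origin at 30° elevation.
poses = orbit_camera_poses(center=(0.0, 0.0, 0.0), radius=2.0,
                           elevation_deg=30.0, num_views=8)
```

Because every pose is generated analytically, the relative transformation between any two rendered views is known exactly, which is what makes such synthetic data usable as supervision for camera-centric editing.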