Alterbute: Editing Intrinsic Attributes of Objects in Images
January 15, 2026
Authors: Tal Reiss, Daniel Winter, Matan Cohen, Alex Rav-Acha, Yael Pritch, Ariel Shamir, Yedid Hoshen
cs.AI
Abstract
We introduce Alterbute, a diffusion-based method for editing an object's intrinsic attributes in an image. We allow changing the color, texture, material, and even the shape of an object, while preserving its perceived identity and scene context. Existing approaches either rely on unsupervised priors that often fail to preserve identity or use overly restrictive supervision that prevents meaningful intrinsic variations. Our method relies on: (i) a relaxed training objective that allows the model to change both intrinsic and extrinsic attributes, conditioned on an identity reference image, a textual prompt describing the target intrinsic attributes, and a background image and object mask defining the extrinsic context. At inference, we restrict extrinsic changes by reusing the original background and object mask, thereby ensuring that only the desired intrinsic attributes are altered; (ii) Visual Named Entities (VNEs): fine-grained visual identity categories (e.g., "Porsche 911 Carrera") that group objects sharing identity-defining features while allowing variation in intrinsic attributes. We use a vision-language model to automatically extract VNE labels and intrinsic attribute descriptions from a large public image dataset, enabling scalable, identity-preserving supervision. Alterbute outperforms existing methods on identity-preserving object intrinsic attribute editing.
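The conditioning scheme above can be sketched as a minimal interface. This is an illustrative assumption, not the authors' actual API: the function name `alterbute_edit` and the `model.sample(...)` signature are hypothetical, standing in for a conditional diffusion model.

```python
# Hypothetical sketch of Alterbute's inference-time conditioning.
# All names here (alterbute_edit, model.sample) are illustrative
# assumptions; the paper does not specify an API.

def alterbute_edit(model, identity_ref, background, object_mask, prompt):
    """Edit intrinsic attributes while freezing extrinsic context.

    model        -- a conditional diffusion model (assumed interface)
    identity_ref -- image of the object whose identity must be preserved
    background   -- the ORIGINAL background image, reused at inference
    object_mask  -- the ORIGINAL object mask, reused at inference
    prompt       -- text describing target intrinsic attributes,
                    e.g. "matte red paint"
    """
    # During training the model may change both intrinsic and extrinsic
    # attributes; reusing the original background and mask at inference
    # pins down the extrinsics, so only intrinsic attributes can vary.
    return model.sample(
        identity=identity_ref,
        background=background,
        mask=object_mask,
        text=prompt,
    )
```

The key design point is that the restriction to intrinsic edits is not learned as a hard constraint; it emerges from which conditioning inputs are held fixed at inference time.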