

Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization

December 11, 2025
作者: Tsai-Shien Chen, Aliaksandr Siarohin, Guocheng Gordon Qian, Kuan-Chieh Jackson Wang, Egor Nemchinov, Moayed Haji-Ali, Riza Alp Guler, Willi Menapace, Ivan Skorokhodov, Anil Kag, Jun-Yan Zhu, Sergey Tulyakov
cs.AI

Abstract

Visual concept personalization aims to transfer only specific image attributes, such as identity, expression, lighting, and style, into unseen contexts. However, existing methods rely on holistic embeddings from general-purpose image encoders, which entangle multiple visual factors and make it difficult to isolate a single attribute. This often leads to information leakage and incoherent synthesis. To address this limitation, we introduce Omni-Attribute, the first open-vocabulary image attribute encoder designed to learn high-fidelity, attribute-specific representations. Our approach jointly designs the data and model: (i) we curate semantically linked image pairs annotated with positive and negative attributes to explicitly teach the encoder what to preserve or suppress; and (ii) we adopt a dual-objective training paradigm that balances generative fidelity with contrastive disentanglement. The resulting embeddings prove effective for open-vocabulary attribute retrieval, personalization, and compositional generation, achieving state-of-the-art performance across multiple benchmarks.
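To make the dual-objective paradigm concrete, the sketch below shows one plausible way to combine a generative reconstruction term with an InfoNCE-style contrastive term over positive and negative attribute annotations. This is a minimal illustration under stated assumptions, not the paper's implementation: the module interfaces (`encoder`, `generator`), the MSE reconstruction loss, the contrastive formulation, and the weight `lambda_contrast` are all hypothetical.

```python
import torch
import torch.nn.functional as F

def dual_objective_loss(encoder, generator, img_a, img_b, attr_prompt,
                        neg_prompts, lambda_contrast=0.1, tau=0.07):
    """Illustrative dual-objective loss: generative fidelity plus
    contrastive disentanglement. All interfaces here are assumed,
    not the Omni-Attribute API."""
    # Attribute-specific embedding of the source image for the queried attribute.
    z = encoder(img_a, attr_prompt)                        # (B, D)

    # Generative term: reconstruct the semantically linked image that
    # shares the positive attribute, conditioning on the embedding.
    recon = generator(z)                                   # (B, C, H, W)
    loss_gen = F.mse_loss(recon, img_b)

    # Contrastive term: pull the embedding toward the same attribute
    # extracted from the paired image (positive) and push it away from
    # embeddings of the negative attributes the encoder must suppress.
    z_pos = encoder(img_b, attr_prompt)                    # (B, D)
    z_negs = torch.stack(
        [encoder(img_a, p) for p in neg_prompts], dim=1)   # (B, K, D)

    z = F.normalize(z, dim=-1)
    z_pos = F.normalize(z_pos, dim=-1)
    z_negs = F.normalize(z_negs, dim=-1)

    pos_sim = (z * z_pos).sum(-1, keepdim=True) / tau      # (B, 1)
    neg_sim = torch.einsum('bd,bkd->bk', z, z_negs) / tau  # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1)          # (B, 1+K)
    labels = torch.zeros(logits.size(0), dtype=torch.long,
                         device=logits.device)             # positive is index 0
    loss_contrast = F.cross_entropy(logits, labels)

    return loss_gen + lambda_contrast * loss_contrast
```

In this reading, the curated image pairs supply both signals at once: the positive annotation defines what the reconstruction must preserve, while the negative annotations name the factors whose embeddings the contrastive term drives apart.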