MoKus: Leveraging Cross-Modal Knowledge Transfer for Knowledge-Aware Concept Customization
March 13, 2026
Authors: Chenyang Zhu, Hongxiang Li, Xiu Li, Long Chen
cs.AI
Abstract
Concept customization methods typically bind rare tokens to a target concept. However, these approaches often perform unstably because pretraining data seldom contains these rare tokens, and the tokens themselves fail to convey the inherent knowledge of the target concept. We therefore introduce Knowledge-aware Concept Customization, a novel task that aims to bind diverse textual knowledge to target visual concepts. The task requires the model to identify the knowledge within a text prompt in order to perform high-fidelity customized generation, while efficiently binding all of the textual knowledge to the target concept. To address it, we propose MoKus, a novel framework for knowledge-aware concept customization. Our framework relies on a key observation, cross-modal knowledge transfer: modifying knowledge within the text modality naturally transfers to the visual modality during generation. Inspired by this observation, MoKus contains two stages: (1) in visual concept learning, we first learn an anchor representation that stores the visual information of the target concept; (2) in textual knowledge updating, we update the answers to the knowledge queries to the anchor representation, enabling high-fidelity customized generation. To comprehensively evaluate MoKus on the new task, we introduce KnowCusBench, the first benchmark for knowledge-aware concept customization. Extensive evaluations demonstrate that MoKus outperforms state-of-the-art methods. Moreover, cross-modal knowledge transfer allows MoKus to be easily extended to other knowledge-aware applications such as virtual concept creation and concept erasure. We also demonstrate that our method achieves improvements on world knowledge benchmarks.
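
As a rough illustration of the two-stage idea in the abstract, the toy Python sketch below may help. Everything in it is an assumption made for illustration and is not taken from the paper: a small linear map `W` stands in for the text-side model, `learn_anchor` stands in for visual concept learning by simply averaging a few feature vectors, and `bind_knowledge` stands in for textual knowledge updating by applying a rank-one edit so that a knowledge-query key maps to the anchor. The abstract does not specify how MoKus actually learns the anchor or performs the update.

```python
# Toy illustration only; NOT the authors' implementation.
# Hypothetical stand-ins: a linear map W plays the role of the text-side
# model, and the "anchor" is the mean of a few concept feature vectors.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def learn_anchor(concept_features):
    """Stage 1 stand-in: average the concept's feature vectors."""
    n = len(concept_features)
    return [sum(column) / n for column in zip(*concept_features)]

def bind_knowledge(W, key, anchor):
    """Stage 2 stand-in: rank-one edit so the edited map sends key to anchor."""
    residual = [a - o for a, o in zip(anchor, matvec(W, key))]
    norm_sq = sum(k * k for k in key)
    return [[w + r * k / norm_sq for w, k in zip(row, key)]
            for row, r in zip(W, residual)]

# A 2x3 map: three input directions, two output dimensions.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

# Stage 1: derive an anchor from two made-up feature vectors.
anchor = learn_anchor([[2.0, 1.0], [4.0, 3.0]])

# Stage 2: bind two knowledge-query keys (orthogonal by construction)
# to the same anchor, one after the other.
key_a = [0.0, 1.0, 0.0]
key_b = [0.0, 0.0, 1.0]
unrelated = [1.0, 0.0, 0.0]

edited = W
for key in (key_a, key_b):
    edited = bind_knowledge(edited, key, anchor)

print(matvec(edited, key_a))      # both keys now retrieve the anchor
print(matvec(edited, key_b))
print(matvec(edited, unrelated))  # the orthogonal direction is untouched
```

The keys are chosen to be orthogonal so that each rank-one edit leaves the other key's output, and the unrelated direction, unchanged. With non-orthogonal keys, sequential edits of this simple form would interfere with one another, which is one reason a real system would need a more careful update rule.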