MoKus: Leveraging Cross-Modal Knowledge Transfer for Knowledge-Aware Concept Customization
March 13, 2026
Authors: Chenyang Zhu, Hongxiang Li, Xiu Li, Long Chen
cs.AI
Abstract
Concept customization methods typically bind a rare token to a target concept. These approaches often perform unstably because the pretraining data seldom contains such rare tokens, and the tokens themselves cannot convey the inherent knowledge of the target concept. We therefore introduce Knowledge-aware Concept Customization, a novel task that aims to bind diverse textual knowledge to target visual concepts. The task requires the model to identify the knowledge within a text prompt in order to perform high-fidelity customized generation, and to bind all of the textual knowledge to the target concept efficiently. To address it, we propose MoKus, a novel framework for knowledge-aware concept customization. The framework relies on a key observation, cross-modal knowledge transfer: modifying knowledge within the text modality naturally transfers to the visual modality during generation. Inspired by this observation, MoKus contains two stages: (1) in visual concept learning, we learn an anchor representation that stores the visual information of the target concept; (2) in textual knowledge updating, we update the answers to the knowledge queries to the anchor representation, enabling high-fidelity customized generation. To evaluate MoKus comprehensively on the new task, we introduce KnowCusBench, the first benchmark for knowledge-aware concept customization. Extensive evaluations demonstrate that MoKus outperforms state-of-the-art methods. Moreover, cross-modal knowledge transfer allows MoKus to be easily extended to other knowledge-aware applications such as virtual concept creation and concept erasure. We also demonstrate that our method achieves improvements on world knowledge benchmarks.
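The two-stage idea can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the architecture or the update rule, so the example assumes a single linear associative memory `W` standing in for the text-side knowledge store, a mean of feature vectors standing in for visual concept learning, and a closed-form least-squares edit standing in for textual knowledge updating. The names `learn_anchor` and `bind_knowledge`, the dimensions, and the random data are all hypothetical.

```python
import numpy as np


def learn_anchor(concept_features: np.ndarray) -> np.ndarray:
    """Stage 1 stand-in (assumption): summarize the concept's feature
    vectors, shape (num_images, d_out), as one anchor vector of shape (d_out,)."""
    return concept_features.mean(axis=0)


def bind_knowledge(W: np.ndarray, query_keys: np.ndarray, anchor: np.ndarray) -> np.ndarray:
    """Stage 2 stand-in (assumption): minimum-norm edit of W so that every
    knowledge-query key maps to the anchor.

    W: (d_out, d_in), query_keys: (n, d_in), anchor: (d_out,).
    The edit is exact when the keys are linearly independent.
    """
    K = query_keys.T                                      # (d_in, n)
    targets = np.tile(anchor[:, None], (1, K.shape[1]))   # (d_out, n)
    residual = targets - W @ K                            # what each key still misses
    return W + residual @ np.linalg.pinv(K)               # (d_out, d_in)


rng = np.random.default_rng(0)
d_in, d_out = 16, 8
W = rng.normal(size=(d_out, d_in))               # toy "pretrained" memory
concept_features = rng.normal(size=(5, d_out))   # toy features of the target concept
query_keys = rng.normal(size=(3, d_in))          # toy keys for three knowledge queries

anchor = learn_anchor(concept_features)
W_new = bind_knowledge(W, query_keys, anchor)

# After the edit, each knowledge query retrieves the anchor representation.
print(np.allclose(W_new @ query_keys.T, anchor[:, None]))
```

In this toy setting several distinct pieces of textual knowledge (the query keys) all resolve to one stored representation of the concept, while inputs orthogonal to those keys are left unchanged by the edit. How MoKus realizes this inside an actual generative model is described in the paper, not here.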