Efficient Text-Guided Convolutional Adapter for the Diffusion Model

February 16, 2026
Authors: Aryan Das, Koushik Biswas, Swalpa Kumar Roy, Badri Narayana Patro, Vinay Kumar Verma
cs.AI

Abstract

We introduce Nexus Adapters, novel text-guided, efficient adapters for diffusion-based Structure Preserving Conditional Generation (SPCG). Recent structure-preserving methods have achieved promising results in conditional image generation by using a base model for prompt conditioning and an adapter for structural input, such as sketches or depth maps. However, these approaches are inefficient: the adapter sometimes requires as many parameters as the base architecture. Because the diffusion model is itself costly to train, doubling the parameter count is not always feasible. Moreover, the adapter in these approaches is not aware of the input prompt, so it is optimized only for the structural input and not for the prompt. To overcome these challenges, we propose two efficient adapters, Nexus Prime and Nexus Slim, which are guided by both prompts and structural inputs. Each Nexus Block incorporates cross-attention mechanisms to enable rich multimodal conditioning, so the proposed adapters better understand the input prompt while preserving structure. Extensive experiments show that the Nexus Prime adapter significantly enhances performance while requiring only 8M more parameters than the baseline, T2I-Adapter. We also introduce a lightweight Nexus Slim adapter with 18M fewer parameters than T2I-Adapter that still achieves state-of-the-art results. Code: https://github.com/arya-domain/Nexus-Adapters
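
The abstract does not describe the internals of a Nexus Block, so the following is only a minimal sketch of generic single-head cross-attention, the mechanism the abstract names, and not the authors' implementation. It assumes structural features (for example, a flattened sketch or depth feature map) act as queries and prompt token embeddings act as keys and values. All function names, dimensions, the residual connection, and the random weights are hypothetical choices for illustration. It uses only the Python standard library; a real adapter would presumably use a tensor library.

```python
import math
import random


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Hypothetical stand-in for learned weights or input features."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def cross_attention(struct_feats, text_tokens, w_q, w_k, w_v):
    """Fuse prompt information into structural features.

    struct_feats: positions x d_struct (queries come from the structure input)
    text_tokens:  tokens x d_text      (keys and values come from the prompt)
    Returns the fused features (positions x d_struct) and the attention weights.
    """
    q = matmul(struct_feats, w_q)   # positions x d_attn
    k = matmul(text_tokens, w_k)    # tokens x d_attn
    v = matmul(text_tokens, w_v)    # tokens x d_struct
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))  # positions x tokens
    weights = [softmax([s * scale for s in row]) for row in scores]
    attended = matmul(weights, v)     # positions x d_struct
    # Residual connection (an assumption): keep the structure, add prompt context.
    fused = [[f + a for f, a in zip(f_row, a_row)]
             for f_row, a_row in zip(struct_feats, attended)]
    return fused, weights


rng = random.Random(0)
num_positions, d_struct = 4, 6      # e.g. a 2x2 feature map with 6 channels, flattened
num_tokens, d_text, d_attn = 3, 5, 8

struct_feats = random_matrix(num_positions, d_struct, rng)
text_tokens = random_matrix(num_tokens, d_text, rng)
w_q = random_matrix(d_struct, d_attn, rng)
w_k = random_matrix(d_text, d_attn, rng)
w_v = random_matrix(d_text, d_struct, rng)

fused, attn = cross_attention(struct_feats, text_tokens, w_q, w_k, w_v)
print(len(fused), len(fused[0]))  # prints "4 6": same shape as the structural input
```

Each row of `attn` is a probability distribution over the prompt tokens, so every spatial position of the structural features receives its own weighted mix of prompt information. This is one way an adapter could become aware of the prompt without changing the shape of the features it passes on.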