
Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion

April 27, 2026
Authors: Zhongjie Duan, Hong Zhang, Yingda Chen
cs.AI

Abstract

Controllable diffusion methods have substantially expanded the practical utility of diffusion models, but they are typically developed as isolated, backbone-specific systems with incompatible training pipelines, parameter formats, and runtime hooks. This fragmentation makes it difficult to reuse infrastructure across tasks, transfer capabilities across backbones, or compose multiple controls within a single generation pipeline. We present Diffusion Templates, a unified and open plugin framework that decouples base-model inference from controllable capability injection. The framework is organized around three components: Template models that map arbitrary task-specific inputs to an intermediate capability representation, a Template cache that functions as a standardized interface for capability injection, and a Template pipeline that loads, merges, and injects one or more Template caches into the base diffusion runtime. Because the interface is defined at the systems level rather than tied to a specific control architecture, heterogeneous capability carriers such as KV-Cache and LoRA can be supported under the same abstraction. Based on this design, we build a diverse model zoo spanning structural control, brightness adjustment, color adjustment, image editing, super-resolution, sharpness enhancement, aesthetic alignment, content reference, local inpainting, and age control. These case studies show that Diffusion Templates can unify a broad range of controllable generation tasks while preserving modularity, composability, and practical extensibility across rapidly evolving diffusion backbones. All resources will be open sourced, including code, models, and datasets.
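
To make the three-component split concrete, here is a minimal Python sketch of how a Template model, a Template cache, and a Template pipeline might fit together. It is an illustration under stated assumptions, not the authors' implementation: every class name, field, and carrier label below (`TemplateCache`, `BrightnessTemplate`, `StyleLoRATemplate`, `TemplatePipeline`, `fake_base_model`, `"kv_cache"`, `"lora"`) is hypothetical, and the "base model" is a stub that only reports what was injected.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: carrier name -> layer name -> list of injected values.
Injection = Dict[str, Dict[str, List[float]]]


@dataclass
class TemplateCache:
    """Hypothetical standardized container for one injected capability."""
    carrier: str                      # assumed labels, e.g. "kv_cache" or "lora"
    payload: Dict[str, List[float]]   # carrier-specific data keyed by layer name
    scale: float = 1.0                # strength applied when merging


class BrightnessTemplate:
    """Toy Template model: maps a task-specific input to a TemplateCache."""

    def __call__(self, brightness: float) -> TemplateCache:
        # A real Template model would run a network; this only packs the value.
        return TemplateCache(carrier="kv_cache", payload={"block0": [brightness]})


class StyleLoRATemplate:
    """Toy Template model that uses a different (assumed) carrier type."""

    def __call__(self, strength: float) -> TemplateCache:
        return TemplateCache(carrier="lora", payload={"block0": [1.0]}, scale=strength)


class TemplatePipeline:
    """Merges caches and hands them to a base model through one interface."""

    def __init__(self, base_model: Callable[[str, Injection], str]):
        self.base_model = base_model

    @staticmethod
    def merge(caches: List[TemplateCache]) -> Injection:
        merged: Injection = {}
        for cache in caches:
            per_carrier = merged.setdefault(cache.carrier, {})
            for layer, values in cache.payload.items():
                # Concatenate scaled values per layer; the merge rule is an assumption.
                per_carrier.setdefault(layer, []).extend(v * cache.scale for v in values)
        return merged

    def __call__(self, prompt: str, caches: List[TemplateCache]) -> str:
        return self.base_model(prompt, self.merge(caches))


def fake_base_model(prompt: str, injection: Injection) -> str:
    """Stub standing in for a diffusion backbone; it just lists the carriers."""
    return f"{prompt} | carriers={sorted(injection)}"


pipeline = TemplatePipeline(fake_base_model)
caches = [BrightnessTemplate()(0.8), StyleLoRATemplate()(0.5)]
result = pipeline("a cat on a sofa", caches)
```

The point of the sketch is only the separation of concerns: Template models know about task inputs, the pipeline knows about merging and injection, and the base model sees a single carrier-keyed structure regardless of which control produced it.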