TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance

July 24, 2025
Authors: Minghao Fu, Guo-Hua Wang, Xiaohao Chen, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang
cs.AI

Abstract

Recent advances in text-to-image synthesis largely benefit from sophisticated sampling strategies and classifier-free guidance (CFG) to ensure high-quality generation. However, CFG requires two forward passes per sampling step, which, especially when combined with intricate sampling algorithms, results in prohibitively high inference costs. To address this, we introduce TeEFusion (Text Embeddings Fusion), a novel and efficient distillation method that incorporates the guidance magnitude directly into the text embeddings and distills the teacher model's complex sampling strategy. By fusing the conditional and unconditional text embeddings using only linear operations, TeEFusion reconstructs the desired guidance without adding extra parameters, while enabling the student model to learn from the outputs the teacher produces with its sophisticated sampling approach. Extensive experiments on state-of-the-art models such as SD3 demonstrate that our method allows the student to closely mimic the teacher's performance with a far simpler and more efficient sampling strategy. Consequently, the student model achieves inference speeds up to 6× faster than the teacher model, while maintaining image quality comparable to that obtained through the teacher's complex sampling approach. The code is publicly available at https://github.com/AIDC-AI/TeEFusion.
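To make the cost difference concrete, the following is a minimal sketch, not the authors' implementation. It contrasts standard CFG, which combines two forward passes in output space, with a single forward pass on a linearly fused text embedding. The function names, the toy linear denoiser, and the exact fusion rule (`uncond + w * (cond - uncond)`) are all assumptions made for illustration; the abstract only states that the fusion uses linear operations.

```python
import numpy as np

def toy_denoiser(x, text_emb, W):
    """Hypothetical stand-in for a denoising network (linear in the text embedding)."""
    return x + W @ text_emb

def cfg_prediction(x, cond_emb, uncond_emb, w, W):
    """Standard CFG: two forward passes, combined in output space."""
    out_cond = toy_denoiser(x, cond_emb, W)
    out_uncond = toy_denoiser(x, uncond_emb, W)
    return out_uncond + w * (out_cond - out_uncond)

def fuse_text_embeddings(cond_emb, uncond_emb, w):
    """Assumed fusion rule: the same linear extrapolation, applied to the embeddings."""
    return uncond_emb + w * (cond_emb - uncond_emb)

def fused_prediction(x, cond_emb, uncond_emb, w, W):
    """Single forward pass on the fused embedding."""
    fused = fuse_text_embeddings(cond_emb, uncond_emb, w)
    return toy_denoiser(x, fused, W)

# Toy inputs: a 4-dimensional latent and 8-dimensional text embeddings.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
x = rng.normal(size=4)
cond_emb = rng.normal(size=8)
uncond_emb = rng.normal(size=8)
guidance_scale = 5.0

two_pass = cfg_prediction(x, cond_emb, uncond_emb, guidance_scale, W)
one_pass = fused_prediction(x, cond_emb, uncond_emb, guidance_scale, W)
outputs_match = bool(np.allclose(two_pass, one_pass))
```

The two results coincide here only because the toy denoiser is linear in the text embedding. A real diffusion network is nonlinear, so a fused embedding does not reproduce CFG on its own, which is consistent with the abstract's description of a distillation step in which the student is trained to match the teacher's guided outputs.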