CCM: Adding Conditional Controls to Text-to-Image Consistency Models

Abstract

Consistency Models (CMs) have shown promise in creating visual content efficiently and with high quality. However, how to add new conditional controls to pretrained CMs has not been explored. In this technical report, we consider alternative strategies for adding ControlNet-like conditional control to CMs and present three significant findings. 1) ControlNet trained for diffusion models (DMs) can be directly applied to CMs for high-level semantic controls but struggles with low-level detail and realism control. 2) CMs serve as an independent class of generative models, on which ControlNet can be trained from scratch using the Consistency Training proposed by Song et al. 3) A lightweight adapter can be jointly optimized under multiple conditions through Consistency Training, allowing for the swift transfer of DM-based ControlNet to CMs. We study these three solutions across various conditional controls, including edge, depth, human pose, low-resolution image, and masked image, with text-to-image latent consistency models.
