Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs

Abstract

Recent advances in diffusion-based text-to-image (T2I) models have led to remarkable success in generating high-quality images from textual prompts. However, ensuring accurate alignment between the text and the generated image remains a significant challenge for state-of-the-art diffusion models. To address this, existing studies employ reinforcement learning from human feedback (RLHF) to align T2I outputs with human preferences. These methods either rely directly on paired image preference data or require a learned reward function. Both depend heavily on costly, high-quality human annotations and therefore scale poorly. In this work, we introduce Text Preference Optimization (TPO), a framework that enables "free-lunch" alignment of T2I models, that is, alignment without paired image preference data. TPO trains the model to prefer matched prompts over mismatched prompts, where the mismatched prompts are constructed by perturbing the original captions with a large language model. Our framework is general and compatible with existing preference-based algorithms. We extend both DPO and KTO to our setting, resulting in TDPO and TKTO. Quantitative and qualitative evaluations across multiple benchmarks show that our methods consistently outperform their original counterparts, delivering better human preference scores and improved text-to-image alignment. Our open-source code is available at https://github.com/DSL-Lab/T2I-Free-Lunch-Alignment.
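To make the idea of preferring matched over mismatched prompts concrete, here is a minimal sketch of a DPO-style pairwise loss applied to prompts rather than images. It is an illustration under stated assumptions, not the objective from the paper: it assumes that preference is measured through per-sample denoising errors of the trained model and a frozen reference model on the same noised image, conditioned once on the matched caption and once on the LLM-perturbed caption. The function name `text_preference_loss`, its parameters, and the `beta` temperature are all hypothetical.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def text_preference_loss(
    err_policy_matched: float,
    err_ref_matched: float,
    err_policy_mismatched: float,
    err_ref_mismatched: float,
    beta: float = 1.0,
) -> float:
    """Hypothetical DPO-style loss over a (matched, mismatched) prompt pair.

    Each `err_*` is assumed to be a denoising error (e.g. squared error of the
    predicted noise) for the same noised image under the given prompt.
    Lower error under a prompt is read as a higher implicit preference for it.
    """
    # How much the policy improves over the reference on the matched prompt,
    # minus how much it improves on the mismatched prompt.
    margin = (err_policy_matched - err_ref_matched) - (
        err_policy_mismatched - err_ref_mismatched
    )
    logit = -beta * margin
    # -log(sigmoid(logit)) == softplus(-logit)
    return softplus(-logit)


# Illustrative values only: the policy fits the matched prompt better than the
# reference and the mismatched prompt worse, so the loss is below log(2).
example = text_preference_loss(0.5, 1.0, 1.5, 1.0, beta=2.0)
```

In this sketch the loss equals log(2) when the policy and reference behave identically, and it decreases as the policy lowers its error under the matched caption relative to the mismatched one. No human-labeled image pair enters the computation, only the two captions for a single image.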

