Aligning Diffusion Models by Optimizing Human Utility

Abstract

We present Diffusion-KTO, a novel approach for aligning text-to-image diffusion models that formulates the alignment objective as the maximization of expected human utility. Because this objective applies to each generation independently, Diffusion-KTO requires neither collecting costly pairwise preference data nor training a complex reward model. Instead, the objective requires only simple per-image binary feedback signals, such as likes or dislikes, which are abundantly available. After fine-tuning with Diffusion-KTO, text-to-image diffusion models outperform existing techniques, including supervised fine-tuning and Diffusion-DPO, both in human judgment and on automatic evaluation metrics such as PickScore and ImageReward. Overall, Diffusion-KTO unlocks the potential of readily available per-image binary signals and broadens the applicability of aligning text-to-image diffusion models with human preferences.
