Aligning Text-to-Image Diffusion Models with Reward Backpropagation

Abstract

Text-to-image diffusion models have recently emerged at the forefront of image generation, powered by very large-scale unsupervised or weakly supervised text-to-image training datasets. Because their training is unsupervised, it is difficult to control their behavior in downstream tasks such as maximizing human-perceived image quality, image-text alignment, or ethical image generation. Recent works finetune diffusion models to downstream reward functions using vanilla reinforcement learning, which is notorious for the high variance of its gradient estimators. In this paper, we propose AlignProp, a method that aligns diffusion models to downstream reward functions using end-to-end backpropagation of the reward gradient through the denoising process. While a naive implementation of such backpropagation would require prohibitive memory resources for storing the partial derivatives of modern text-to-image models, AlignProp finetunes low-rank adapter weight modules and uses gradient checkpointing to keep its memory usage viable. We test AlignProp on finetuning diffusion models to various objectives, such as image-text semantic alignment, aesthetics, compressibility, and controllability of the number of objects present, as well as their combinations. We show that AlignProp achieves higher rewards in fewer training steps than alternatives while being conceptually simpler, making it a straightforward choice for optimizing diffusion models for differentiable reward functions of interest. Code and visualization results are available at https://align-prop.github.io/.
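The core idea, backpropagating a reward gradient through the whole denoising chain while storing only a few intermediate states, can be illustrated with a toy scalar sketch. This is not the authors' implementation. Every name and constant here (`W0`, `STEP`, `TARGET`, `denoise_step`, `reward_and_grad`, `finetune`) is hypothetical, and the squared-distance reward is invented for illustration. The single trainable scalar `delta` added to a frozen weight stands in for a low-rank adapter, and recomputing states from sparse checkpoints stands in for gradient checkpointing.

```python
import math

W0 = 0.5      # frozen "pretrained" weight (hypothetical toy value)
STEP = 0.2    # denoising step size (hypothetical)
TARGET = 0.5  # the toy reward peaks when the final sample equals this


def denoise_step(x, delta):
    """One toy denoising update; delta is the only trainable parameter."""
    return x - STEP * math.tanh((W0 + delta) * x)


def step_grads(x, delta):
    """Partial derivatives of denoise_step with respect to x and delta."""
    w = W0 + delta
    sech2 = 1.0 - math.tanh(w * x) ** 2
    return 1.0 - STEP * sech2 * w, -STEP * sech2 * x


def reward(x):
    """Differentiable toy reward evaluated on the final sample."""
    return -(x - TARGET) ** 2


def sample(x_init, delta, num_steps):
    """Run the full denoising chain without keeping any intermediate state."""
    x = x_init
    for _ in range(num_steps):
        x = denoise_step(x, delta)
    return x


def reward_and_grad(x_init, delta, num_steps, ckpt_every):
    """Return the reward and d(reward)/d(delta) via end-to-end backpropagation."""
    # Forward pass: keep only every ckpt_every-th state to save memory.
    checkpoints = {}
    x = x_init
    for k in range(num_steps):
        if k % ckpt_every == 0:
            checkpoints[k] = x
        x = denoise_step(x, delta)
    r = reward(x)

    grad_x = -2.0 * (x - TARGET)  # d(reward)/d(final sample)
    grad_delta = 0.0
    # Backward pass: walk the segments in reverse, recomputing their states.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + ckpt_every, num_steps)
        states = [checkpoints[start]]
        for _ in range(start, end - 1):
            states.append(denoise_step(states[-1], delta))
        for xk in reversed(states):
            dx, dd = step_grads(xk, delta)
            grad_delta += grad_x * dd  # contribution of this step's parameter use
            grad_x *= dx               # chain rule back to the previous state
    return r, grad_delta


def finetune(x_init, num_steps=10, ckpt_every=3, lr=0.1, iters=50):
    """Plain gradient ascent on the reward with respect to delta."""
    delta = 0.0
    for _ in range(iters):
        _, g = reward_and_grad(x_init, delta, num_steps, ckpt_every)
        delta += lr * g
    return delta
```

Calling `finetune(2.0)` returns an adapter value under which the final sample lands closer to `TARGET` than with `delta = 0`. Changing `ckpt_every` changes how many states are stored and how much is recomputed, but not the gradient itself.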
