RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling

Abstract

Score Distillation Sampling (SDS) has emerged as an effective technique for leveraging 2D diffusion priors for tasks such as text-to-3D generation. While powerful, SDS struggles to achieve fine-grained alignment with user intent. To overcome this, we introduce RewardSDS, a novel approach that weights noise samples based on alignment scores from a reward model, producing a weighted SDS loss. This loss prioritizes gradients from noise samples that yield aligned, high-reward outputs. Our approach is broadly applicable and can extend SDS-based methods. In particular, we demonstrate its applicability to Variational Score Distillation (VSD) by introducing RewardVSD. We evaluate RewardSDS and RewardVSD on text-to-image, 2D editing, and text-to-3D generation tasks, showing significant improvements over SDS and VSD on a diverse set of metrics measuring generation quality and alignment to desired reward models, enabling state-of-the-art performance. The project page is available at https://itaychachy.github.io/reward-sds/.
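The reward-weighting idea described above can be illustrated with a minimal sketch. The abstract does not state how reward scores are converted into weights, so the softmax with a temperature used here is purely an assumption, and the names `softmax_weights`, `reward_weighted_loss`, and `temperature` are hypothetical. The per-sample losses and rewards are placeholder numbers standing in for SDS losses computed under different noise samples and their scores from a reward model.

```python
import math
from typing import List, Sequence


def softmax_weights(rewards: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Map reward scores to normalized weights.

    Assumption: a softmax is used here only for illustration; the source
    does not specify the weighting scheme.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp((r - top) / temperature) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def reward_weighted_loss(
    per_sample_losses: Sequence[float],
    rewards: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """Combine per-noise-sample losses so high-reward samples count more."""
    if len(per_sample_losses) != len(rewards):
        raise ValueError("need one reward per loss term")
    weights = softmax_weights(rewards, temperature)
    return sum(w * loss for w, loss in zip(weights, per_sample_losses))


if __name__ == "__main__":
    # Placeholder values: three noise samples, their losses and reward scores.
    losses = [0.9, 0.4, 0.7]
    rewards = [0.1, 0.8, 0.3]
    print(softmax_weights(rewards))
    print(reward_weighted_loss(losses, rewards))
```

With equal rewards, this sketch reduces to a plain average over noise samples, which corresponds to treating every noise sample identically; unequal rewards shift the loss, and hence its gradient, toward the samples the reward model prefers.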

