Diversity-Rewarded CFG Distillation

October 8, 2024
Authors: Geoffrey Cideron, Andrea Agostinelli, Johan Ferret, Sertan Girgin, Romuald Elie, Olivier Bachem, Sarah Perrin, Alexandre Ramé
cs.AI

Abstract

Generative models are transforming creative domains such as music generation, with inference-time strategies like Classifier-Free Guidance (CFG) playing a crucial role. However, CFG doubles inference cost while limiting originality and diversity across generated contents. In this paper, we introduce diversity-rewarded CFG distillation, a novel finetuning procedure that distills the strengths of CFG while addressing its limitations. Our approach optimises two training objectives: (1) a distillation objective, encouraging the model alone (without CFG) to imitate the CFG-augmented predictions, and (2) an RL objective with a diversity reward, promoting the generation of diverse outputs for a given prompt. By finetuning, we learn model weights with the ability to generate high-quality and diverse outputs, without any inference overhead. This also unlocks the potential of weight-based model merging strategies: by interpolating between the weights of two models (the first focusing on quality, the second on diversity), we can control the quality-diversity trade-off at deployment time, and even further boost performance. We conduct extensive experiments on the MusicLM (Agostinelli et al., 2023) text-to-music generative model, where our approach surpasses CFG in terms of quality-diversity Pareto optimality. According to human evaluators, our finetuned-then-merged model generates samples with higher quality-diversity than the base model augmented with CFG. Explore our generations at https://google-research.github.io/seanet/musiclm/diverse_music/.
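The abstract describes three mechanisms: a distillation objective that teaches the model alone to match CFG-augmented predictions, an RL objective with a diversity reward, and linear interpolation between the weights of a quality-focused and a diversity-focused model. The sketch below is a minimal illustration of these ideas in PyTorch; the function and parameter names (`cfg_gamma`, `diversity_reward`, `merge_alpha`) and the exact loss formulations are assumptions made for illustration, not the paper's implementation or the MusicLM codebase.

```python
# Minimal sketch of the ideas in the abstract, written in PyTorch.
# All names (cfg_gamma, diversity_reward, merge_alpha, ...) are illustrative
# assumptions, not identifiers from the paper or the MusicLM codebase.

import torch
import torch.nn.functional as F


def cfg_teacher_logits(cond_logits, uncond_logits, cfg_gamma=3.0):
    """CFG-augmented teacher: push conditional logits away from unconditional ones."""
    return uncond_logits + cfg_gamma * (cond_logits - uncond_logits)


def distillation_loss(student_logits, cond_logits, uncond_logits, cfg_gamma=3.0):
    """KL(teacher || student): the student alone imitates the CFG-augmented prediction."""
    teacher_probs = F.softmax(cfg_teacher_logits(cond_logits, uncond_logits, cfg_gamma), dim=-1)
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")


def diversity_rl_loss(sequence_log_probs, diversity_reward):
    """REINFORCE-style term: reward generations that differ from one another for the same prompt."""
    return -(diversity_reward.detach() * sequence_log_probs).mean()


def merge_weights(quality_state_dict, diversity_state_dict, merge_alpha=0.5):
    """Linear interpolation between two finetuned checkpoints (quality vs. diversity)."""
    return {
        name: (1.0 - merge_alpha) * quality_state_dict[name]
        + merge_alpha * diversity_state_dict[name]
        for name in quality_state_dict
    }


if __name__ == "__main__":
    # Toy shapes: a batch of 4 predictions over a vocabulary of 16 audio tokens.
    cond, uncond, student = (torch.randn(4, 16) for _ in range(3))
    loss = distillation_loss(student, cond, uncond) + 0.1 * diversity_rl_loss(
        torch.randn(4), torch.rand(4)
    )
    print(float(loss))
```

In practice the two terms would be combined into a single finetuning loss (e.g. a weighted sum, with the weight balancing quality against diversity), and `merge_alpha` would be chosen at deployment time to set the desired quality-diversity trade-off.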
