Remasking Discrete Diffusion Models with Inference-Time Scaling

March 1, 2025
Authors: Guanghan Wang, Yair Schiff, Subham Sekhar Sahoo, Volodymyr Kuleshov
cs.AI

Abstract

Part of the success of diffusion models stems from their ability to perform iterative refinement, i.e., repeatedly correcting outputs during generation. However, modern masked discrete diffusion lacks this capability: when a token is generated, it cannot be updated again, even when it introduces an error. Here, we address this limitation by introducing the remasking diffusion model (ReMDM) sampler, a method that can be applied to pretrained masked diffusion models in a principled way and that is derived from a discrete diffusion model with a custom remasking backward process. Most interestingly, ReMDM endows discrete diffusion with a form of inference-time compute scaling. By increasing the number of sampling steps, ReMDM generates natural language outputs that approach the quality of autoregressive models, whereas when the computation budget is limited, ReMDM better maintains quality. ReMDM also improves sample quality of masked diffusion models for discretized images, and in scientific domains such as molecule design, ReMDM facilitates diffusion guidance and pushes the Pareto frontier of controllability relative to classical masking and uniform noise diffusion. We provide the code along with a blog post on the project page: https://remdm.github.io.
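To make the remasking idea concrete, below is a minimal sketch of what a ReMDM-style sampling loop can look like. Everything in it is an illustrative assumption for exposition, not the paper's derived backward process: the uniform unmasking schedule, the `remask_prob * t` remasking rate, and the toy denoiser are all placeholders (the authors' actual code is linked from the project page).

```python
import torch

MASK_ID = 0  # reserved token id for the [MASK] state

def remdm_style_sample(denoiser, seq_len, num_steps, remask_prob=0.1):
    """Hypothetical remasking sampler in the spirit of ReMDM (schedules are illustrative)."""
    x = torch.full((1, seq_len), MASK_ID, dtype=torch.long)
    for step in range(num_steps, 0, -1):
        t = step / num_steps                      # time runs from 1 down toward 0
        logits = denoiser(x, t)                   # per-position logits over the vocabulary
        sampled = torch.distributions.Categorical(logits=logits).sample()

        # Standard masked-diffusion move: unmask a fraction of still-masked positions.
        masked = x == MASK_ID
        unmask = masked & (torch.rand(x.shape) < 1.0 / step)
        x = torch.where(unmask, sampled, x)

        # ReMDM's key addition: already-committed tokens can be masked again,
        # so later steps get a chance to revise earlier errors.
        if step > 1:  # skip on the last step so the output contains no masks
            committed = (x != MASK_ID) & ~unmask
            remask = committed & (torch.rand(x.shape) < remask_prob * t)
            x = torch.where(remask, torch.full_like(x, MASK_ID), x)
    return x

# Stand-in denoiser returning random logits, just to make the sketch runnable.
def toy_denoiser(x, t, vocab_size=32):
    return torch.randn(*x.shape, vocab_size)

print(remdm_style_sample(toy_denoiser, seq_len=16, num_steps=8))
```

Because remasking re-opens committed positions, raising `num_steps` gives the sampler more opportunities to revisit early mistakes, which is the sense in which ReMDM trades extra inference-time compute for sample quality.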
