DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models

October 10, 2024
Authors: Xiaoxiao He, Ligong Han, Quan Dao, Song Wen, Minhao Bai, Di Liu, Han Zhang, Martin Renqiang Min, Felix Juefei-Xu, Chaowei Tan, Bo Liu, Kang Li, Hongdong Li, Junzhou Huang, Faez Ahmed, Akash Srivastava, Dimitris Metaxas
cs.AI

Abstract

Discrete diffusion models have achieved success in tasks like image generation and masked language modeling but face limitations in controlled content editing. We introduce DICE (Discrete Inversion for Controllable Editing), the first approach to enable precise inversion for discrete diffusion models, including multinomial diffusion and masked generative models. By recording noise sequences and masking patterns during the reverse diffusion process, DICE enables accurate reconstruction and flexible editing of discrete data without the need for predefined masks or attention manipulation. We demonstrate the effectiveness of DICE across both image and text domains, evaluating it on models such as VQ-Diffusion, Paella, and RoBERTa. Our results show that DICE preserves high data fidelity while enhancing editing capabilities, offering new opportunities for fine-grained content manipulation in discrete spaces. For the project webpage, see https://hexiaoxiao-cs.github.io/DICE/.
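
The idea of recording noise so that a stochastic discrete sampling step can be replayed can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a single Gumbel-max sampling step per token, and the helper names `record_noise` and `replay`, the logits, and the `margin` parameter are all hypothetical choices made for illustration. It only shows why stored noise gives exact reconstruction when the model's predictions are unchanged, and lets tokens change where the predictions are edited.

```python
import math
import random


def gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample by inverse transform."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return -math.log(-math.log(u))


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


def record_noise(logits, token, rng, margin=1e-3):
    """Return a noise vector under which argmax(logits + noise) == token.

    Toy assumption: draw Gumbel noise, then raise the observed token's
    noise just enough for it to win. Requires at least two categories.
    """
    noise = [gumbel(rng) for _ in logits]
    scores = [l + n for l, n in zip(logits, noise)]
    best_other = max(s for i, s in enumerate(scores) if i != token)
    if scores[token] <= best_other:
        noise[token] += best_other - scores[token] + margin
    return noise


def replay(logits, noise):
    """Deterministically decode one token from logits and recorded noise."""
    return argmax([l + n for l, n in zip(logits, noise)])


rng = random.Random(0)

# Hypothetical per-position logits for a 3-token sequence over 3 categories.
source_logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.2, 3.0]]
tokens = [0, 1, 2]  # the observed discrete data to invert

# "Inversion": record noise that reproduces the observed tokens.
noises = [record_noise(l, t, rng) for l, t in zip(source_logits, tokens)]

# Reconstruction: same logits + recorded noise give back the same tokens.
reconstructed = [replay(l, n) for l, n in zip(source_logits, noises)]
assert reconstructed == tokens

# "Editing": change the logits at position 1 only (e.g. a new condition),
# then replay with the same recorded noise.
edited_logits = [list(l) for l in source_logits]
edited_logits[1] = [0.2, 1.5, 6.0]
edited = [replay(l, n) for l, n in zip(edited_logits, noises)]
print("reconstructed:", reconstructed)
print("edited:", edited)
```

Positions whose logits are unchanged always decode to their original tokens, because the recorded noise is reused as-is. Whether the edited position changes depends on how strongly the new logits disagree with the recorded noise.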
