Discrete Flow Matching

July 22, 2024
Authors: Itai Gat, Tal Remez, Neta Shaul, Felix Kreuk, Ricky T. Q. Chen, Gabriel Synnaeve, Yossi Adi, Yaron Lipman
cs.AI

Abstract

Despite Flow Matching and diffusion models having emerged as powerful generative paradigms for continuous variables such as images and videos, their application to high-dimensional discrete data, such as language, is still limited. In this work, we present Discrete Flow Matching, a novel discrete flow paradigm designed specifically for generating discrete data. Discrete Flow Matching offers several key contributions: (i) it works with a general family of probability paths interpolating between source and target distributions; (ii) it allows for a generic formula for sampling from these probability paths using learned posteriors such as the probability denoiser (x-prediction) and noise-prediction (epsilon-prediction); (iii) practically, focusing on specific probability paths defined with different schedulers considerably improves generative perplexity compared to previous discrete diffusion and flow models; and (iv) by scaling Discrete Flow Matching models up to 1.7B parameters, we reach 6.7% Pass@1 and 13.4% Pass@10 on HumanEval and 6.7% Pass@1 and 20.6% Pass@10 on 1-shot MBPP coding benchmarks. Our approach is capable of generating high-quality discrete data in a non-autoregressive fashion, significantly closing the gap between autoregressive models and discrete flow models.
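As a rough illustration of contribution (i), the sketch below samples from one simple probability path that interpolates between a source and a target sequence: each token independently switches from its source value to its target value with probability given by a scheduler. The abstract does not spell out this form, so the mixture construction, the function name `sample_mixture_path`, the linear default scheduler `kappa`, and the all-mask source with id `MASK` are all assumptions made for illustration, not the authors' implementation.

```python
import random

def sample_mixture_path(x0, x1, t, kappa=lambda t: t, rng=None):
    """Sample x_t token-wise from an assumed mixture path.

    Each position takes the target token x1[i] with probability kappa(t)
    and otherwise keeps the source token x0[i]. `kappa` is a hypothetical
    scheduler that should satisfy kappa(0) = 0 and kappa(1) = 1.
    """
    rng = rng or random.Random()
    k = kappa(t)
    # rng.random() lies in [0, 1), so k = 0 keeps x0 and k = 1 yields x1.
    return [b if rng.random() < k else a for a, b in zip(x0, x1)]

MASK = -1                  # hypothetical mask-token id (assumed source)
x1 = [5, 2, 9, 7]          # target token ids (made-up example data)
x0 = [MASK] * len(x1)      # all-mask source sequence

# Halfway along the path, each token is either still masked or already
# equal to its target value.
x_half = sample_mixture_path(x0, x1, 0.5, rng=random.Random(0))

# A different (also hypothetical) scheduler changes how quickly tokens
# are revealed, e.g. a quadratic schedule reveals fewer tokens early on.
x_quad = sample_mixture_path(x0, x1, 0.5, kappa=lambda t: t ** 2,
                             rng=random.Random(0))
```

Swapping the `kappa` argument is the only change needed to try a different scheduler, which is the knob contribution (iii) refers to; which schedulers work best is a question for the paper's experiments, not this sketch.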

