Discrete Flow Matching
July 22, 2024
Authors: Itai Gat, Tal Remez, Neta Shaul, Felix Kreuk, Ricky T. Q. Chen, Gabriel Synnaeve, Yossi Adi, Yaron Lipman
cs.AI
Abstract
Despite Flow Matching and diffusion models having emerged as powerful
generative paradigms for continuous variables such as images and videos, their
application to high-dimensional discrete data, such as language, is still
limited. In this work, we present Discrete Flow Matching, a novel discrete flow
paradigm designed specifically for generating discrete data. Discrete Flow
Matching offers several key contributions: (i) it works with a general family
of probability paths interpolating between source and target distributions;
(ii) it allows for a generic formula for sampling from these probability paths
using learned posteriors such as the probability denoiser (x-prediction) and
noise-prediction (epsilon-prediction); (iii) practically, focusing on
specific probability paths defined with different schedulers considerably
improves generative perplexity compared to previous discrete diffusion and flow
models; and (iv) by scaling Discrete Flow Matching models up to 1.7B
parameters, we reach 6.7% Pass@1 and 13.4% Pass@10 on HumanEval and 6.7% Pass@1
and 20.6% Pass@10 on 1-shot MBPP coding benchmarks. Our approach is capable of
generating high-quality discrete data in a non-autoregressive fashion,
significantly closing the gap between autoregressive models and discrete flow
models.
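To make the abstract's notions of a scheduler-defined probability path and sampling from a learned posterior concrete, here is a minimal, hypothetical PyTorch sketch. It assumes a masked-source path in which each token is, independently, the data token with probability kappa(t) and a [MASK] token otherwise, and a model `denoiser` (an illustrative name, not from the paper) that returns per-token logits over the vocabulary (x-prediction). This is a sketch of the general idea under those assumptions, not the authors' implementation.

```python
import torch

def corrupt(x1, t, kappa, mask_id):
    """Sample x_t from the masked-source path: keep each data token of x1
    with probability kappa(t), otherwise replace it with [MASK]."""
    keep = torch.rand_like(x1, dtype=torch.float) < kappa(t)
    return torch.where(keep, x1, torch.full_like(x1, mask_id))

@torch.no_grad()
def sample(denoiser, shape, kappa, mask_id, steps=128, device="cpu"):
    """Euler-style sampler: start from the all-mask source and gradually
    reveal tokens using the model's per-token posterior (x-prediction)."""
    x = torch.full(shape, mask_id, dtype=torch.long, device=device)
    ts = torch.linspace(0.0, 1.0, steps + 1)
    for i in range(steps):
        t, s = ts[i].item(), ts[i + 1].item()
        logits = denoiser(x, t)  # assumed shape (batch, length, vocab)
        x1_hat = torch.distributions.Categorical(logits=logits).sample()
        # Fraction of still-masked tokens to reveal on this step, set by the scheduler.
        p_unmask = (kappa(s) - kappa(t)) / (1.0 - kappa(t))
        reveal = (x == mask_id) & (torch.rand_like(x, dtype=torch.float) < p_unmask)
        x = torch.where(reveal, x1_hat, x)
    return x
```

With a linear scheduler such as `kappa = lambda t: t`, `corrupt` produces the noisy training inputs on which a denoiser would typically be trained with cross-entropy over masked positions, and `sample` generates sequences non-autoregressively by unmasking a scheduler-determined fraction of positions per step; other schedulers only change how quickly mass moves from the source to the target distribution.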