Discrete Flow Matching

July 22, 2024
Authors: Itai Gat, Tal Remez, Neta Shaul, Felix Kreuk, Ricky T. Q. Chen, Gabriel Synnaeve, Yossi Adi, Yaron Lipman
cs.AI

Abstract

Despite Flow Matching and diffusion models having emerged as powerful generative paradigms for continuous variables such as images and videos, their application to high-dimensional discrete data, such as language, is still limited. In this work, we present Discrete Flow Matching, a novel discrete flow paradigm designed specifically for generating discrete data. Discrete Flow Matching offers several key contributions: (i) it works with a general family of probability paths interpolating between source and target distributions; (ii) it allows for a generic formula for sampling from these probability paths using learned posteriors such as the probability denoiser (x-prediction) and noise-prediction (epsilon-prediction); (iii) practically, focusing on specific probability paths defined with different schedulers considerably improves generative perplexity compared to previous discrete diffusion and flow models; and (iv) by scaling Discrete Flow Matching models up to 1.7B parameters, we reach 6.7% Pass@1 and 13.4% Pass@10 on HumanEval and 6.7% Pass@1 and 20.6% Pass@10 on 1-shot MBPP coding benchmarks. Our approach is capable of generating high-quality discrete data in a non-autoregressive fashion, significantly closing the gap between autoregressive models and discrete flow models.
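
To make contribution (i) concrete, here is a minimal sketch of one simple probability path between a source and a target sequence, assuming a per-token mixture: at time t, each position independently takes the target token with probability kappa(t) and the source token otherwise. The abstract does not spell out the path family, so the names `kappa`, `sample_path`, `MASK`, and the polynomial scheduler with its `power` parameter are hypothetical choices made for illustration, not the authors' implementation.

```python
import random

MASK = -1  # hypothetical id for a "mask" source token (assumption)


def kappa(t: float, power: float = 1.0) -> float:
    """Hypothetical scheduler: a monotone map from [0, 1] onto [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return t ** power


def sample_path(x0, x1, t, rng, power=1.0):
    """Draw x_t from the assumed mixture path.

    Each position independently takes the target token from x1 with
    probability kappa(t) and keeps the source token from x0 otherwise.
    """
    if len(x0) != len(x1):
        raise ValueError("source and target must have the same length")
    k = kappa(t, power)
    return [b if rng.random() < k else a for a, b in zip(x0, x1)]


if __name__ == "__main__":
    rng = random.Random(0)
    x1 = [5, 2, 9, 9, 1, 7]   # toy target sequence of token ids
    x0 = [MASK] * len(x1)     # all-mask source sequence
    for t in (0.0, 0.5, 1.0):
        # t = 0 returns the source, t = 1 returns the target,
        # intermediate t returns a random mix of the two.
        print(t, sample_path(x0, x1, t, rng))
```

Changing `power` changes how quickly target tokens are revealed along the path, which is one way to read the abstract's remark that different schedulers define different probability paths.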
