ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models
April 29, 2026
Authors: Rui Xu, Jiepeng Wang, Hao Pan, Yang Liu, Xin Tong, Shiqing Xin, Changhe Tu, Taku Komura, Wenping Wang
cs.AI
Abstract
In this paper, we study an under-explored but important factor of diffusion generative models, i.e., the combinatorial complexity. Data samples are generally high-dimensional, and for various structured generation tasks, additional attributes are combined to associate with data samples. We show that the space spanned by the combination of dimensions and attributes can be insufficiently covered by existing training schemes of diffusion generative models, potentially limiting test time performance. We present a simple fix to this problem by constructing stochastic processes that fully exploit the combinatorial structures, hence the name ComboStoc. Using this simple strategy, we show that network training is significantly accelerated across diverse data modalities, including images and 3D structured shapes. Moreover, ComboStoc enables a new way of test time generation which uses asynchronous time steps for different dimensions and attributes, thus allowing for varying degrees of control over them. Our code is available at: https://github.com/Xrvitd/ComboStoc
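The asynchronous time steps described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it assumes a rectified-flow style linear interpolation between data and noise, and all names (`forward_noise`, `t_sync`, `t_async`) are illustrative. The key idea shown is that replacing the usual scalar time step with an independent per-dimension time vector makes the forward process visit combinations of partially-noised dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_noise(x0, t, noise):
    # Linear interpolation between data and noise (rectified-flow style).
    # t broadcasts over x0, so it may be a scalar or a per-dimension vector.
    return (1.0 - t) * x0 + t * noise

x0 = rng.standard_normal(8)     # a toy 8-dimensional data sample
noise = rng.standard_normal(8)  # Gaussian noise of the same shape

# Standard training: one shared (synchronous) time step for all dimensions.
t_sync = rng.uniform()
x_sync = forward_noise(x0, t_sync, noise)

# ComboStoc-style training: an independent time step per dimension, so
# different dimensions (or attributes) are noised to different degrees,
# covering the combinatorial space of partially-corrupted samples.
t_async = rng.uniform(size=8)
x_async = forward_noise(x0, t_async, noise)
```

At test time, the same per-dimension time vector can be advanced at different rates for different dimensions or attributes, which is what enables the varying degrees of control mentioned in the abstract.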