ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models

Abstract

In this paper, we study an under-explored but important factor of diffusion generative models: combinatorial complexity. Data samples are generally high-dimensional, and in various structured generation tasks additional attributes are associated with the data samples. We show that the space spanned by the combination of dimensions and attributes can be insufficiently covered by existing training schemes of diffusion generative models, which potentially limits test-time performance. We present a simple fix to this problem by constructing stochastic processes that fully exploit the combinatorial structures, hence the name ComboStoc. Using this simple strategy, we show that network training is significantly accelerated across diverse data modalities, including images and 3D structured shapes. Moreover, ComboStoc enables a new way of test-time generation that uses asynchronous time steps for different dimensions and attributes, allowing varying degrees of control over them. Our code is available at: https://github.com/Xrvitd/ComboStoc
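To make the contrast between synchronized and asynchronous time steps concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a simple linear interpolation between a data sample and Gaussian noise (t = 0 is data, t = 1 is noise), which the abstract does not specify; the function names `interpolate` and `sample_timesteps` are hypothetical.

```python
import numpy as np

def interpolate(x0, noise, t):
    """Linearly blend data x0 (at t=0) with noise (at t=1).

    Assumed interpolation path for illustration only; t may be a scalar
    or an array that broadcasts against x0.
    """
    return (1.0 - t) * x0 + t * noise

def sample_timesteps(shape, rng, combinatorial):
    """Return an array of time steps with the given shape.

    combinatorial=False: one scalar t shared by every dimension (synchronized).
    combinatorial=True:  an independent t per dimension (asynchronous).
    """
    if combinatorial:
        return rng.uniform(0.0, 1.0, size=shape)
    return np.full(shape, rng.uniform(0.0, 1.0))

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 4))      # stand-in for a data sample
noise = rng.normal(size=(4, 4))   # Gaussian noise of the same shape

# Synchronized scheme: every dimension sits at the same noise level.
t_sync = sample_timesteps(x0.shape, rng, combinatorial=False)
x_sync = interpolate(x0, noise, t_sync)

# Asynchronous scheme: each dimension has its own noise level.
t_async = sample_timesteps(x0.shape, rng, combinatorial=True)
x_async = interpolate(x0, noise, t_async)

# Partial control: pin the first row at t=0 so it stays equal to the data,
# while the remaining rows are fully noised (t=1).
t_ctrl = np.ones(x0.shape)
t_ctrl[0, :] = 0.0
x_ctrl = interpolate(x0, noise, t_ctrl)
```

Under these assumptions, the synchronized scheme only ever visits points where all dimensions share one noise level, whereas per-dimension time steps cover mixed combinations. The last example hints at how fixing some time steps could keep selected dimensions at their given values during generation.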
