Complexity-Balanced Diffusion Splitting

June 4, 2026
Authors: Noam Issachar, Dani Lischinski, Raanan Fattal
cs.AI

Abstract

Standard continuous-time generative models rely on monolithic architectures that must navigate vastly different signal regimes, from isotropic noise to intricate data distributions. While scaling model capacity improves performance, deploying a massive network uniformly across the entire generative timeline is inherently inefficient. In this work, we propose Complexity-Balanced Splitting (CBS), a principled framework for temporal capacity allocation that distributes the generative workload across multiple specialized sub-networks. Grounded in function approximation theory and de Boor's equidistribution principle, CBS partitions the diffusion timeline into segments of equal approximation burden, allocating more representational capacity to regions where the generative dynamics are more difficult to model. To estimate this local complexity, we introduce two complementary and tractable monitor functions: a spatial measure based on the flow's Dirichlet energy, and a geometric measure based on the acceleration of the sampling trajectories. Using a lightweight auxiliary model to estimate these complexity profiles, our approach eliminates the need for heuristic temporal splits or computationally expensive search procedures. Extensive evaluation across multiple architectures (SiT, JiT, and UNet) and datasets demonstrates that CBS consistently improves synthesis quality without increasing per-step inference cost. In particular, CBS improves FID by ~35% on SiT-XL with CFG relative to naive temporal partitioning. The project page is available at https://noamissachar.github.io/CBS/.
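
To make the equidistribution idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a strictly positive monitor function already sampled on a time grid, and places breakpoints so that each segment carries the same integrated monitor mass. The function name `equidistribute` and the synthetic complexity profile are hypothetical. In the paper the profile would instead come from the Dirichlet-energy or trajectory-acceleration estimates produced by the auxiliary model, which this sketch does not reproduce.

```python
import numpy as np


def equidistribute(t, monitor, num_segments):
    """Return num_segments + 1 breakpoints on [t[0], t[-1]] such that the
    integral of the monitor function is equal over every segment.

    Assumes `t` is increasing and `monitor` is strictly positive, so the
    cumulative integral is strictly increasing and can be inverted.
    """
    t = np.asarray(t, dtype=float)
    m = np.asarray(monitor, dtype=float)
    # Cumulative trapezoidal integral of the monitor, starting at 0.
    increments = 0.5 * (m[1:] + m[:-1]) * np.diff(t)
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    # Target mass at breakpoint k is k/K of the total, for k = 0..K.
    targets = np.linspace(0.0, cumulative[-1], num_segments + 1)
    # Invert the cumulative curve by piecewise-linear interpolation.
    return np.interp(targets, cumulative, t)


# Hypothetical complexity profile, peaked near t = 0.8 (illustration only).
t = np.linspace(0.0, 1.0, 1001)
monitor = 1.0 + 9.0 * np.exp(-((t - 0.8) / 0.1) ** 2)

breaks = equidistribute(t, monitor, num_segments=4)
print(breaks)
```

With a constant monitor the breakpoints reduce to a uniform split of the timeline. With a peaked profile, segments shrink where the monitor is large, so a sub-network assigned to such a segment covers a shorter time interval.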