Trees to Flows and Back: Unifying Decision Trees and Diffusion Models
May 1, 2026
Authors: Sai Niranjan Ramachandran, Suvrit Sra
cs.AI
Abstract
Decision trees and diffusion models are ostensibly disparate model classes, one discrete and hierarchical, the other continuous and dynamic. This work unifies the two by establishing a crisp mathematical correspondence between hierarchical decision trees and diffusion processes in appropriate limiting regimes. Our unification reveals a shared optimization principle: Global Trajectory Score Matching (GTSM), for which gradient boosting (in an idealized version) is asymptotically optimal. We underscore the conceptual value of our work through two key practical instantiations: \treeflow, which achieves competitive generation quality on tabular data with higher fidelity and a 2× computational speedup, and \dsmtree, a novel distillation method that transfers hierarchical decision logic into neural networks, matching teacher performance within 2% on many benchmarks.