Trees to Flows and Back: Unifying Decision Trees and Diffusion Models

Abstract

Decision trees and diffusion models appear to be disparate model classes: one discrete and hierarchical, the other continuous and dynamic. This work unifies the two by establishing a precise mathematical correspondence between hierarchical decision trees and diffusion processes in appropriate limiting regimes. The unification reveals a shared optimization principle, Global Trajectory Score Matching (GTSM), for which an idealized version of gradient boosting is asymptotically optimal. We underscore the conceptual value of this work through two practical instantiations: TreeFlow, which achieves competitive generation quality on tabular data with higher fidelity and a 2× computational speedup, and DSMTree, a novel distillation method that transfers hierarchical decision logic into neural networks, matching teacher performance within 2% on many benchmarks.

