Trees to Flows and Back: Unifying Decision Trees and Diffusion Models

Abstract

Decision trees and diffusion models are ostensibly disparate model classes: one is discrete and hierarchical, the other continuous and dynamic. This work unifies the two by establishing a crisp mathematical correspondence between hierarchical decision trees and diffusion processes in appropriate limiting regimes. Our unification reveals a shared optimization principle, Global Trajectory Score Matching (GTSM), for which an idealized version of gradient boosting is asymptotically optimal. We underscore the conceptual value of our work through two practical instantiations: \treeflow, which achieves competitive generation quality on tabular data with higher fidelity and a 2× computational speedup, and \dsmtree, a novel distillation method that transfers hierarchical decision logic into neural networks and matches teacher performance to within 2% on many benchmarks.

