DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts

Abstract

Learning from non-stationary data streams subject to concept drift requires models that adapt on the fly while remaining resource-efficient. Existing adaptive ensemble methods often rely on coarse-grained adaptation mechanisms or simple voting schemes that fail to make full use of specialized knowledge. This paper introduces DriftMoE, an online Mixture-of-Experts (MoE) architecture that addresses these limitations through a novel co-training framework. DriftMoE features a compact neural router that is co-trained alongside a pool of incremental Hoeffding tree experts. The key innovation is a symbiotic learning loop that enables expert specialization: the router selects the most suitable expert for prediction, the relevant experts update incrementally with the true label, and the router refines its parameters using a multi-hot correctness mask that reinforces every accurate expert. This feedback loop gives the router a clear training signal while accelerating expert specialization. We evaluate DriftMoE on nine state-of-the-art data stream learning benchmarks spanning abrupt, gradual, and real-world drifts, testing two distinct configurations: one in which experts specialize on data regimes (multi-class variant), and another in which each expert focuses on a single class (task-based variant). Our results show that DriftMoE performs competitively with state-of-the-art adaptive ensembles for stream learning, offering a principled and efficient approach to concept drift adaptation. All code, data pipelines, and reproducibility scripts are available in our public GitHub repository: https://github.com/miguel-ceadar/drift-moe.
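
The learning loop described in the abstract (route, predict, update the selected experts, then train the router on a multi-hot correctness mask) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: `CountingExpert` is a hypothetical majority-class stand-in for an incremental Hoeffding tree, `LinearRouter` is a hypothetical linear model standing in for the compact neural router, and the per-expert sigmoid with a binary cross-entropy gradient, the `top_k` parameter, and the learning rate are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


class CountingExpert:
    """Hypothetical stand-in for a Hoeffding tree: predicts the majority label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Default to label 0 before any example has been seen.
        return self.counts.most_common(1)[0][0] if self.counts else 0

    def learn(self, x, y):
        self.counts[y] += 1


class LinearRouter:
    """Hypothetical router: one linear logit per expert, trained online."""

    def __init__(self, n_features, n_experts, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
                  for _ in range(n_experts)]
        self.b = [0.0] * n_experts
        self.lr = lr

    def scores(self, x):
        return [sum(wi * xi for wi, xi in zip(w, x)) + b
                for w, b in zip(self.w, self.b)]

    def update(self, x, mask):
        # Assumed loss: binary cross-entropy on a per-expert sigmoid.
        # Its gradient w.r.t. each logit is (p - target).
        for k, (z, target) in enumerate(zip(self.scores(x), mask)):
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - target
            self.w[k] = [wi - self.lr * g * xi for wi, xi in zip(self.w[k], x)]
            self.b[k] -= self.lr * g


def correctness_mask(predictions, y):
    """Multi-hot mask: 1.0 for every expert whose prediction matches the true label."""
    return [1.0 if p == y else 0.0 for p in predictions]


def drift_moe_step(router, experts, x, y, top_k=1):
    """One test-then-train step on a single stream example (x, y)."""
    scores = router.scores(x)
    ranked = sorted(range(len(experts)), key=lambda k: scores[k], reverse=True)
    y_hat = experts[ranked[0]].predict(x)          # prediction from the top-ranked expert
    mask = correctness_mask([e.predict(x) for e in experts], y)
    for k in ranked[:top_k]:                       # only the routed experts see the label
        experts[k].learn(x, y)
    router.update(x, mask)                         # reinforce every expert that was correct
    return y_hat, mask
```

Calling `drift_moe_step` once per arriving example yields a prequential (test-then-train) loop. Because the mask is computed before the experts learn from the label, the router is rewarded only for experts that were already correct on the example.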

