MLP-KAN: Unifying Deep Representation and Function Learning
October 3, 2024
Authors: Yunhong He, Yifeng Xie, Zhengqing Yuan, Lichao Sun
cs.AI
Abstract
Recent advancements in both representation learning and function learning
have demonstrated substantial promise across diverse domains of artificial
intelligence. However, the effective integration of these paradigms poses a
significant challenge, particularly in cases where users must manually decide
whether to apply a representation learning or function learning model based on
dataset characteristics. To address this issue, we introduce MLP-KAN, a unified
method designed to eliminate the need for manual model selection. By
integrating Multi-Layer Perceptrons (MLPs) for representation learning and
Kolmogorov-Arnold Networks (KANs) for function learning within a
Mixture-of-Experts (MoE) architecture, MLP-KAN dynamically adapts to the
specific characteristics of the task at hand, ensuring optimal performance.
Embedded within a transformer-based framework, MLP-KAN achieves remarkable
results on four widely used datasets across diverse domains. Extensive
experimental evaluation demonstrates its superior versatility, delivering
competitive performance across both deep representation and function learning
tasks. These findings highlight the potential of MLP-KAN to simplify the model
selection process, offering a comprehensive, adaptable solution across various
domains. Our code and weights are available at
https://github.com/DLYuanGod/MLP-KAN.
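The core design described in the abstract, a Mixture-of-Experts layer that routes each token between MLP experts (representation learning) and KAN experts (function learning), can be sketched as follows. This is a minimal illustrative sketch, not the authors' released implementation: the expert counts and sizes, the softmax router, and the simplified Fourier-basis KAN expert (the paper builds on spline-based KAN layers) are all assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLPExpert(nn.Module):
    """Standard feed-forward expert for representation learning."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x):
        return self.net(x)

class KANExpert(nn.Module):
    """Toy KAN-style expert: each output is a learned combination of fixed
    Fourier basis functions of each input coordinate. The paper uses
    spline-based KAN layers; this basis is a simplified stand-in."""
    def __init__(self, dim, num_basis=8):
        super().__init__()
        self.register_buffer("freqs", torch.arange(1, num_basis + 1).float())
        # One coefficient per (output, input, basis-function) edge.
        self.coef = nn.Parameter(torch.randn(dim, dim, 2 * num_basis) * 0.01)

    def forward(self, x):                       # x: (..., dim)
        z = x.unsqueeze(-1) * self.freqs        # (..., dim, B)
        basis = torch.cat([torch.sin(z), torch.cos(z)], dim=-1)  # (..., dim, 2B)
        # Sum the learned per-edge functions phi_{oi}(x_i) over inputs i.
        return torch.einsum("...ib,oib->...o", basis, self.coef)

class MLPKANLayer(nn.Module):
    """Soft Mixture-of-Experts over MLP and KAN experts."""
    def __init__(self, dim, n_mlp=2, n_kan=2):
        super().__init__()
        self.experts = nn.ModuleList(
            [MLPExpert(dim, 4 * dim) for _ in range(n_mlp)]
            + [KANExpert(dim) for _ in range(n_kan)]
        )
        self.router = nn.Linear(dim, len(self.experts))

    def forward(self, x):                       # x: (batch, seq, dim)
        gates = F.softmax(self.router(x), dim=-1)             # (B, S, E)
        outs = torch.stack([e(x) for e in self.experts], -1)  # (B, S, D, E)
        return torch.einsum("bsde,bse->bsd", outs, gates)

# Example: the layer keeps the (batch, seq, dim) shape, so it could stand
# in for the feed-forward block of a transformer layer.
layer = MLPKANLayer(dim=64)
out = layer(torch.randn(2, 10, 64))             # -> shape (2, 10, 64)
```

In this reading, the router learns per token whether MLP-style or KAN-style processing fits the input, which is how the abstract's "dynamic adaptation" without manual model selection would be realized.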