MLP-KAN: Unifying Deep Representation and Function Learning

Abstract

Recent advances in both representation learning and function learning have shown substantial promise across diverse domains of artificial intelligence. Integrating the two paradigms effectively remains a significant challenge, however, particularly when users must decide manually, based on dataset characteristics, whether to apply a representation learning model or a function learning model. To address this issue, we introduce MLP-KAN, a unified method designed to eliminate the need for manual model selection. By integrating Multi-Layer Perceptrons (MLPs) for representation learning and Kolmogorov-Arnold Networks (KANs) for function learning within a Mixture-of-Experts (MoE) architecture, MLP-KAN adapts dynamically to the specific characteristics of the task at hand, ensuring optimal performance. Embedded within a transformer-based framework, our method achieves remarkable results on four widely used datasets across diverse domains. Extensive experimental evaluation demonstrates its superior versatility: it delivers competitive performance on both deep representation learning and function learning tasks. These findings highlight the potential of MLP-KAN to simplify the model selection process and to offer a comprehensive, adaptable solution across various domains. Our code and weights are available at https://github.com/DLYuanGod/MLP-KAN.
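The routing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a softmax gate with top-k selection, a one-hidden-layer ReLU network as the MLP expert, and Gaussian bumps standing in for the spline-based univariate functions commonly used in KANs. All class names, sizes, and hyperparameters (MLPExpert, KANExpert, MoELayer, top_k, n_basis) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


class MLPExpert:
    """Representation-style expert: linear -> ReLU -> linear (sizes are assumptions)."""

    def __init__(self, dim, hidden, rng):
        self.w1 = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(hidden)]
        self.w2 = [[rng.gauss(0, 0.1) for _ in range(hidden)] for _ in range(dim)]

    def __call__(self, x):
        h = [max(0.0, v) for v in matvec(self.w1, x)]
        return matvec(self.w2, h)


class KANExpert:
    """Function-style expert: each output sums learnable univariate functions
    of the inputs. Gaussian bumps are an assumed stand-in for splines."""

    def __init__(self, dim, n_basis, rng):
        # Evenly spaced basis centers on [-1, 1]; requires n_basis >= 2.
        self.centers = [-1.0 + 2.0 * k / (n_basis - 1) for k in range(n_basis)]
        # coef[o][i][k]: weight of basis k on the edge from input i to output o.
        self.coef = [[[rng.gauss(0, 0.1) for _ in self.centers]
                      for _ in range(dim)] for _ in range(dim)]

    def __call__(self, x):
        basis = [[math.exp(-((xi - c) ** 2) / 0.5) for c in self.centers] for xi in x]
        return [sum(c * b for edge, bi in zip(out_edges, basis)
                    for c, b in zip(edge, bi))
                for out_edges in self.coef]


class MoELayer:
    """Soft top-k router that mixes the outputs of the selected experts."""

    def __init__(self, dim, experts, top_k, rng):
        self.experts = experts
        self.top_k = top_k
        self.gate = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in experts]

    def route(self, x):
        """Return {expert_index: weight} for the top-k experts, weights summing to 1."""
        probs = softmax(matvec(self.gate, x))
        chosen = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:self.top_k]
        norm = sum(probs[i] for i in chosen)
        return {i: probs[i] / norm for i in chosen}

    def __call__(self, x):
        out = [0.0] * len(x)
        for i, w in self.route(x).items():
            out = [o + w * y for o, y in zip(out, self.experts[i](x))]
        return out


rng = random.Random(0)
dim = 4
experts = [MLPExpert(dim, 8, rng), MLPExpert(dim, 8, rng),
           KANExpert(dim, 5, rng), KANExpert(dim, 5, rng)]
layer = MoELayer(dim, experts, top_k=2, rng=rng)
token = [0.5, -0.2, 0.1, 0.9]
weights = layer.route(token)
output = layer(token)
```

In this sketch the gate decides per input which experts contribute, so no manual choice between the MLP and KAN experts is needed. In a transformer, a layer of this kind would plausibly take the place of the feed-forward block, though that placement is an assumption here rather than something the abstract states.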
