KAN: Kolmogorov-Arnold Networks
April 30, 2024
Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark
cs.AI
Abstract
Inspired by the Kolmogorov-Arnold representation theorem, we propose
Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer
Perceptrons (MLPs). While MLPs have fixed activation functions on nodes
("neurons"), KANs have learnable activation functions on edges ("weights").
KANs have no linear weights at all -- every weight parameter is replaced by a
univariate function parametrized as a spline. We show that this seemingly
simple change makes KANs outperform MLPs in terms of accuracy and
interpretability. For accuracy, much smaller KANs can achieve comparable or
better accuracy than much larger MLPs in data fitting and PDE solving.
Theoretically and empirically, KANs possess faster neural scaling laws than
MLPs. For interpretability, KANs can be intuitively visualized and can easily
interact with human users. Through two examples in mathematics and physics,
KANs are shown to be useful collaborators helping scientists (re)discover
mathematical and physical laws. In summary, KANs are promising alternatives for
MLPs, opening opportunities for further improving today's deep learning models
which rely heavily on MLPs.
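For context, the Kolmogorov-Arnold representation theorem that motivates the architecture states that any continuous function of n variables on a bounded domain can be written as a finite sum of compositions of continuous univariate functions and addition:

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q \left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

Here the inner functions \phi_{q,p} and outer functions \Phi_q are continuous univariate functions. A KAN generalizes this two-layer structure to arbitrary widths and depths, with every univariate function made learnable.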
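To make the edge-activation idea concrete, here is a minimal sketch of a KAN layer in NumPy. This is not the paper's implementation: the names (KANLayer, forward) are hypothetical, and Gaussian radial basis functions stand in for the learnable B-splines the abstract describes. Only the structure follows the text: one learnable univariate function per edge, and nodes that simply sum their incoming edge activations.

```python
import numpy as np

class KANLayer:
    """Minimal sketch of one Kolmogorov-Arnold layer (hypothetical API).

    Each of the in_dim * out_dim edges carries its own learnable univariate
    function, parametrized here as a linear combination of fixed Gaussian
    basis functions -- a stand-in for the B-splines used in the paper.
    """

    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=(-1.0, 1.0)):
        self.centers = np.linspace(*grid_range, num_basis)        # basis centers
        self.width = (grid_range[1] - grid_range[0]) / num_basis  # basis width
        # One coefficient vector per edge: shape (out_dim, in_dim, num_basis).
        self.coef = np.random.randn(out_dim, in_dim, num_basis) * 0.1

    def forward(self, x):
        # x: (batch, in_dim). Evaluate every basis function at every input value,
        # giving basis of shape (batch, in_dim, num_basis).
        basis = np.exp(-(((x[..., None] - self.centers) / self.width) ** 2))
        # Edge activations phi[b, o, i] = sum_k coef[o, i, k] * basis[b, i, k].
        phi = np.einsum("oik,bik->boi", self.coef, basis)
        # A node sums its incoming edge activations: (batch, out_dim).
        return phi.sum(axis=-1)

# Two stacked layers with shape [2, 5, 1] mirror the [n, 2n+1, 1] structure
# of the Kolmogorov-Arnold representation for an n = 2 variable function.
layer1 = KANLayer(in_dim=2, out_dim=5)
layer2 = KANLayer(in_dim=5, out_dim=1)
y = layer2.forward(layer1.forward(np.random.uniform(-1, 1, size=(4, 2))))
print(y.shape)  # (4, 1)
```

In this sketch only the forward structure is shown; fitting the per-edge coefficients by gradient descent and refining the grid are what the paper's actual B-spline parametrization handles.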