mHC: Manifold-Constrained Hyper-Connections

December 31, 2025
Authors: Zhenda Xie, Yixuan Wei, Huanqi Cao, Chenggang Zhao, Chengqi Deng, Jiashi Li, Damai Dai, Huazuo Gao, Jiang Chang, Liang Zhao, Shangyan Zhou, Zhean Xu, Zhengyan Zhang, Wangding Zeng, Shengding Hu, Yuqing Wang, Jingyang Yuan, Lean Wang, Wenfeng Liang
cs.AI

Abstract

Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifying connectivity patterns. While yielding substantial performance gains, this diversification fundamentally compromises the identity mapping property intrinsic to the residual connection, which causes severe training instability and restricted scalability, and additionally incurs notable memory access overhead. To address these challenges, we propose Manifold-Constrained Hyper-Connections (mHC), a general framework that projects the residual connection space of HC onto a specific manifold to restore the identity mapping property, while incorporating rigorous infrastructure optimization to ensure efficiency. Empirical experiments demonstrate that mHC is effective for training at scale, offering tangible performance improvements and superior scalability. We anticipate that mHC, as a flexible and practical extension of HC, will contribute to a deeper understanding of topological architecture design and suggest promising directions for the evolution of foundational models.
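
To make the idea of constraining the residual mixing concrete, here is a minimal sketch. The abstract does not say which manifold mHC projects onto, so the choice below (doubly stochastic matrices, reached by Sinkhorn-style row and column normalization) is an assumption made for illustration only. The function names `sinkhorn`, `compose`, and `max_row_col_sum_error` are hypothetical. The sketch composes many random stream-mixing matrices, with and without the constraint, to show that the constrained product keeps row and column sums near 1, which is the sense in which it stays close to an identity-like mapping across depth.

```python
import math
import random


def sinkhorn(scores, iters=500):
    """Map a square matrix of real scores to an approximately doubly
    stochastic matrix by alternating row and column normalization.
    (Assumed constraint for illustration; not taken from the abstract.)"""
    n = len(scores)
    m = [[math.exp(v) for v in row] for row in scores]  # strictly positive
    for _ in range(iters):
        m = [[v / sum(row) for v in row] for row in m]  # rows sum to 1
        col = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col[j] for j in range(n)] for i in range(n)]  # columns sum to 1
    return m


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_row_col_sum_error(m):
    """Largest deviation of any row sum or column sum from 1."""
    n = len(m)
    rows = [abs(sum(m[i]) - 1.0) for i in range(n)]
    cols = [abs(sum(m[i][j] for i in range(n)) - 1.0) for j in range(n)]
    return max(rows + cols)


def compose(depth, n, constrained, seed=0):
    """Multiply `depth` random n x n stream-mixing matrices together,
    optionally projecting each one with `sinkhorn` first."""
    rng = random.Random(seed)
    acc = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(depth):
        raw = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
        layer = sinkhorn(raw) if constrained else raw
        acc = matmul(layer, acc)
    return acc


if __name__ == "__main__":
    for constrained in (False, True):
        acc = compose(depth=24, n=4, constrained=constrained)
        largest = max(abs(v) for row in acc for v in row)
        print(f"constrained={constrained}: "
              f"max |entry| = {largest:.3e}, "
              f"row/col sum error = {max_row_col_sum_error(acc):.3e}")
```

The design point is that a product of doubly stochastic matrices is itself doubly stochastic, so the total signal summed across residual streams is preserved at any depth, whereas an unconstrained product has no such guarantee. Whether mHC uses this particular constraint is not stated in the abstract above.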