The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

April 2, 2026
Authors: Xinlei Yu, Zhangquan Chen, Yongbo He, Tianyu Fu, Cheng Yang, Chengming Xu, Yue Ma, Xiaobin Hu, Zhe Cao, Jie Xu, Guibin Zhang, Jiale Tao, Jiayi Zhang, Siyuan Ma, Kaituo Feng, Haojie Huang, Youxing Li, Ronghao Chen, Huacan Wang, Chenglin Wu, Zikun Su, Xiaogang Xu, Kelu Yao, Kun Wang, Chen Gao, Yue Liao, Ruqi Huang, Tao Jin, Cheng Tan, Jiangning Zhang, Wenqi Ren, Yanwei Fu, Yong Liu, Yu Wang, Xiangyu Yue, Yu-Gang Jiang, Shuicheng Yan
cs.AI

Abstract

Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift is driven by the structural limitations of explicit-space computation, including linguistic redundancy, discretization bottlenecks, sequential inefficiency, and semantic loss. This survey aims to provide a unified and up-to-date landscape of latent space in language-based models. We organize the survey into five sequential perspectives: Foundation, Evolution, Mechanism, Ability, and Outlook. We begin by delineating the scope of latent space, distinguishing it from explicit or verbal space and from the latent spaces commonly studied in generative visual models. We then trace the field's evolution from early exploratory efforts to the current large-scale expansion. To organize the technical landscape, we examine existing work through the complementary lenses of mechanism and ability. From the perspective of Mechanism, we identify four major lines of development: Architecture, Representation, Computation, and Optimization. From the perspective of Ability, we show how latent space supports a broad capability spectrum spanning Reasoning, Planning, Modeling, Perception, Memory, Collaboration, and Embodiment. Beyond consolidation, we discuss the key open challenges, and outline promising directions for future research. We hope this survey serves not only as a reference for existing work, but also as a foundation for understanding latent space as a general computational and systems paradigm for next-generation intelligence.