The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

April 2, 2026
Authors: Xinlei Yu, Zhangquan Chen, Yongbo He, Tianyu Fu, Cheng Yang, Chengming Xu, Yue Ma, Xiaobin Hu, Zhe Cao, Jie Xu, Guibin Zhang, Jiale Tao, Jiayi Zhang, Siyuan Ma, Kaituo Feng, Haojie Huang, Youxing Li, Ronghao Chen, Huacan Wang, Chenglin Wu, Zikun Su, Xiaogang Xu, Kelu Yao, Kun Wang, Chen Gao, Yue Liao, Ruqi Huang, Tao Jin, Cheng Tan, Jiangning Zhang, Wenqi Ren, Yanwei Fu, Yong Liu, Yu Wang, Xiangyu Yue, Yu-Gang Jiang, Shuicheng Yan
cs.AI

Abstract

Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, a growing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift is driven by the structural limitations of explicit-space computation, including linguistic redundancy, discretization bottlenecks, sequential inefficiency, and semantic loss. This survey aims to provide a unified and up-to-date landscape of latent space in language-based models. We organize the survey into five sequential perspectives: Foundation, Evolution, Mechanism, Ability, and Outlook. We begin by delineating the scope of latent space, distinguishing it from explicit or verbal space and from the latent spaces commonly studied in generative visual models. We then trace the field's evolution from early exploratory efforts to the current large-scale expansion. To organize the technical landscape, we examine existing work through the complementary lenses of mechanism and ability. From the perspective of Mechanism, we identify four major lines of development: Architecture, Representation, Computation, and Optimization. From the perspective of Ability, we show how latent space supports a broad capability spectrum spanning Reasoning, Planning, Modeling, Perception, Memory, Collaboration, and Embodiment. Beyond consolidation, we discuss the key open challenges and outline promising directions for future research. We hope this survey serves not only as a reference for existing work, but also as a foundation for understanding latent space as a general computational and systems paradigm for next-generation intelligence.
April 4, 2026