LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning
February 6, 2026
Authors: Xinwu Ye, Yicheng Mao, Jia Zhang, Yimeng Liu, Li Hao, Fang Wu, Zhiwei Li, Yuxuan Liao, Zehong Wang, Zhiyuan Liu, Zhenfei Yin, Li Yuan, Philip Torr, Huan Sun, Xiangxiang Zeng, Mengdi Wang, Le Cong, Shenghua Gao, Xiangru Tang
cs.AI
Abstract
Chemical large language models (LLMs) predominantly rely on explicit Chain-of-Thought (CoT) reasoning in natural language to perform complex reasoning. However, chemical reasoning is inherently continuous and structural, and forcing it into discrete linguistic tokens introduces a fundamental representation mismatch that constrains both efficiency and performance. We introduce LatentChem, a latent reasoning interface that decouples chemical computation from textual generation, enabling models to perform multi-step reasoning directly in continuous latent space while emitting language only for final outputs. Remarkably, we observe a consistent emergent behavior: when optimized solely for task success, models spontaneously internalize reasoning, progressively abandoning verbose textual derivations in favor of implicit latent computation. This shift is not merely stylistic but computationally advantageous. Across diverse chemical reasoning benchmarks, LatentChem achieves a 59.88% non-tie win rate over strong CoT-based baselines on ChemCoTBench, while delivering a 10.84× average inference speedup. Our results provide empirical evidence that chemical reasoning is more naturally and effectively realized as continuous latent dynamics rather than discretized linguistic trajectories.
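The abstract's core mechanism, multi-step reasoning carried out as hidden-state updates with language emitted only for the final answer, can be sketched in toy form. The snippet below is a minimal illustration, not the paper's implementation: `latent_step` stands in for a transformer feeding its last hidden state back as the next input embedding, and the dimension, weights, and function names are all assumptions made for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy latent dimension (the paper does not specify its architecture)

# Random weights standing in for a model's hidden-state update.
W = rng.standard_normal((D, D)) / np.sqrt(D)

def latent_step(h):
    """One reasoning step performed entirely in continuous latent space:
    the hidden state is transformed and fed back without decoding a token."""
    return np.tanh(W @ h)

def latent_reason(h0, n_steps):
    """Run n_steps of silent latent computation; only the final state is
    ever decoded into language (the final answer)."""
    h = h0
    for _ in range(n_steps):
        h = latent_step(h)
    return h

h0 = rng.standard_normal(D)          # latent encoding of the problem
h_final = latent_reason(h0, n_steps=5)
# No intermediate tokens were generated; decoding happens once, at the end.
print(h_final.shape)  # (8,)
```

The contrast with textual CoT is that each intermediate step here costs one matrix-vector product rather than a full autoregressive decoding pass over many tokens, which is the intuition behind the reported inference speedup.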