ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought
January 30, 2026
Authors: Fanmeng Wang, Haotian Liu, Guojiang Zhao, Hongteng Xu, Zhifeng Gao
cs.AI
Abstract
While Chain-of-Thought (CoT) significantly enhances the performance of Large Language Models (LLMs), explicit reasoning chains introduce substantial computational redundancy. Recent latent reasoning methods attempt to mitigate this by compressing the reasoning process into a latent space, but they often suffer severe performance degradation because they lack appropriate compression guidance. In this study, we propose Rendered CoT-Guided variational Latent Reasoning (ReGuLaR), a simple yet novel latent learning paradigm that resolves this issue. Fundamentally, we formulate latent reasoning within the Variational Auto-Encoder (VAE) framework, sampling the current latent reasoning state from a posterior distribution conditioned on the previous ones. Specifically, when learning this variational latent reasoning model, we render explicit reasoning chains as images and extract dense visual-semantic representations from them to regularize the posterior distribution, thereby achieving efficient compression with minimal information loss. Extensive experiments demonstrate that ReGuLaR significantly outperforms existing latent reasoning methods in both computational efficiency and reasoning effectiveness, and even surpasses CoT through multi-modal reasoning, providing a new and insightful solution for latent reasoning. Code: https://github.com/FanmengWang/ReGuLaR.
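To make the variational formulation sketched in the abstract concrete, one plausible per-step training objective (the notation z_t, v_t, the KL weight \beta, and the conditional prior p_\psi are illustrative assumptions, not taken from the paper) is

\[
\mathcal{L}_t \;=\; \mathbb{E}_{q_\phi(z_t \mid z_{<t},\, x)}\!\big[-\log p_\theta(y \mid z_{\le t},\, x)\big] \;+\; \beta \, D_{\mathrm{KL}}\!\big(q_\phi(z_t \mid z_{<t},\, x) \,\|\, p_\psi(z_t \mid v_t)\big),
\]

where x is the input question, y the answer, z_t the latent reasoning state at step t sampled from the posterior q_\phi conditioned on previous states, and v_t the dense visual-semantic representation extracted from the rendered CoT image. The KL term here is one hedged reading of how the rendered-CoT features could regularize the posterior, as described above; the paper's exact conditioning structure and loss weighting may differ.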