

Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision

February 12, 2026
Authors: Xiaohan He, Shiyang Feng, Songtao Huang, Lei Bai, Bin Wang, Bo Zhang
cs.AI

Abstract

Large language models (LLMs) have demonstrated exceptional reasoning capabilities, and co-evolving paradigms have shown promising results in domains such as code and math. However, in scientific reasoning tasks, these models remain fragile due to unreliable solution evaluation and limited diversity in verification strategies. In this work, we propose Sci-CoE, a two-stage scientific co-evolving framework that enables models to self-evolve as both solver and verifier through a transition from sparse supervision to unsupervised learning. In the first stage, the model uses a small set of annotated data to establish fundamental correctness judgment anchors for the verifier. In the second stage, we introduce a geometric reward mechanism that jointly considers consensus, reliability, and diversity, driving large-scale self-iteration on unlabeled data. Experiments on several general scientific benchmarks demonstrate that Sci-CoE enhances complex reasoning capabilities and exhibits strong scalability, facilitating the construction of more robust and diverse evaluation systems. Code is available at https://github.com/InternScience/Sci-CoE.
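
The abstract names a geometric reward that jointly considers consensus, reliability, and diversity but does not state how the three signals are combined. Below is a minimal sketch, assuming each signal is a score in (0, 1] and the reward is their geometric mean, so weakness on any one axis pulls the overall reward down; the function name `geometric_reward` and this particular combination are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def geometric_reward(consensus: float, reliability: float, diversity: float) -> float:
    """Combine three verifier-side signals into a single reward.

    Hypothetical reading of the "geometric reward mechanism": take the
    geometric mean of the three scores, so a low value on any one axis
    suppresses the reward. The actual combination used by Sci-CoE may differ.
    """
    scores = np.array([consensus, reliability, diversity], dtype=float)
    scores = np.clip(scores, 1e-8, 1.0)          # keep scores in (0, 1] for numerical stability
    return float(np.exp(np.log(scores).mean()))  # geometric mean of the three terms

# Example: strong consensus and reliability but low diversity yields a modest reward.
print(geometric_reward(consensus=0.9, reliability=0.8, diversity=0.3))  # ~0.6
```

With a multiplicative combination like this, a candidate solution that all verifiers agree on but that adds no verification diversity is rewarded less than one that scores moderately on all three axes, which matches the abstract's stated goal of a more robust and diverse evaluation signal.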