SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines

September 25, 2025
Authors: Yizhou Wang, Chen Tang, Han Deng, Jiabei Xiao, Jiaqi Liu, Jianyu Wu, Jun Yao, Pengze Li, Encheng Su, Lintao Wang, Guohang Zhuang, Yuchen Ren, Ben Fei, Ming Hu, Xin Chen, Dongzhan Zhou, Junjun He, Xiangyu Yue, Zhenfei Yin, Jiamin Wu, Qihao Zheng, Yuhao Zhou, Huihui Xu, Chenglong Ma, Yan Lu, Wenlong Zhang, Chunfeng Song, Philip Torr, Shixiang Tang, Xinzhu Ma, Wanli Ouyang, Lei Bai
cs.AI

Abstract

We present a scientific reasoning foundation model that aligns natural language with heterogeneous scientific representations. The model is pretrained on a 206B-token corpus spanning scientific text, pure sequences, and sequence-text pairs. It is then aligned in three stages: supervised fine-tuning (SFT) on 40M instructions, annealed cold-start bootstrapping to elicit long-form chain-of-thought, and reinforcement learning with task-specific reward shaping, which together instill deliberate scientific reasoning. It supports five capability families, covering up to 103 tasks across workflows: (i) faithful translation between text and scientific formats, (ii) text/knowledge extraction, (iii) property prediction, (iv) property classification, and (v) unconditional and conditional sequence generation and design. Compared with specialist systems, our approach broadens instruction coverage, improves cross-domain generalization, and enhances fidelity. We detail data curation and training, and show that cross-discipline learning strengthens transfer and downstream reliability. The model, instruction-tuning datasets, and evaluation code are open-sourced at https://huggingface.co/SciReason and https://github.com/open-sciencelab/SciReason.