SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines
September 25, 2025
Authors: Yizhou Wang, Chen Tang, Han Deng, Jiabei Xiao, Jiaqi Liu, Jianyu Wu, Jun Yao, Pengze Li, Encheng Su, Lintao Wang, Guohang Zhuang, Yuchen Ren, Ben Fei, Ming Hu, Xin Chen, Dongzhan Zhou, Junjun He, Xiangyu Yue, Zhenfei Yin, Jiamin Wu, Qihao Zheng, Yuhao Zhou, Huihui Xu, Chenglong Ma, Yan Lu, Wenlong Zhang, Chunfeng Song, Philip Torr, Shixiang Tang, Xinzhu Ma, Wanli Ouyang, Lei Bai
cs.AI
Abstract
We present a scientific reasoning foundation model that aligns natural
language with heterogeneous scientific representations. The model is pretrained
on a 206B-token corpus spanning scientific text, pure sequences, and
sequence-text pairs, then aligned via supervised fine-tuning (SFT) on 40M instructions, annealed
cold-start bootstrapping to elicit long-form chain-of-thought, and
reinforcement learning with task-specific reward shaping, which instills
deliberate scientific reasoning. It supports five capability families, covering
up to 103 tasks across workflows: (i) faithful translation between text and
scientific formats, (ii) text/knowledge extraction, (iii) property prediction,
(iv) property classification, (v) unconditional and conditional sequence
generation and design. Compared with specialist systems, our approach broadens
instruction coverage, improves cross-domain generalization, and enhances
fidelity. We detail data curation and training and show that cross-discipline
learning strengthens transfer and downstream reliability. The model, the
instruction-tuning datasets, and the evaluation code are open-sourced at
https://huggingface.co/SciReason and
https://github.com/open-sciencelab/SciReason.
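
The abstract mentions reinforcement learning with task-specific reward shaping but does not say how the rewards are defined. As a rough illustration only, the sketch below shows one way per-task rewards could be dispatched by task family. The function names, the task keys, the exact-match rule, and the exponential error decay are all assumptions made for this example, not the authors' method.

```python
import math
from typing import Callable, Dict


def classification_reward(prediction: str, reference: str) -> float:
    """Hypothetical exact-match reward: 1.0 if labels agree, else 0.0."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0


def regression_reward(prediction: str, reference: str, scale: float = 1.0) -> float:
    """Hypothetical reward that decays exponentially with absolute error.

    Returns 0.0 when either value cannot be parsed as a number.
    """
    try:
        error = abs(float(prediction) - float(reference))
    except ValueError:
        return 0.0
    if math.isnan(error):
        return 0.0
    return math.exp(-error / scale)


# Assumed mapping from task family to reward function (illustrative keys).
REWARDS: Dict[str, Callable[[str, str], float]] = {
    "property_classification": classification_reward,
    "property_prediction": regression_reward,
}


def shaped_reward(task: str, prediction: str, reference: str) -> float:
    """Dispatch to the reward function registered for the given task family."""
    if task not in REWARDS:
        raise KeyError(f"no reward registered for task family: {task}")
    return REWARDS[task](prediction, reference)
```

For instance, `shaped_reward("property_prediction", "1.0", "2.0")` yields a value strictly between 0 and 1, while an unparseable prediction yields 0.0. A registry of this kind keeps each task family's scoring rule separate, so a new family can be added without touching the others.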