SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines

Abstract

We present a scientific reasoning foundation model that aligns natural language with heterogeneous scientific representations. The model is pretrained on a 206B-token corpus spanning scientific text, pure sequences, and sequence-text pairs. It is then aligned via supervised fine-tuning (SFT) on 40M instructions, annealed cold-start bootstrapping to elicit long-form chain-of-thought, and reinforcement learning with task-specific reward shaping, which together instill deliberate scientific reasoning. It supports five capability families, covering up to 103 tasks across workflows: (i) faithful translation between text and scientific formats, (ii) text/knowledge extraction, (iii) property prediction, (iv) property classification, and (v) unconditional and conditional sequence generation and design. Compared with specialist systems, our approach broadens instruction coverage, improves cross-domain generalization, and enhances fidelity. We detail data curation and training, and show that cross-discipline learning strengthens transfer and downstream reliability. The model, the instruction-tuning datasets, and the evaluation code are open-sourced at https://huggingface.co/SciReason and https://github.com/open-sciencelab/SciReason.
