SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines

Abstract

We present a scientific reasoning foundation model that aligns natural language with heterogeneous scientific representations. The model is pretrained on a 206B-token corpus spanning scientific text, pure sequences, and sequence-text pairs. It is then aligned via supervised fine-tuning (SFT) on 40M instructions, annealed cold-start bootstrapping to elicit long-form chain-of-thought, and reinforcement learning with task-specific reward shaping, which instills deliberate scientific reasoning. It supports four capability families, covering up to 103 tasks across workflows: (i) faithful translation between text and scientific formats, (ii) text/knowledge extraction, (iii) property prediction, (iv) property classification, and (v) unconditional and conditional sequence generation and design. Compared with specialist systems, our approach broadens instruction coverage, improves cross-domain generalization, and enhances fidelity. We detail data curation and training, and show that cross-discipline learning strengthens transfer and downstream reliability. The model, instruction-tuning datasets, and evaluation code are open-sourced at https://huggingface.co/SciReason and https://github.com/open-sciencelab/SciReason.
