AI Can Learn Scientific Taste

Abstract

Great scientists have strong judgment and foresight, closely tied to what we call scientific taste. Here, we use the term to refer to the capacity to judge and propose research ideas with high potential impact. However, most related research focuses on improving an AI scientist's execution capability, while enhancing an AI's scientific taste remains underexplored. In this work, we propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and we formulate scientific taste learning as a preference modeling and alignment problem. For preference modeling, we train Scientific Judge on 700K field- and time-matched pairs of high- vs. low-citation papers to judge ideas. For preference alignment, we use Scientific Judge as a reward model to train a policy model, Scientific Thinker, to propose research ideas with high potential impact. Experiments show that Scientific Judge outperforms SOTA LLMs (e.g., GPT-5.2, Gemini 3 Pro) and generalizes to future-year test sets, unseen fields, and peer-review preference. Furthermore, Scientific Thinker proposes research ideas with higher potential impact than baselines. Our findings show that AI can learn scientific taste, marking a key step toward human-level AI scientists.
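To make the preference-modeling setup concrete, the following is a minimal sketch of how field- and time-matched pairs of high- vs. low-citation papers might be built and how a judge could be scored on them. The abstract does not specify the pairing rule or the evaluation metric, so everything here is an assumption for illustration: the record keys (`id`, `field`, `year`, `citations`), the `min_ratio` citation-gap threshold, and the helper names `build_matched_pairs` and `pairwise_accuracy` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_matched_pairs(papers, min_ratio=2.0):
    """Pair papers from the same field and year; the more-cited one is preferred.

    `papers` is a list of dicts with hypothetical keys: id, field, year, citations.
    `min_ratio` is an illustrative citation-gap threshold, not a value from the paper.
    Returns a list of (preferred_id, other_id) tuples.
    """
    # Group papers so that every pair is matched on field and publication year.
    groups = defaultdict(list)
    for paper in papers:
        groups[(paper["field"], paper["year"])].append(paper)

    pairs = []
    for group in groups.values():
        for a, b in combinations(group, 2):
            high, low = (a, b) if a["citations"] >= b["citations"] else (b, a)
            # Keep only pairs with a clear citation gap (assumed rule).
            if high["citations"] >= min_ratio * max(low["citations"], 1):
                pairs.append((high["id"], low["id"]))
    return pairs


def pairwise_accuracy(pairs, score):
    """Fraction of pairs where `score` ranks the preferred paper strictly higher."""
    if not pairs:
        return 0.0
    correct = sum(score(high) > score(low) for high, low in pairs)
    return correct / len(pairs)
```

In this sketch, `score` stands in for any judge that maps a paper (or idea) identifier to a scalar; a learned model such as Scientific Judge would take that role, though its actual interface is not described in the abstract.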
