MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier

Abstract

While large language models (LLMs) show promise in scientific discovery, existing research focuses on inference or feedback-driven training, leaving direct modeling of the generative reasoning process, P(hypothesis|background), written P(h|b), unexplored. We demonstrate that directly training P(h|b) is mathematically intractable because of the combinatorial complexity (O(N^k)) inherent in retrieving and composing inspirations from a vast knowledge base. To break this barrier, we introduce MOOSE-Star, a unified framework that enables tractable training and scalable inference. In the best case, MOOSE-Star reduces complexity from exponential to logarithmic (O(log N)) by (1) training on decomposed subtasks derived from the probabilistic equation of discovery, (2) employing motivation-guided hierarchical search to enable logarithmic retrieval and prune irrelevant subspaces, and (3) utilizing bounded composition for robustness against retrieval noise. To support this, we release TOMATO-Star, a dataset of 108,717 decomposed papers (38,400 GPU hours) for training. Furthermore, we show that while brute-force sampling hits a "complexity wall," MOOSE-Star exhibits continuous test-time scaling.
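The gap between the two complexity classes named in the abstract can be made concrete with a little arithmetic. The sketch below is only an illustration under stated assumptions, not the MOOSE-Star implementation: it assumes a hypothesis combines k inspirations drawn from N candidates, that brute force enumerates every ordered k-tuple, and that the hierarchical alternative descends a balanced tree with a fixed branching factor once per inspiration. The function names and the values of N, k, and the branching factor are hypothetical.

```python
def brute_force_candidates(n: int, k: int) -> int:
    """Number of ordered k-tuples of inspirations drawn from n items (N^k)."""
    return n ** k


def tree_depth(n: int, branching: int) -> int:
    """Depth of a balanced tree with the given branching factor and >= n leaves."""
    depth, capacity = 0, 1
    while capacity < n:
        capacity *= branching
        depth += 1
    return depth


def hierarchical_steps(n: int, k: int, branching: int) -> int:
    """Node comparisons for k sequential root-to-leaf descents.

    Assumption: each descent scores every child at each level, so one
    retrieval costs branching * depth comparisons, i.e. O(log N) in n.
    """
    return k * branching * tree_depth(n, branching)


if __name__ == "__main__":
    n, k, branching = 100_000, 3, 10  # hypothetical sizes
    print("brute force :", brute_force_candidates(n, k))
    print("hierarchical:", hierarchical_steps(n, k, branching))
```

Under these assumptions the brute-force count grows as N^k, while the hierarchical cost grows only with the tree depth. This is the best-case behavior the abstract describes; it holds only when each level of the search selects the correct branch.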
