Spacer: Towards Engineered Scientific Inspiration

Abstract

Recent advances in large language models (LLMs) have made automated scientific research the next frontier on the path to artificial superintelligence. However, existing systems are either confined to tasks of narrow scope or bound by the limited creative capabilities of LLMs. We propose Spacer, a scientific discovery system that develops creative and factually grounded concepts without external intervention. Spacer attempts to achieve this via "deliberate decontextualization," an approach that disassembles information into atomic units (keywords) and draws creativity from unexplored connections between them. Spacer consists of (i) Nuri, an inspiration engine that builds keyword sets, and (ii) the Manifesting Pipeline, which refines these sets into elaborate scientific statements. Nuri extracts novel, high-potential keyword sets from a keyword graph built from 180,000 academic publications in biological fields. The Manifesting Pipeline finds links between keywords, analyzes their logical structure, validates their plausibility, and ultimately drafts original scientific concepts. In our experiments, Nuri's evaluation metric classifies high-impact publications with an AUROC score of 0.737. The Manifesting Pipeline also successfully reconstructs core concepts from the latest top-journal articles using only their keyword sets. An LLM-based scoring system estimates that this reconstruction is sound in over 85% of cases. Finally, our embedding space analysis shows that outputs from Spacer are significantly more similar to leading publications than those from state-of-the-art (SOTA) LLMs.
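
For readers unfamiliar with the AUROC figure quoted above, the following is a minimal sketch of how such a score can be computed for any scalar metric against binary "high-impact" labels. It is not Nuri's evaluation code: the function name `auroc` and the toy scores and labels are assumptions made purely for illustration. The sketch uses the pairwise interpretation of AUROC, namely the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUROC: P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as half a win.
    Labels are 1 (e.g. high-impact) or 0 (not high-impact).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; these are not values from the paper.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    print(f"AUROC on toy data: {auroc(toy_scores, toy_labels):.3f}")
```

Under this reading, a score of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 0.737 means that a high-impact publication outranks a non-high-impact one in roughly 74% of random pairs. The quadratic loop is chosen for clarity; a rank-based implementation would be preferable for large evaluation sets.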
