Spacer: Towards Engineered Scientific Inspiration
August 25, 2025
Authors: Minhyeong Lee, Suyoung Hwang, Seunghyun Moon, Geonho Nah, Donghyun Koh, Youngjun Cho, Johyun Park, Hojin Yoo, Jiho Park, Haneul Choi, Sungbin Moon, Taehoon Hwang, Seungwon Kim, Jaeyeong Kim, Seongjun Kim, Juneau Jung
cs.AI
Abstract
Recent advances in LLMs have made automated scientific research the next frontline in the path to artificial superintelligence. However, these systems are bound either to tasks of narrow scope or to the limited creative capabilities of LLMs. We propose Spacer, a scientific discovery system that develops creative and factually grounded concepts without external intervention. Spacer attempts to achieve this via 'deliberate decontextualization,' an approach that disassembles information into atomic units (keywords) and draws creativity from unexplored connections between them. Spacer consists of (i) Nuri, an inspiration engine that builds keyword sets, and (ii) the Manifesting Pipeline, which refines these sets into elaborate scientific statements. Nuri extracts novel, high-potential keyword sets from a keyword graph built from 180,000 academic publications in biological fields. The Manifesting Pipeline finds links between keywords, analyzes their logical structure, validates their plausibility, and ultimately drafts original scientific concepts. In our experiments, Nuri's evaluation metric accurately classifies high-impact publications with an AUROC score of 0.737. Our Manifesting Pipeline also successfully reconstructs the core concepts of the latest top-journal articles solely from their keyword sets. An LLM-based scoring system estimates that this reconstruction was sound in over 85% of cases. Finally, our embedding-space analysis shows that outputs from Spacer are significantly more similar to leading publications than those from SOTA LLMs.
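
As a rough illustration of the 'deliberate decontextualization' idea, the sketch below builds a keyword co-occurrence graph from per-paper keyword lists and scores candidate keyword sets that combine well-established keywords with few existing connections between them. The graph construction, the scoring heuristic, and every name in the snippet are assumptions made for illustration, not Nuri's actual algorithm.

```python
# Illustrative sketch (not the paper's implementation): mine a keyword
# co-occurrence graph for keyword sets whose members are individually
# well-established but rarely connected to one another.
import itertools
import networkx as nx

def build_keyword_graph(papers):
    """papers: iterable of keyword lists, one list per publication."""
    g = nx.Graph()
    for keywords in papers:
        for kw in keywords:
            g.add_node(kw)
        for a, b in itertools.combinations(set(keywords), 2):
            w = g.get_edge_data(a, b, {}).get("weight", 0)
            g.add_edge(a, b, weight=w + 1)
    return g

def novelty_score(g, keyword_set):
    """Heuristic: favor frequent keywords with few existing co-occurrences."""
    popularity = sum(g.degree(kw, weight="weight") for kw in keyword_set)
    overlap = sum(
        g.get_edge_data(a, b, {}).get("weight", 0)
        for a, b in itertools.combinations(keyword_set, 2)
    )
    return popularity / (1 + overlap)

# Toy corpus of per-paper keyword lists (hypothetical data)
papers = [
    ["crispr", "gene editing", "off-target effects"],
    ["gut microbiome", "immune response", "gene editing"],
    ["protein folding", "deep learning", "immune response"],
]
g = build_keyword_graph(papers)
candidate = ("crispr", "gut microbiome", "protein folding")
print(novelty_score(g, candidate))  # higher score = popular but unexplored combination
```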
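
The reported AUROC of 0.737 measures how well Nuri's metric separates high-impact publications from the rest. A minimal sketch of such an evaluation, using made-up labels and scores rather than any data from the paper, is:

```python
# Minimal sketch of the AUROC evaluation, with made-up labels and scores.
from sklearn.metrics import roc_auc_score

# 1 = high-impact publication, 0 = ordinary publication (hypothetical labels)
labels = [1, 0, 1, 1, 0, 0, 1, 0]
# Scores assigned by a novelty/potential metric such as Nuri's (hypothetical)
scores = [0.91, 0.35, 0.62, 0.80, 0.48, 0.22, 0.30, 0.40]

print(roc_auc_score(labels, scores))  # prints 0.8125 on this toy data
```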
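
The embedding-space comparison against SOTA LLMs could be approximated as below; the sentence-embedding model and the mean-of-best-match cosine similarity are assumed choices for illustration, not the paper's reported protocol.

```python
# Illustrative sketch (assumed model and metric): compare generated concepts
# against leading publications in a shared sentence-embedding space.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def mean_max_similarity(generated, references):
    """For each generated concept, take its best cosine similarity to any
    reference abstract, then average over all generated concepts."""
    g = model.encode(generated, normalize_embeddings=True)
    r = model.encode(references, normalize_embeddings=True)
    sims = g @ r.T  # cosine similarities (embeddings are unit-norm)
    return float(np.mean(sims.max(axis=1)))

# Placeholder inputs; in practice these would be generated statements and
# abstracts of leading publications.
spacer_outputs = ["<generated scientific statement>"]
baseline_outputs = ["<statement from a baseline LLM>"]
leading_papers = ["<abstract of a leading publication>"]

print("Spacer  :", mean_max_similarity(spacer_outputs, leading_papers))
print("Baseline:", mean_max_similarity(baseline_outputs, leading_papers))
```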