SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research

May 20, 2026
Authors: Shuofei Qiao, Yunxiang Wei, Jiazheng Fan, Bin Wu, Busheng Zhang, Mengru Wang, Yuqi Zhu, Ningyu Zhang, Keyan Ding, Qiang Zhang, Huajun Chen
cs.AI

Abstract

The exponential growth of global academic output has confronted researchers and AI agents with an unprecedented "information explosion," in which fragmented and unstructured knowledge organization impedes deep interdisciplinary integration. Current academic retrieval tools rely predominantly on superficial keyword matching or vector-space semantic retrieval, both of which lack the topological reasoning capabilities required to navigate complex logical connections. Agentic deep-research frameworks, meanwhile, are often prone to logical hallucinations and incur high inference costs. To bridge this gap, this report introduces SciAtlas, a large-scale, multi-disciplinary knowledge graph of heterogeneous academic resources, designed as a panoramic scientific evolution network. Integrating over 43M papers from 26 disciplines, for a total of 157M entities and 3B triplets, SciAtlas provides a structured topological cognitive substrate that dismantles disciplinary barriers and furnishes AI agents with a global perspective. Furthermore, we develop a neuro-symbolic retrieval algorithm featuring tri-path collaborative recall and graph reranking, achieving a seamless transition from simple semantic matching to deterministic association discovery. We also present key application directions of SciAtlas, including literature review, automated research trend synthesis, idea positioning, and academic trajectory exploration, to demonstrate that SciAtlas can serve as an effective "cognitive map" that empowers the full loop of automated scientific research while significantly reducing reasoning costs. We have released the interfaces for KG retrieval and various downstream tasks in our GitHub repo.
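
The abstract names "tri-path collaborative recall and graph reranking" without specifying the three paths or the scoring function. The following is only a toy sketch of the general idea, not the authors' algorithm. It assumes the three paths are keyword overlap, vector similarity, and one-hop graph expansion, and that reranking adds a connectivity bonus to a similarity score. Every paper ID, title, vector, edge, function name, and the weight `alpha` is hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus; none of this data comes from SciAtlas.
PAPERS = {
    "p1": {"title": "graph neural networks for molecules", "vec": [1.0, 0.0]},
    "p2": {"title": "protein structure prediction", "vec": [0.8, 0.6]},
    "p3": {"title": "molecular property prediction survey", "vec": [0.9, 0.1]},
    "p4": {"title": "medieval trade routes", "vec": [0.0, 1.0]},
    "p5": {"title": "message passing for chemistry", "vec": [0.7, 0.7]},
}
# Hypothetical undirected edges (e.g. citation links).
EDGES = [("p1", "p3"), ("p2", "p3"), ("p3", "p5")]


def build_adjacency(edges):
    """Return a node -> set-of-neighbors map for an undirected edge list."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def keyword_recall(query, k):
    """Path 1 (assumed): rank papers by the number of shared title tokens."""
    q_tokens = set(query.lower().split())
    overlap = {pid: len(q_tokens & set(p["title"].split())) for pid, p in PAPERS.items()}
    ranked = sorted(overlap, key=lambda pid: -overlap[pid])
    return [pid for pid in ranked if overlap[pid] > 0][:k]


def vector_recall(query_vec, k):
    """Path 2 (assumed): rank papers by cosine similarity to the query vector."""
    return sorted(PAPERS, key=lambda pid: -cosine(query_vec, PAPERS[pid]["vec"]))[:k]


def graph_recall(seeds, adj):
    """Path 3 (assumed): one-hop neighbors of the papers found by paths 1 and 2."""
    found = set()
    for seed in seeds:
        found |= adj[seed]
    return found - set(seeds)


def retrieve(query, query_vec, k=2, alpha=0.1):
    """Union the three recall paths, then rerank with a graph-connectivity bonus."""
    adj = build_adjacency(EDGES)
    seeds = set(keyword_recall(query, k)) | set(vector_recall(query_vec, k))
    candidates = seeds | graph_recall(seeds, adj)

    def score(pid):
        # Similarity plus alpha times the number of neighbors that are also candidates.
        links = len(adj[pid] & candidates)
        return cosine(query_vec, PAPERS[pid]["vec"]) + alpha * links

    return sorted(candidates, key=lambda pid: -score(pid))


if __name__ == "__main__":
    print(retrieve("molecular prediction", [1.0, 0.0], k=2))
```

In this toy run, `p5` matches no query keyword and falls outside the top-2 vector hits, so it is reachable only through its edge to `p3`, while the unconnected `p4` is never recalled. The well-connected `p3` ranks first even though `p1` has the higher raw similarity, which is the kind of effect a graph-aware reranker is meant to produce.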