OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment
January 4, 2026
Authors: Ming Zhang, Kexin Tan, Yueyuan Huang, Yujiong Shen, Chunchun Ma, Li Ju, Xinran Zhang, Yuhui Wang, Wenqing Jing, Jingyi Deng, Huayu Sha, Binze Hu, Jingqi Tong, Changhao Jiang, Yage Geng, Yuankai Ying, Yue Zhang, Zhangyue Yin, Zhiheng Xi, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang
cs.AI
Abstract
Evaluating novelty is critical yet challenging in peer review, as reviewers must assess submissions against a vast, rapidly evolving literature. This report presents OpenNovelty, an LLM-powered agentic system for transparent, evidence-based novelty analysis. The system operates through four phases: (1) extracting the core task and contribution claims to generate retrieval queries; (2) retrieving relevant prior work for those queries via a semantic search engine; (3) constructing a hierarchical taxonomy of core-task-related work and performing full-text comparisons for each claimed contribution; and (4) synthesizing all analyses into a structured novelty report with explicit citations and evidence snippets. Unlike naive LLM-based approaches, OpenNovelty grounds every assessment in real, retrieved papers, ensuring verifiable judgments. We deploy the system on 500+ ICLR 2026 submissions, with all reports publicly available on our project website; preliminary analysis suggests it can identify relevant prior work, including closely related papers that authors may have overlooked. OpenNovelty aims to give the research community a scalable tool that promotes fair, consistent, and evidence-backed peer review.
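To make the four-phase workflow concrete, here is a minimal Python sketch of how such a pipeline could be wired together. Everything in it is hypothetical: the function and class names (llm_extract_claims, semantic_search, llm_compare, NoveltyReport) are illustrative placeholders with stubbed-out bodies, not the actual OpenNovelty implementation.

```python
from dataclasses import dataclass, field

# Placeholder stand-ins (hypothetical): the real system delegates these
# steps to an LLM and a semantic search engine.
def llm_extract_claims(text: str) -> tuple[str, list[str]]:
    """Extract the core task and the submission's contribution claims."""
    return "example core task", ["contribution A", "contribution B"]

def semantic_search(query: str) -> list[dict]:
    """Retrieve prior papers relevant to a query."""
    return [{"title": f"prior work for: {query}", "full_text": "..."}]

def llm_compare(claim: str, paper: dict) -> dict:
    """Compare one contribution claim against a paper's full text."""
    return {"claim": claim, "paper": paper["title"], "verdict": "no overlap found"}

@dataclass
class NoveltyReport:
    core_task: str
    taxonomy: dict[str, list[dict]]   # grouping of core-task-related work
    comparisons: list[dict]           # per-contribution comparison results
    evidence: list[dict] = field(default_factory=list)

def assess_novelty(submission_text: str) -> NoveltyReport:
    # Phase 1: extract the core task and contribution claims,
    # then turn them into retrieval queries.
    core_task, claims = llm_extract_claims(submission_text)
    queries = [core_task] + claims

    # Phase 2: retrieve relevant prior work for every query.
    prior_work = [p for q in queries for p in semantic_search(q)]

    # Phase 3: organize core-task-related work into a taxonomy (flat here,
    # hierarchical in the paper) and compare each claim against each paper.
    taxonomy = {core_task: prior_work}
    comparisons = [llm_compare(c, p) for c in claims for p in prior_work]

    # Phase 4: synthesize everything into a structured, citation-grounded report.
    return NoveltyReport(core_task, taxonomy, comparisons, evidence=comparisons)

if __name__ == "__main__":
    print(assess_novelty("...full submission text..."))
```

In a real deployment, each stub would call an LLM or search API, and Phase 3 would build a genuine hierarchy rather than a single flat bucket; the sketch only shows how the four phases hand data to one another.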