APRES: An Agentic Paper Revision and Evaluation System
March 3, 2026
Authors: Bingchen Zhao, Jenny Zhang, Chenxi Whitehouse, Minqi Jiang, Michael Shvartsman, Abhishek Charnalia, Despoina Magka, Tatiana Shavrina, Derek Dunfield, Oisin Mac Aodha, Yoram Bachrach
cs.AI
Abstract
Scientific discoveries must be communicated clearly to realize their full potential. Without effective communication, even the most groundbreaking findings risk being overlooked or misunderstood. The primary way scientists communicate their work and receive feedback from the community is through peer review. However, the current system often provides inconsistent feedback across reviewers, ultimately hindering the improvement of a manuscript and limiting its potential impact. In this paper, we introduce APRES, a novel method powered by Large Language Models (LLMs) that revises a scientific paper's text based on an evaluation rubric. Our automated method discovers a rubric that is highly predictive of future citation counts and integrates it with APRES into an automated system that revises papers to enhance their quality and impact. Crucially, this objective must be met without altering the core scientific content. We demonstrate the success of APRES, which improves future citation prediction by 19.6% in mean absolute error over the next best baseline, and show that our paper revision process yields papers that human expert evaluators prefer over the originals 79% of the time. Our findings provide strong empirical support for using LLMs as a tool to help authors stress-test their manuscripts before submission. Ultimately, our work seeks to augment, not replace, the essential role of human expert reviewers, for it should be humans who discern which discoveries truly matter, guiding science toward advancing knowledge and enriching lives.