基於持久工作流提示、元提示與元推理的AI驅動學術同行評審
AI-Driven Scholarly Peer Review via Persistent Workflow Prompting, Meta-Prompting, and Meta-Reasoning
May 6, 2025
作者: Evgeny Markhasin
cs.AI
摘要
對科學手稿進行同行評審,對於大型語言模型(LLMs)而言,是一項重大挑戰,這部分源於數據限制及專家推理的複雜性。本報告介紹了一種可能廣泛適用的提示工程方法——持續工作流程提示(Persistent Workflow Prompting, PWP),旨在利用標準的LLM聊天界面(無需編碼,無需API)來彌合這一差距。我們展示了一個針對實驗化學手稿批判性分析的概念驗證PWP提示,其採用分層、模塊化的架構(通過Markdown結構化),定義了詳細的分析工作流程。我們通過迭代應用元提示技術和元推理,開發了這一PWP提示,旨在系統地編碼專家評審工作流程,包括隱性知識。在會話開始時一次性提交的這一PWP提示,為LLM配備了由後續查詢觸發的持續工作流程,引導現代推理LLMs進行系統性、多模態的評估。演示表明,PWP引導下的LLM在測試案例中識別出主要方法論缺陷,同時減輕了LLM輸入偏見,並執行了包括區分主張與證據、整合文本/照片/圖表分析以推斷參數、執行定量可行性檢查、將估計值與主張進行比較以及評估先驗合理性在內的複雜任務。為確保透明度並促進複製,我們提供了完整的提示、詳細的演示分析以及互動聊天記錄作為補充資源。除了特定應用外,這項工作還深入探討了元開發過程本身,強調了PWP在詳細工作流程形式化的指導下,利用現成LLMs進行複雜科學任務的潛力,以實現精細分析。
English
Critical peer review of scientific manuscripts presents a significant
challenge for Large Language Models (LLMs), partly due to data limitations and
the complexity of expert reasoning. This report introduces Persistent Workflow
Prompting (PWP), a potentially broadly applicable prompt engineering
methodology designed to bridge this gap using standard LLM chat interfaces
(zero-code, no APIs). We present a proof-of-concept PWP prompt for the critical
analysis of experimental chemistry manuscripts, featuring a hierarchical,
modular architecture (structured via Markdown) that defines detailed analysis
workflows. We develop this PWP prompt through iterative application of
meta-prompting techniques and meta-reasoning aimed at systematically codifying
expert review workflows, including tacit knowledge. Submitted once at the start
of a session, this PWP prompt equips the LLM with persistent workflows
triggered by subsequent queries, guiding modern reasoning LLMs through
systematic, multimodal evaluations. Demonstrations show the PWP-guided LLM
identifying major methodological flaws in a test case while mitigating LLM
input bias and performing complex tasks, including distinguishing claims from
evidence, integrating text/photo/figure analysis to infer parameters, executing
quantitative feasibility checks, comparing estimates against claims, and
assessing a priori plausibility. To ensure transparency and facilitate
replication, we provide full prompts, detailed demonstration analyses, and logs
of interactive chats as supplementary resources. Beyond the specific
application, this work offers insights into the meta-development process
itself, highlighting the potential of PWP, informed by detailed workflow
formalization, to enable sophisticated analysis using readily available LLMs
for complex scientific tasks.Summary
AI-Generated Summary