PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination
May 5, 2026
Authors: Qiyao Wang, Xinyi Chen, Longze Chen, Hongbo Wang, Hamid Alinejad-Rokny, Yuan Lin, Min Yang
cs.AI
Abstract
Patent examination is a complex, multi-stage process requiring both technical expertise and legal reasoning, increasingly challenged by rising application volumes. Prior benchmarks predominantly view patent examination as discriminative classification or static extraction, failing to capture its inherently interactive and iterative nature, similar to the peer review and rebuttal process in academic publishing. In this paper, we introduce PatRe, the first benchmark that models the full patent examination lifecycle, including Office Action generation and applicant rebuttal. PatRe comprises 480 real-world cases and supports both oracle and retrieval-simulated evaluation settings. Our benchmark reframes patent examination as a dynamic, multi-turn process of justification and response. Extensive experiments across various LLMs reveal critical insights into model performance, including differences between proprietary and open-source models, as well as task asymmetries between examiner analysis and applicant-side rebuttal. These findings highlight both the potential and current limitations of LLMs in modeling complex, real-world legal reasoning and technical novelty judgment in patent examination. We release our code and dataset to facilitate future research on patent examination modeling.