PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination
May 5, 2026
Authors: Qiyao Wang, Xinyi Chen, Longze Chen, Hongbo Wang, Hamid Alinejad-Rokny, Yuan Lin, Min Yang
cs.AI
Abstract
Patent examination is a complex, multi-stage process that requires both technical expertise and legal reasoning, and it is increasingly strained by rising application volumes. Prior benchmarks predominantly treat patent examination as discriminative classification or static information extraction, failing to capture its inherently interactive, iterative nature, which resembles the peer-review and rebuttal process in academic publishing. In this paper, we introduce PatRe, the first benchmark to model the full patent examination lifecycle, including Office Action generation and applicant rebuttal. PatRe comprises 480 real-world cases and supports both oracle and retrieval-simulated evaluation settings. Our benchmark reframes patent examination as a dynamic, multi-turn process of justification and response. Extensive experiments across a range of LLMs reveal critical insights into model performance, including gaps between proprietary and open-source models and task asymmetries between examiner-side analysis and applicant-side rebuttal. These findings highlight both the potential and the current limitations of LLMs in modeling the complex, real-world legal reasoning and technical novelty judgment involved in patent examination. We release our code and dataset to facilitate future research on patent examination modeling.
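The distinction between the two evaluation settings can be illustrated with a minimal sketch. Everything here is hypothetical: the `Case` record, the `build_context` helper, and the toy retriever are illustrative stand-ins, not the benchmark's actual API or data schema. In the oracle setting, the model examiner sees the ground-truth prior art cited in the real Office Action; in the retrieval-simulated setting, prior art is fetched by a search component, mimicking how an examiner would actually find references.

```python
from dataclasses import dataclass, field

@dataclass
class Case:
    """Hypothetical record for one examination case (field names are illustrative)."""
    application: str  # claims and description of the filing
    cited_prior_art: list = field(default_factory=list)  # references cited in the real Office Action

def build_context(case, setting, retriever=None, k=3):
    """Assemble the prior-art context shown to an LLM examiner.

    'oracle'    -> use the ground-truth references from the real Office Action
    'retrieval' -> simulate prior-art search with a retriever over a patent corpus
    """
    if setting == "oracle":
        return case.cited_prior_art
    if setting == "retrieval":
        return retriever(case.application, k)
    raise ValueError(f"unknown setting: {setting}")

def toy_retriever(query, k):
    """Toy stand-in for a real patent-search backend: returns the top-k of a fixed corpus."""
    corpus = ["ref-A", "ref-B", "ref-C", "ref-D"]
    return corpus[:k]

case = Case(application="A method for ...", cited_prior_art=["US-123", "EP-456"])
print(build_context(case, "oracle"))                    # ground-truth citations
print(build_context(case, "retrieval", toy_retriever))  # retrieved candidates
```

The same downstream task (generating an Office Action, then a rebuttal) runs on top of either context, so the gap between the two settings isolates how much of a model's performance depends on being handed the right prior art versus finding it.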