PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination

Abstract

Patent examination is a complex, multi-stage process requiring both technical expertise and legal reasoning, increasingly challenged by rising application volumes. Prior benchmarks predominantly view patent examination as discriminative classification or static extraction, failing to capture its inherently interactive and iterative nature, similar to the peer review and rebuttal process in academic publishing. In this paper, we introduce PatRe, the first benchmark that models the full patent examination lifecycle, including Office Action generation and applicant rebuttal. PatRe comprises 480 real-world cases and supports both oracle and retrieval-simulated evaluation settings. Our benchmark reframes patent examination as a dynamic, multi-turn process of justification and response. Extensive experiments across various LLMs reveal critical insights into model performance, including differences between proprietary and open-source models, as well as task asymmetries between examiner analysis and applicant-side rebuttal. These findings highlight both the potential and current limitations of LLMs in modeling complex, real-world legal reasoning and technical novelty judgment in patent examination. We release our code and dataset to facilitate future research on patent examination modeling.
