LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation

Abstract

Legal Case Retrieval (LCR), the task of retrieving cases relevant to a given query case, is fundamental to legal professionals' research and decision-making. However, existing LCR studies have two major limitations. First, they are evaluated on relatively small retrieval corpora (e.g., 100 to 55K cases) and cover only a narrow range of criminal query types, so they do not adequately reflect the complexity of real-world legal retrieval. Second, they rely on embedding-based or lexical matching methods, which often yield limited representations and legally irrelevant matches. To address these issues, we present: (1) LEGAR BENCH, the first large-scale Korean LCR benchmark, with queries covering 411 diverse crime types over 1.2M legal cases; and (2) LegalSearchLM, a retrieval model that performs legal element reasoning over the query case and directly generates content grounded in the target cases through constrained decoding. Experiments show that LegalSearchLM outperforms baselines by 6-20% on LEGAR BENCH, achieving state-of-the-art performance. It also generalizes well to out-of-domain cases, outperforming naive generative models trained on in-domain data by 15%.
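The abstract does not say how the constrained decoding is implemented. As a rough illustration of the general idea only, the sketch below uses a prefix trie so that every generated token sequence is guaranteed to exist in the corpus. This is an assumption about one common way to constrain generation, not the authors' method. All names (`build_trie`, `constrained_greedy_decode`, `toy_score`) and the toy data are hypothetical, and the scoring function stands in for a real language model.

```python
from typing import Callable, Dict, List, Sequence

_END = "<end>"  # hypothetical marker for a complete corpus sequence


def build_trie(sequences: Sequence[Sequence[str]]) -> Dict:
    """Build a nested-dict prefix trie over tokenized corpus sequences."""
    root: Dict = {}
    for seq in sequences:
        node = root
        for tok in seq:
            node = node.setdefault(tok, {})
        node[_END] = {}
    return root


def constrained_greedy_decode(
    trie: Dict,
    score: Callable[[List[str], str], float],
    max_len: int = 32,
) -> List[str]:
    """Greedy decoding restricted to tokens that the trie allows next."""
    node, out = trie, []
    for _ in range(max_len):
        allowed = list(node.keys())
        if not allowed:
            break
        # Only tokens that continue some corpus sequence are candidates.
        best = max(allowed, key=lambda tok: score(out, tok))
        if best == _END:
            break
        out.append(best)
        node = node[best]
    return out


# Toy corpus of tokenized "legal element" strings (illustrative data only).
corpus_elements = [
    ["theft", "of", "vehicle"],
    ["theft", "by", "deception"],
    ["assault", "with", "weapon"],
]
query_words = {"theft", "of", "vehicle"}


def toy_score(prefix: List[str], token: str) -> float:
    """Stand-in for a language model: favors tokens seen in the query."""
    return 1.0 if token in query_words else 0.0


result = constrained_greedy_decode(build_trie(corpus_elements), toy_score)
print(result)
```

Because candidates are drawn only from the trie, the decoder cannot emit a sequence that is absent from the corpus, whatever the scorer prefers. In practice the scorer would be a trained model's next-token distribution.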
