Certified Mitigation of Worst-Case LLM Copyright Infringement

Abstract

The exposure of large language models (LLMs) to copyrighted material during pre-training raises concerns about unintentional copyright infringement after deployment. This has driven the development of "copyright takedown" methods: post-training approaches aimed at preventing models from generating content substantially similar to copyrighted material. While current mitigation approaches are somewhat effective for average-case risks, we demonstrate that they overlook worst-case copyright risks, exhibited by the existence of long, verbatim quotes from copyrighted sources. We propose BloomScrub, a remarkably simple yet highly effective inference-time approach that provides certified copyright takedown. Our method repeatedly interleaves quote detection with rewriting techniques to transform potentially infringing segments. By leveraging efficient data sketches (Bloom filters), our approach enables scalable copyright screening even for large-scale real-world corpora. When quotes beyond a length threshold cannot be removed, the system can abstain from responding, offering certified risk reduction. Experimental results show that BloomScrub reduces infringement risk, preserves utility, and accommodates different levels of enforcement stringency with adaptive abstention. Our results suggest that lightweight, inference-time methods can be surprisingly effective for copyright prevention.
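The detect, rewrite, and abstain loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify tokenization, n-gram size, hash functions, or the rewriter, so the whitespace tokenization, the parameters `n`, `max_quote_words`, and `max_rounds`, and the caller-supplied `rewrite` function (presumably an LLM paraphraser in practice) are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Iterable, List, Optional


class BloomFilter:
    """Tiny Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str) -> Iterable[int]:
        # Derive several bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def ngrams(words: List[str], n: int) -> List[str]:
    """All contiguous word n-grams, joined by single spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_filter(corpus: Iterable[str], n: int) -> BloomFilter:
    """Index every word n-gram of the protected corpus (assumed tokenization)."""
    bf = BloomFilter()
    for doc in corpus:
        for gram in ngrams(doc.split(), n):
            bf.add(gram)
    return bf


def longest_quote(text: str, bf: BloomFilter, n: int) -> int:
    """Word length of the longest span covered by consecutive matching n-grams."""
    best = run = 0
    for gram in ngrams(text.split(), n):
        run = run + 1 if gram in bf else 0
        if run:
            best = max(best, run + n - 1)
    return best


def scrub(
    response: str,
    bf: BloomFilter,
    rewrite: Callable[[str], str],  # hypothetical rewriter supplied by the caller
    n: int = 4,
    max_quote_words: int = 6,
    max_rounds: int = 3,
) -> Optional[str]:
    """Alternate detection and rewriting; return None (abstain) on failure."""
    for round_idx in range(max_rounds + 1):
        if longest_quote(response, bf, n) <= max_quote_words:
            return response
        if round_idx < max_rounds:
            response = rewrite(response)
    return None
```

Because a Bloom filter never reports a stored n-gram as absent, any response this sketch returns contains no verbatim span from the indexed corpus longer than `max_quote_words` words under the assumed tokenization (provided `max_quote_words >= n - 1`). False positives only make the check more conservative, which can trigger extra rewrites or abstentions.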
