Certified Mitigation of Worst-Case LLM Copyright Infringement

April 22, 2025
Authors: Jingyu Zhang, Jiacan Yu, Marc Marone, Benjamin Van Durme, Daniel Khashabi
cs.AI

Abstract

The exposure of large language models (LLMs) to copyrighted material during pre-training raises concerns about unintentional copyright infringement post-deployment. This has driven the development of "copyright takedown" methods, post-training approaches aimed at preventing models from generating content substantially similar to copyrighted works. While current mitigation approaches are somewhat effective for average-case risks, we demonstrate that they overlook worst-case copyright risks, exhibited by the existence of long, verbatim quotes from copyrighted sources. We propose BloomScrub, a remarkably simple yet highly effective inference-time approach that provides certified copyright takedown. Our method repeatedly interleaves quote detection with rewriting techniques to transform potentially infringing segments. By leveraging efficient data sketches (Bloom filters), our approach enables scalable copyright screening even for large-scale real-world corpora. When quotes beyond a length threshold cannot be removed, the system can abstain from responding, offering certified risk reduction. Experimental results show that BloomScrub reduces infringement risk, preserves utility, and accommodates different levels of enforcement stringency with adaptive abstention. Our results suggest that lightweight, inference-time methods can be surprisingly effective for copyright prevention.
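
To make the mechanism described in the abstract concrete, below is a minimal, illustrative sketch of Bloom-filter-based quote screening combined with a detect-rewrite-abstain loop. It is an assumption-laden sketch, not the authors' implementation: the n-gram size, the 25-token quote threshold, the `generate` callable, and all helper names are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: hash corpus n-grams into a Bloom filter, then flag
# candidate outputs whose longest run of matching n-grams exceeds a threshold.
# Names, parameters, and the rewrite step are hypothetical stand-ins.
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over strings (standard library only)."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive k bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_filter(corpus_docs, n=8):
    """Index every n-gram of the protected corpus (one-time, offline)."""
    bf = BloomFilter()
    for doc in corpus_docs:
        for gram in ngrams(doc.split(), n):
            bf.add(gram)
    return bf


def longest_quoted_span(text, bf, n=8):
    """Token length of the longest run of consecutive n-grams found in the filter."""
    best = run = 0
    for gram in ngrams(text.split(), n):
        if gram in bf:
            run = run + 1 if run else n  # extend an ongoing quote or start one
            best = max(best, run)
        else:
            run = 0
    return best


def scrub(generate, prompt, bf, max_quote_tokens=25, max_rounds=3):
    """Detect-rewrite loop with abstention, in the spirit of the abstract."""
    response = generate(prompt)
    for _ in range(max_rounds):
        if longest_quoted_span(response, bf) <= max_quote_tokens:
            return response  # no remaining quote longer than the threshold
        # Hypothetical rewrite step: ask the model to paraphrase the flagged output.
        response = generate(f"Paraphrase without verbatim quotes:\n{response}")
    return None  # abstain when long quotes cannot be removed
```

The Bloom filter is used here because it offers compact storage and constant-time membership checks with a tunable false-positive rate, which matches the abstract's framing of "efficient data sketches" for screening against large real-world corpora; false positives can only make the screen more conservative, never let a quote through.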
