Locket: Robust Feature-Locking Technique for Language Models

October 14, 2025
Authors: Lipeng He, Vasisht Duddu, N. Asokan
cs.AI

Abstract

Chatbot providers (e.g., OpenAI) rely on tiered subscription schemes to generate revenue, offering basic models to free users and advanced models to paying subscribers. However, a finer-grained pay-to-unlock scheme for premium features (e.g., math, coding) is thought to be more economically viable for the providers. Such a scheme requires a feature-locking technique (FLoTE) which is (i) effective in refusing locked features, (ii) utility-preserving for unlocked features, (iii) robust against evasion or unauthorized credential sharing, and (iv) scalable to multiple features and users. However, existing FLoTEs (e.g., password-locked models) are neither robust nor scalable. We present Locket, the first robust and scalable FLoTE to enable pay-to-unlock schemes. Locket uses a novel merging approach to attach adapters to an LLM for refusing unauthorized features. Our comprehensive evaluation shows that Locket is effective (100% refusal on locked features), utility-preserving (≤7% utility degradation in unlocked features), robust (≤5% attack success rate), and scales to multiple features and clients.