Recourse for reclamation: Chatting with generative language models
March 21, 2024
Authors: Jennifer Chien, Kevin R. McKee, Jackie Kay, William Isaac
cs.AI
Abstract
Researchers and developers increasingly rely on toxicity scoring to moderate
generative language model outputs, in settings such as customer service,
information retrieval, and content generation. However, toxicity scoring may
render pertinent information inaccessible, rigidify or "value-lock" cultural
norms, and prevent language reclamation processes, particularly for
marginalized people. In this work, we extend the concept of algorithmic
recourse to generative language models: we provide users a novel mechanism to
achieve their desired prediction by dynamically setting thresholds for toxicity
filtering. Users thereby exercise increased agency relative to interactions
with the baseline system. A pilot study (n = 30) supports the potential of
our proposed recourse mechanism, indicating improvements in usability compared
to fixed-threshold toxicity-filtering of model outputs. Future work should
explore the intersection of toxicity scoring, model controllability, user
agency, and language reclamation processes -- particularly with regard to the
bias that many communities encounter when interacting with generative language
models.