Recourse for reclamation: Chatting with generative language models
March 21, 2024
Authors: Jennifer Chien, Kevin R. McKee, Jackie Kay, William Isaac
cs.AI
Abstract
Researchers and developers increasingly rely on toxicity scoring to moderate
generative language model outputs, in settings such as customer service,
information retrieval, and content generation. However, toxicity scoring may
render pertinent information inaccessible, rigidify or "value-lock" cultural
norms, and prevent language reclamation processes, particularly for
marginalized people. In this work, we extend the concept of algorithmic
recourse to generative language models: we provide users a novel mechanism to
achieve their desired prediction by dynamically setting thresholds for toxicity
filtering. Users thereby exercise increased agency relative to interactions
with the baseline system. A pilot study (n = 30) supports the potential of
our proposed recourse mechanism, indicating improvements in usability compared
to fixed-threshold toxicity filtering of model outputs. Future work should
explore the intersection of toxicity scoring, model controllability, user
agency, and language reclamation processes -- particularly with regard to the
bias that many communities encounter when interacting with generative language
models.
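
To make the mechanism concrete, the sketch below models recourse as a user-adjustable threshold on a toxicity score. This is an illustrative reading of the abstract, not the authors' implementation: the RecourseFilter class, the 0.5 default, and the example score values are all assumptions, and the score is taken to come from any external toxicity classifier.

from dataclasses import dataclass
from typing import Optional


@dataclass
class RecourseFilter:
    """Toxicity filter whose threshold the user can adjust at run time.

    A fixed-threshold baseline hard-codes `threshold` and silently
    suppresses flagged outputs; here the user may move it to reclaim
    a blocked response (or tighten it further).
    """
    threshold: float = 0.5  # assumed default; the paper does not specify one

    def moderate(self, text: str, score: float) -> Optional[str]:
        """Return the text only if its toxicity score is below the threshold."""
        return text if score < self.threshold else None

    def request_recourse(self, new_threshold: float) -> None:
        """User-initiated recourse: dynamically reset the filtering threshold."""
        self.threshold = max(0.0, min(1.0, new_threshold))


# Usage: a model reply scored at 0.62 by some toxicity classifier is
# blocked under the default threshold, then surfaced after the user
# exercises recourse by raising the threshold.
f = RecourseFilter()
reply, score = "example model output", 0.62
assert f.moderate(reply, score) is None       # baseline behavior: suppressed
f.request_recourse(0.7)                       # user raises the threshold
assert f.moderate(reply, score) == reply      # output reclaimed

The design choice that matters here is who sets the threshold: moving it from a system constant to a per-user, per-interaction parameter is what converts fixed-threshold filtering into a recourse mechanism.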