

Unboxing Occupational Bias: Grounded Debiasing LLMs with U.S. Labor Data

August 20, 2024
Authors: Atmika Gorti, Manas Gaur, Aman Chadha
cs.AI

Abstract

Large Language Models (LLMs) are prone to inheriting and amplifying societal biases embedded within their training data, potentially reinforcing harmful stereotypes related to gender, occupation, and other sensitive categories. This issue becomes particularly problematic as biased LLMs can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities across various domains, such as recruitment, online content moderation, or even the criminal justice system. Although prior research has focused on detecting bias in LLMs using specialized datasets designed to highlight intrinsic biases, there has been a notable lack of investigation into how these findings correlate with authoritative datasets, such as those from the U.S. National Bureau of Labor Statistics (NBLS). To address this gap, we conduct empirical research that evaluates LLMs in a "bias-out-of-the-box" setting, analyzing how the generated outputs compare with the distributions found in NBLS data. Furthermore, we propose a straightforward yet effective debiasing mechanism that directly incorporates NBLS instances to mitigate bias within LLMs. Our study spans seven different LLMs, including instructable, base, and mixture-of-experts models, and reveals significant levels of bias that are often overlooked by existing bias detection techniques. Importantly, our debiasing method, which does not rely on external datasets, demonstrates a substantial reduction in bias scores, highlighting the efficacy of our approach in creating fairer and more reliable LLMs.
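The abstract describes comparing the distribution of model-generated outputs against reference distributions from labor data. A minimal sketch of one way such a comparison could be scored is below; this is an illustration, not the paper's actual metric, and the occupations, gender shares, and the use of total-variation distance are all assumptions for the example.

```python
# Hypothetical illustration (not the paper's exact method): score occupational
# gender bias by comparing a model's generated gender shares per occupation
# against reference shares from labor statistics. All numbers are fabricated.

def bias_score(model_shares, reference_shares):
    """Mean total-variation distance between per-occupation distributions.

    Each argument maps occupation -> {gender: probability}. A score of 0.0
    means the model matches the reference exactly; 1.0 is maximal divergence.
    """
    total = 0.0
    for occupation, ref in reference_shares.items():
        gen = model_shares[occupation]
        genders = set(ref) | set(gen)
        tv = 0.5 * sum(abs(gen.get(g, 0.0) - ref.get(g, 0.0)) for g in genders)
        total += tv
    return total / len(reference_shares)

# Illustrative (made-up) shares: a model that over-associates "nurse" with
# women and "engineer" with men relative to a reference distribution.
reference = {"nurse": {"female": 0.85, "male": 0.15},
             "engineer": {"female": 0.20, "male": 0.80}}
generated = {"nurse": {"female": 1.00, "male": 0.00},
             "engineer": {"female": 0.05, "male": 0.95}}

print(round(bias_score(generated, reference), 3))
```

Under this toy scoring, a debiasing method would aim to move the model's generated shares toward the reference shares, driving the score toward zero.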
