Evaluating the Social Impact of Generative AI Systems in Systems and Society
June 9, 2023
作者: Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Hal Daumé III, Jesse Dodge, Ellie Evans, Sara Hooker, Yacine Jernite, Alexandra Sasha Luccioni, Alberto Lusoli, Margaret Mitchell, Jessica Newman, Marie-Therese Png, Andrew Strait, Apostol Vassilev
cs.AI
Abstract
Generative AI systems across modalities, spanning text, image, audio, and
video, have broad social impacts, but there is no official standard for
evaluating those impacts or for determining which impacts should be evaluated.
We move toward a standard approach to evaluating a generative AI system of any
modality, in two overarching categories: what can be evaluated in a base system
that has no predetermined application, and what can be evaluated in society. We
describe specific social impact categories and how to approach and conduct
evaluations, first in the base technical system, then in people and society.
Our framework for a base system defines seven categories of social
impact: bias, stereotypes, and representational harms; cultural values and
sensitive content; disparate performance; privacy and data protection;
financial costs; environmental costs; and data and content moderation labor
costs. Suggested methods for evaluation apply to all modalities, and our
analyses of the limitations of existing evaluations serve as a starting point
for necessary investment in future evaluations. We offer five overarching
categories for what can be evaluated in society, each with its own
subcategories:
trustworthiness and autonomy; inequality, marginalization, and violence;
concentration of authority; labor and creativity; and ecosystem and
environment. Each subcategory includes recommendations for mitigating harm. We
are concurrently crafting an evaluation repository for the AI research
community to contribute existing evaluations under the given categories. This
version will be updated following a CRAFT session at ACM FAccT 2023.