A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

Abstract

We present a framework for the automated measurement of responsible AI (RAI) metrics for large language models (LLMs) and associated products and services. Our framework for automatically measuring harms from LLMs builds on existing technical and sociotechnical expertise and leverages the capabilities of state-of-the-art LLMs, such as GPT-4. We use this framework to conduct several case studies investigating how different LLMs may violate a range of RAI-related principles. The framework may be employed alongside domain-specific sociotechnical expertise to create measurements for new harm areas in the future. By implementing this framework, we aim to enable more advanced harm measurement efforts and further the responsible use of LLMs.
