Can Fairness Be Prompted? Prompt-Based Debiasing Strategies in High-Stakes Recommendations

Abstract

Large Language Models (LLMs) can infer sensitive attributes such as gender or age from indirect cues like names and pronouns, which can bias their recommendations. Several debiasing methods exist, but they require access to the model weights, are computationally costly, and are out of reach for lay users. To address this gap, we investigate implicit biases in LLM Recommenders (LLMRecs) and explore whether prompt-based strategies can serve as a lightweight, easy-to-use debiasing approach. We contribute three bias-aware prompting strategies for LLMRecs. To our knowledge, this is the first study of prompt-based debiasing in LLMRecs that focuses on group fairness for users. Our experiments with 3 LLMs, 4 prompt templates, 9 sensitive attribute values, and 2 datasets show that our proposed debiasing approach, which instructs an LLM to be fair, can improve fairness by up to 74% while retaining comparable effectiveness, although in some cases it might overpromote specific demographic groups.
