Understanding and Predicting Derailment in Toxic Conversations on GitHub

Abstract

Software projects thrive on the involvement and contributions of individuals from different backgrounds. However, toxic language and negative interactions can hinder the participation and retention of contributors and alienate newcomers. Proactive moderation strategies aim to prevent toxicity by addressing conversations that have derailed from their intended purpose. This study aims to understand and predict conversational derailment leading to toxicity on GitHub. To facilitate this research, we curate a novel dataset comprising 202 toxic conversations from GitHub with annotated derailment points, along with 696 non-toxic conversations as a baseline. Based on this dataset, we identify unique characteristics of toxic conversations and derailment points, including linguistic markers such as second-person pronouns, negation terms, and tones of Bitter Frustration and Impatience, as well as patterns in conversational dynamics between project contributors and external participants. Leveraging these empirical observations, we propose a proactive moderation approach to automatically detect and address potentially harmful conversations before they escalate. Using modern large language models (LLMs), we develop a conversation trajectory summary technique that captures the evolution of discussions and identifies early signs of derailment. Our experiments demonstrate that LLM prompts tailored to provide summaries of GitHub conversations achieve an F1-score of 69% in predicting conversational derailment, a substantial improvement over a set of baseline approaches.