Understanding and Predicting Derailment in Toxic Conversations on GitHub
March 4, 2025
Authors: Mia Mohammad Imran, Robert Zita, Rebekah Copeland, Preetha Chatterjee, Rahat Rizvi Rahman, Kostadin Damevski
cs.AI
Abstract
Software projects thrive on the involvement and contributions of individuals
from different backgrounds. However, toxic language and negative interactions
can hinder the participation and retention of contributors and alienate
newcomers. Proactive moderation strategies aim to prevent toxicity from
occurring by addressing conversations that have derailed from their intended
purpose. This study aims to understand and predict conversational derailment
leading to toxicity on GitHub.
To facilitate this research, we curate a novel dataset comprising 202 toxic
conversations from GitHub with annotated derailment points, along with 696
non-toxic conversations as a baseline. Based on this dataset, we identify
unique characteristics of toxic conversations and derailment points, including
linguistic markers such as second-person pronouns, negation terms, and tones of
Bitter Frustration and Impatience, as well as patterns in conversational
dynamics between project contributors and external participants.
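
As a rough illustration of how such linguistic markers could be counted in practice, the following Python sketch tallies second-person pronouns and negation terms in a single comment. The word lists and tokenizer are simplifying assumptions for illustration, not the paper's actual feature extraction.

import re

# Assumed marker word lists; the paper's actual lexicons may differ.
SECOND_PERSON = {"you", "your", "yours", "yourself"}
NEGATIONS = {"no", "not", "never", "nothing", "nobody", "don't",
             "doesn't", "didn't", "can't", "won't", "isn't", "aren't"}

def marker_counts(comment: str) -> dict:
    # Lowercase and split into word tokens, keeping apostrophes so
    # contractions like "don't" survive as single tokens.
    tokens = re.findall(r"[a-z']+", comment.lower())
    return {
        "second_person": sum(t in SECOND_PERSON for t in tokens),
        "negation": sum(t in NEGATIONS for t in tokens),
    }

# Example: a comment exhibiting both marker types.
print(marker_counts("You never test your changes, don't you?"))
# -> {'second_person': 3, 'negation': 2}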
Leveraging these empirical observations, we propose a proactive moderation
approach to automatically detect and address potentially harmful conversations
before escalation. By utilizing modern LLMs, we develop a conversation
trajectory summary technique that captures the evolution of discussions and
identifies early signs of derailment. Our experiments demonstrate that LLM
prompts tailored to provide summaries of GitHub conversations achieve a 69%
F1-Score in predicting conversational derailment, strongly improving over a set
of baseline approaches.
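
A minimal sketch of the two-step idea described above, written against the OpenAI Python SDK: first ask an LLM to summarize the conversation's trajectory, then ask it to judge derailment risk from that summary. The model name and prompt wording are assumptions for illustration and do not reproduce the paper's actual prompts.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize_trajectory(comments: list[str]) -> str:
    # Step 1: condense the discussion into a trajectory summary that
    # captures how tone and focus evolve across comments.
    thread = "\n\n".join(f"Comment {i + 1}: {c}" for i, c in enumerate(comments))
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model, not the one evaluated in the paper
        messages=[{
            "role": "user",
            "content": "Summarize how the tone and focus of this GitHub "
                       "issue conversation evolve, noting any signs of "
                       "frustration or impatience:\n\n" + thread,
        }],
    )
    return resp.choices[0].message.content

def predict_derailment(summary: str) -> bool:
    # Step 2: classify derailment risk from the summary alone.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Given this trajectory summary, answer only 'yes' "
                       "or 'no': is the conversation likely to derail "
                       "into toxicity?\n\n" + summary,
        }],
    )
    return resp.choices[0].message.content.strip().lower().startswith("yes")

Summarizing before classifying mirrors the trajectory-summary idea in the abstract: the classifier sees a compact account of how the discussion evolved rather than the full raw thread.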