Understanding and Predicting Derailment in Toxic Conversations on GitHub

Abstract

Software projects thrive on the involvement and contributions of individuals from different backgrounds. However, toxic language and negative interactions can hinder the participation and retention of contributors and alienate newcomers. Proactive moderation strategies aim to prevent toxicity by addressing conversations that have derailed from their intended purpose. This study aims to understand and predict conversational derailment leading to toxicity on GitHub. To facilitate this research, we curate a novel dataset comprising 202 toxic conversations from GitHub with annotated derailment points, along with 696 non-toxic conversations as a baseline. Based on this dataset, we identify distinguishing characteristics of toxic conversations and derailment points, including linguistic markers such as second-person pronouns, negation terms, and tones of Bitter Frustration and Impatience, as well as patterns in conversational dynamics between project contributors and external participants. Leveraging these empirical observations, we propose a proactive moderation approach to automatically detect and address potentially harmful conversations before they escalate. Using modern large language models (LLMs), we develop a conversation trajectory summary technique that captures the evolution of discussions and identifies early signs of derailment. Our experiments demonstrate that LLM prompts tailored to provide summaries of GitHub conversations achieve a 69% F1-score in predicting conversational derailment, substantially outperforming a set of baseline approaches.
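As a rough illustration of two of the linguistic markers named in the abstract, the sketch below counts second-person pronouns and negation terms in a single comment. The word lists, the tokenizer, and the function name `marker_counts` are assumptions made for this example; they are not the lexicons or tooling used in the study.

```python
import re

# Illustrative word lists (assumptions, not the study's actual lexicons).
SECOND_PERSON = {"you", "your", "yours", "yourself", "yourselves"}
NEGATIONS = {"no", "not", "never", "nothing", "none", "nobody", "cannot"}

# A word, optionally followed by an apostrophe suffix ("don't", "you're").
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def marker_counts(comment: str) -> dict:
    """Count second-person pronouns and negation terms in one comment."""
    tokens = TOKEN_RE.findall(comment.lower())
    # "you're" counts as "you": compare the part before the apostrophe.
    second_person = sum(1 for t in tokens if t.split("'")[0] in SECOND_PERSON)
    # Contractions ending in "n't" ("don't", "can't") count as negations.
    negation = sum(1 for t in tokens if t in NEGATIONS or t.endswith("n't"))
    return {
        "tokens": len(tokens),
        "second_person": second_person,
        "negation": negation,
    }
```

For the hypothetical comment `"You never read the docs, and you're not listening. Don't blame us."`, this returns 12 tokens, 2 second-person pronouns, and 3 negation terms. Counts like these could serve as simple features when comparing toxic and non-toxic conversations, though tone categories such as Bitter Frustration would need a separate classifier or manual annotation.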
