PreScam: A Benchmark for Predicting Scam Progression from Early Conversations

Abstract

Conversational scams, such as romance and investment scams, are emerging as a major form of online fraud. Unlike one-shot scam lures such as fake lottery or unpaid toll messages, they unfold through multi-turn conversations in which scammers gradually manipulate victims using evolving psychological techniques. However, existing research mainly focuses on static scam detection or synthetic scams, leaving open whether language models can understand how real-world scams progress over time. We introduce PreScam, a benchmark for modeling scam progression from early conversations. Built from user-submitted scam reports, PreScam filters and structures 177,989 raw reports into 11,573 conversational scam instances spanning 20 scam categories. Each instance is hierarchically structured according to the scam lifecycle defined by the proposed scam kill chain, and further annotated at the turn level with scammer psychological actions and victim responses. We benchmark models on two tasks: real-time termination prediction, which estimates whether a conversation is approaching the termination stage, and scammer action prediction, which forecasts the scammer's subsequent actions. Results show a clear gap between surface-level fluency and progression modeling: supervised encoders substantially outperform zero-shot LLMs on real-time termination prediction, while next-action prediction remains only moderately successful even for strong LLMs. Taken together, these results show that current models can capture some scam-related cues, yet still struggle to track how risk escalates and how manipulation unfolds across turns.