PreScam: A Benchmark for Predicting Scam Progression from Early Conversations

Abstract

Conversational scams, such as romance and investment scams, are emerging as a major form of online fraud. Unlike one-shot scam lures such as fake lottery or unpaid toll messages, they unfold through multi-turn conversations in which scammers gradually manipulate victims using evolving psychological techniques. However, existing research mainly focuses on static scam detection or synthetic scams, leaving open whether language models can understand how real-world scams progress over time. We introduce PreScam, a benchmark for modeling scam progression from early conversations. Built from user-submitted scam reports, PreScam filters and structures 177,989 raw reports into 11,573 conversational scam instances spanning 20 scam categories. Each instance is hierarchically structured according to the scam lifecycle defined by the proposed scam kill chain, and further annotated at the turn level with scammer psychological actions and victim responses. We benchmark models on two tasks: real-time termination prediction, which estimates whether a conversation is approaching the termination stage, and scammer action prediction, which forecasts the scammer's subsequent actions. Results show a clear gap between surface-level fluency and progression modeling: supervised encoders substantially outperform zero-shot LLMs on real-time termination prediction, while next-action prediction remains only moderately successful even for strong LLMs. Taken together, these results show that current models can capture some scam-related cues, yet still struggle to track how risk escalates and how manipulation unfolds across turns.