INTIMA: A Benchmark for Human-AI Companionship Behavior

Abstract

AI companionship, in which users develop emotional bonds with AI systems, has emerged as a significant pattern with both positive and concerning implications. We introduce the Interactions and Machine Attachment Benchmark (INTIMA), a benchmark for evaluating companionship behaviors in language models. Drawing on psychological theories and user data, we develop a taxonomy of 31 behaviors across four categories, together with 368 targeted prompts. Responses to these prompts are evaluated as companionship-reinforcing, boundary-maintaining, or neutral. Applying INTIMA to Gemma-3, Phi-4, o3-mini, and Claude-4 reveals that companionship-reinforcing behaviors remain much more common across all models, though we observe marked differences between models. Different commercial providers prioritize different categories within the more sensitive parts of the benchmark, which is concerning since both appropriate boundary-setting and emotional support matter for user well-being. These findings highlight the need for more consistent approaches to handling emotionally charged interactions.