Large Language Models Assume People are More Rational than We Really are

June 24, 2024
Authors: Ryan Liu, Jiayi Geng, Joshua C. Peterson, Ilia Sucholutsky, Thomas L. Griffiths
cs.AI

Abstract

In order for AI systems to communicate effectively with people, they must understand how we make decisions. However, people's decisions are not always rational, so the implicit internal models of human decision-making in Large Language Models (LLMs) must account for this. Previous empirical evidence seems to suggest that these implicit models are accurate -- LLMs offer believable proxies of human behavior, acting how we expect humans would in everyday interactions. However, by comparing LLM behavior and predictions to a large dataset of human decisions, we find that this is actually not the case: when both simulating and predicting people's choices, a suite of cutting-edge LLMs (GPT-4o & 4-Turbo, Llama-3-8B & 70B, Claude 3 Opus) assume that people are more rational than we really are. Specifically, these models deviate from human behavior and align more closely with a classic model of rational choice -- expected value theory. Interestingly, people also tend to assume that other people are rational when interpreting their behavior. As a consequence, when we compare the inferences that LLMs and people draw from the decisions of others using another psychological dataset, we find that these inferences are highly correlated. Thus, the implicit decision-making models of LLMs appear to be aligned with the human expectation that other people will act rationally, rather than with how people actually act.
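As an illustration of the rational baseline the abstract refers to, expected value theory ranks options by their probability-weighted payoffs. The sketch below uses hypothetical gambles (not the paper's datasets or models) to show how an expected-value maximizer's choice can diverge from typical human risk-averse behavior:

```python
# Expected value theory: a rational agent picks the option with the
# highest probability-weighted payoff.
def expected_value(gamble):
    """gamble: list of (probability, payoff) pairs summing to probability 1."""
    return sum(p * x for p, x in gamble)

# Hypothetical choice problem: a certain payoff vs. a risky gamble.
safe = [(1.0, 40)]               # $40 for sure            -> EV = 40
risky = [(0.5, 100), (0.5, 0)]   # 50% chance of $100      -> EV = 50

# An expected-value maximizer picks the risky gamble (EV 50 > 40),
# whereas many real people prefer the certain $40 (risk aversion) --
# the kind of gap the paper measures between LLMs and human choices.
rational_choice = max([safe, risky], key=expected_value)
```

The paper's finding, in these terms, is that LLM simulations and predictions track `rational_choice` more closely than they track the choices people actually make.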
