Measuring Human and AI Values Based on Generative Psychometrics with Large Language Models

Abstract

Human values and their measurement have long been a subject of interdisciplinary inquiry. Recent advances in AI have sparked renewed interest in this area, with large language models (LLMs) emerging as both tools and subjects of value measurement. This work introduces Generative Psychometrics for Values (GPV), an LLM-based, data-driven value measurement paradigm that is theoretically grounded in text-revealed selective perceptions. We begin by fine-tuning an LLM for accurate perception-level value measurement and verifying the capability of LLMs to parse texts into perceptions; together, these form the core of the GPV pipeline. Applying GPV to human-authored blogs, we demonstrate its stability, validity, and superiority over prior psychological tools. Then, extending GPV to LLM value measurement, we advance the current art with 1) a psychometric methodology that measures LLM values based on their scalable and free-form outputs, enabling context-specific measurement; 2) a comparative analysis of measurement paradigms, indicating the response biases of prior methods; and 3) an attempt to bridge LLM values and their safety, revealing the predictive power of different value systems and the impacts of various values on LLM safety. Through interdisciplinary efforts, we aim to leverage AI for next-generation psychometrics and, in turn, psychometrics for value-aligned AI.