Can GPT Models Be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on Mock CFA Exams

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance on a wide range of Natural Language Processing (NLP) tasks, often matching or even outperforming state-of-the-art task-specific models. This study aims to assess the financial reasoning capabilities of LLMs. We use mock exam questions from the Chartered Financial Analyst (CFA) Program to conduct a comprehensive evaluation of ChatGPT and GPT-4 in financial analysis, considering Zero-Shot (ZS), Chain-of-Thought (CoT), and Few-Shot (FS) scenarios. We present an in-depth analysis of the models' performance and limitations, and estimate whether they would have a chance of passing the CFA exams. Finally, we outline potential strategies and improvements to enhance the applicability of LLMs in finance. We hope this work paves the way for future studies to continue improving LLMs for financial reasoning through rigorous evaluation.
