Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
October 12, 2023
Authors: Ethan Callanan, Amarachi Mbakwe, Antony Papadimitriou, Yulong Pei, Mathieu Sibue, Xiaodan Zhu, Zhiqiang Ma, Xiaomo Liu, Sameena Shah
cs.AI
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance on a
wide range of Natural Language Processing (NLP) tasks, often matching or even
beating state-of-the-art task-specific models. This study aims to assess the
financial reasoning capabilities of LLMs. We leverage mock exam questions from
the Chartered Financial Analyst (CFA) Program to conduct a comprehensive
evaluation of ChatGPT and GPT-4 in financial analysis, considering Zero-Shot
(ZS), Chain-of-Thought (CoT), and Few-Shot (FS) scenarios. We present an
in-depth analysis of the models' performance and limitations, and estimate
whether they would have a chance at passing the CFA exams. Finally, we outline
insights into potential strategies and improvements to enhance the
applicability of LLMs in finance. In this light, we hope this work paves
the way for future studies to continue enhancing LLMs for financial reasoning
through rigorous evaluation.