Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? An Examination on Several Typical Tasks

Abstract

Recent large language models such as ChatGPT and GPT-4 have garnered significant attention for their ability to generate high-quality responses to human input. Although ChatGPT and GPT-4 have been tested extensively on generic text corpora, where they show impressive capabilities, no study has yet focused on financial corpora. In this study, we aim to bridge this gap by examining the potential of ChatGPT and GPT-4 as solvers for typical financial text analytics problems in the zero-shot or few-shot setting. Specifically, we assess their capabilities on four representative tasks over five distinct financial textual datasets. The preliminary study shows that ChatGPT and GPT-4 struggle on tasks such as financial named entity recognition (NER) and sentiment analysis, where domain-specific knowledge is required, while they excel in numerical reasoning tasks. We report both the strengths and limitations of the current versions of ChatGPT and GPT-4, comparing them to state-of-the-art fine-tuned models as well as pretrained domain-specific generative models. Our experiments also include qualitative studies, through which we hope to help readers understand the capabilities of the existing models and to facilitate further improvements.
