Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? An Examination on Several Typical Tasks

Abstract

Recent large language models such as ChatGPT and GPT-4 have garnered significant attention for their ability to generate high-quality responses to human input. Although ChatGPT and GPT-4 have been tested extensively on generic text corpora, showcasing their impressive capabilities, no study focusing on financial corpora has been conducted. In this study, we aim to bridge this gap by examining the potential of ChatGPT and GPT-4 as solvers for typical financial text analytics problems in the zero-shot or few-shot setting. Specifically, we assess their capabilities on four representative tasks over five distinct financial textual datasets. The preliminary study shows that ChatGPT and GPT-4 struggle on tasks such as financial named entity recognition (NER) and sentiment analysis, where domain-specific knowledge is required, while they excel in numerical reasoning tasks. We report both the strengths and limitations of the current versions of ChatGPT and GPT-4, comparing them to state-of-the-art fine-tuned models as well as pretrained domain-specific generative models. Our experiments provide qualitative studies, through which we hope to help understand the capabilities of existing models and facilitate further improvements.
