Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? An Examination on Several Typical Tasks

Abstract

Recent large language models such as ChatGPT and GPT-4 have attracted significant attention because they can generate high-quality responses to human input. Although ChatGPT and GPT-4 have been tested extensively on generic text corpora, where they show impressive capabilities, no study has yet focused on financial corpora. In this study, we aim to bridge this gap by examining the potential of ChatGPT and GPT-4 as solvers for typical financial text analytics problems in the zero-shot or few-shot setting. Specifically, we assess their capabilities on four representative tasks over five distinct financial textual datasets. The preliminary study shows that ChatGPT and GPT-4 struggle on tasks such as financial named entity recognition (NER) and sentiment analysis, where domain-specific knowledge is required, while they excel in numerical reasoning tasks. We report both the strengths and the limitations of the current versions of ChatGPT and GPT-4, comparing them with state-of-the-art finetuned models as well as pretrained domain-specific generative models. Our experiments provide qualitative studies, through which we hope to help clarify the capabilities of existing models and facilitate further improvements.
