Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Abstract

Large language models (LLMs) have advanced financial applications, yet they often lack sufficient financial knowledge and struggle with tasks involving multimodal inputs such as tables and time-series data. To address these limitations, we introduce Open-FinLLMs, a series of financial LLMs. We begin with FinLLaMA, pre-trained on a 52-billion-token financial corpus that incorporates text, tables, and time-series data to embed comprehensive financial knowledge. FinLLaMA is then instruction fine-tuned on 573K financial instructions, resulting in FinLLaMA-instruct, which improves task performance. Finally, we present FinLLaVA, a multimodal LLM trained on 1.43M image-text instructions to handle complex financial data types. Extensive evaluations show that FinLLaMA outperforms LLaMA3-8B, LLaMA3.1-8B, and BloombergGPT in both zero-shot and few-shot settings, across 19 and 4 datasets, respectively. FinLLaMA-instruct outperforms GPT-4 and other financial LLMs on 15 datasets. FinLLaVA excels at understanding tables and charts across 4 multimodal tasks. Additionally, FinLLaMA achieves impressive Sharpe ratios in trading simulations, highlighting its robust capabilities in financial applications. We will continually maintain and improve our models and benchmarks to support ongoing innovation in academia and industry.
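
The abstract reports Sharpe ratios from trading simulations but does not say how they are computed. As a point of reference only, the sketch below uses the standard textbook definition: mean excess return divided by the standard deviation of excess returns, annualized. The function name `sharpe_ratio`, the 252-trading-day convention, the zero risk-free rate, and the sample returns are all assumptions for illustration, not details taken from the paper.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns.

    risk_free_rate is an annual rate; it is spread evenly over the
    periods (an assumption, not the paper's stated method).
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    volatility = stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("zero volatility: Sharpe ratio is undefined")
    return mean(excess) / volatility * math.sqrt(periods_per_year)


# Hypothetical daily returns from a simulated strategy.
daily_returns = [0.01, -0.01, 0.02, 0.0]
print(sharpe_ratio(daily_returns))
```

A higher value means more return per unit of volatility. Values computed over different periods or with different annualization conventions are not directly comparable.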
