Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
August 20, 2024
Authors: Qianqian Xie, Dong Li, Mengxi Xiao, Zihao Jiang, Ruoyu Xiang, Xiao Zhang, Zhengyu Chen, Yueru He, Weiguang Han, Yuzhe Yang, Shunian Chen, Yifei Zhang, Lihang Shen, Daniel Kim, Zhiwei Liu, Zheheng Luo, Yangyang Yu, Yupeng Cao, Zhiyang Deng, Zhiyuan Yao, Haohang Li, Duanyu Feng, Yongfu Dai, VijayaSai Somasundaram, Peng Lu, Yilun Zhao, Yitao Long, Guojun Xiong, Kaleb Smith, Honghai Yu, Yanzhao Lai, Min Peng, Jianyun Nie, Jordan W. Suchow, Xiao-Yang Liu, Benyou Wang, Alejandro Lopez-Lira, Jimin Huang, Sophia Ananiadou
cs.AI
Abstract
Large language models (LLMs) have advanced financial applications, yet they
often lack sufficient financial knowledge and struggle with tasks involving
multi-modal inputs like tables and time series data. To address these
limitations, we introduce Open-FinLLMs, a series of Financial LLMs. We
begin with FinLLaMA, pre-trained on a 52 billion token financial corpus,
incorporating text, tables, and time-series data to embed comprehensive
financial knowledge. FinLLaMA is then instruction fine-tuned with 573K
financial instructions, resulting in FinLLaMA-instruct, which enhances task
performance. Finally, we present FinLLaVA, a multimodal LLM trained with 1.43M
image-text instructions to handle complex financial data types. Extensive
evaluations demonstrate FinLLaMA's superior performance over LLaMA3-8B,
LLaMA3.1-8B, and BloombergGPT in zero-shot (19 datasets) and few-shot (4 datasets)
settings. FinLLaMA-instruct outperforms GPT-4 and other
Financial LLMs on 15 datasets. FinLLaVA excels in understanding tables and
charts across 4 multimodal tasks. Additionally, FinLLaMA achieves impressive
Sharpe Ratios in trading simulations, highlighting its robust financial
application capabilities. We will continually maintain and improve our models
and benchmarks to support ongoing innovation in academia and industry.
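For reference, the Sharpe ratio cited in the trading-simulation results is the standard measure of risk-adjusted return. A minimal formulation (the abstract does not state the authors' risk-free-rate or annualization convention, so the symbols below are generic) is

$$ \text{Sharpe Ratio} = \frac{\mathbb{E}[R_p - R_f]}{\sigma_p}, $$

where $R_p$ is the strategy return, $R_f$ is the risk-free rate, and $\sigma_p$ is the standard deviation of the excess return $R_p - R_f$.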