FinAudio: A Benchmark for Audio Large Language Models in Financial Applications
March 26, 2025
Authors: Yupeng Cao, Haohang Li, Yangyang Yu, Shashidhar Reddy Javaji, Yueru He, Jimin Huang, Zining Zhu, Qianqian Xie, Xiao-yang Liu, Koduvayur Subbalakshmi, Meikang Qiu, Sophia Ananiadou, Jian-Yun Nie
cs.AI
Abstract
Audio Large Language Models (AudioLLMs) have received widespread attention
and have significantly improved performance on audio tasks such as
conversation, audio understanding, and automatic speech recognition (ASR).
Despite these advancements, there is an absence of a benchmark for assessing
AudioLLMs in financial scenarios, where audio data, such as earnings conference
calls and CEO speeches, are crucial resources for financial analysis and
investment decisions. In this paper, we introduce FinAudio, the first
benchmark designed to evaluate the capacity of AudioLLMs in the financial
domain. We first define three tasks based on the unique characteristics of the
financial domain: 1) ASR for short financial audio, 2) ASR for long financial
audio, and 3) summarization of long financial audio. We then curate two
short-audio and two long-audio datasets and develop a novel dataset for
financial audio summarization; together these constitute the FinAudio benchmark.
Finally, we evaluate seven prevalent AudioLLMs on FinAudio. Our
evaluation reveals the limitations of existing AudioLLMs in the financial
domain and offers insights for improving AudioLLMs. All datasets and code will
be released.