AudioBERT: Audio Knowledge Augmented Language Model
September 12, 2024
Authors: Hyunjong Ok, Suho Yoo, Jaeho Lee
cs.AI
Abstract
Recent studies have identified that language models, pretrained on text-only datasets, often lack elementary visual knowledge, e.g., the colors of everyday objects. Motivated by this observation, we ask whether a similar shortcoming exists in terms of auditory knowledge. To answer this question, we construct a new dataset called AuditoryBench, which consists of two novel tasks for evaluating auditory knowledge. Based on our analysis using the benchmark, we find that language models also suffer from a severe lack of auditory knowledge. To address this limitation, we propose AudioBERT, a novel method to augment the auditory knowledge of BERT through a retrieval-based approach. First, we detect auditory knowledge spans in prompts so that our retrieval model can be queried efficiently. Then, we inject the retrieved audio knowledge into BERT, switching on low-rank adaptation only when audio knowledge is required. Our experiments demonstrate that AudioBERT is quite effective, achieving superior performance on AuditoryBench. The dataset and code are available at https://github.com/HJ-Ok/AudioBERT.
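
The abstract describes a pipeline of span detection, audio retrieval, and conditional low-rank adaptation. The following is a minimal sketch of that control flow, assuming the Hugging Face transformers and peft libraries; the keyword-based detect_auditory_span() and the elided retrieval/injection step are hypothetical placeholders for illustration, not the paper's actual components.

```python
# Illustrative sketch: enable LoRA adapters on BERT only when a prompt is
# detected to require auditory knowledge, as the abstract outlines.
import torch
from transformers import BertForMaskedLM, BertTokenizerFast
from peft import LoraConfig, get_peft_model

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = get_peft_model(
    BertForMaskedLM.from_pretrained("bert-base-uncased"),
    LoraConfig(r=8, lora_alpha=16, target_modules=["query", "value"]),
)

SOUND_CUES = ("sound", "bark", "meow", "siren", "pitch")  # toy stand-in

def detect_auditory_span(prompt: str):
    """Placeholder span detector; the paper trains a dedicated model."""
    return next((cue for cue in SOUND_CUES if cue in prompt.lower()), None)

@torch.no_grad()
def fill_mask(prompt: str) -> str:
    span = detect_auditory_span(prompt)
    if span is None:
        model.disable_adapter_layers()   # plain BERT path
    else:
        model.enable_adapter_layers()    # audio-aware path
        # Here the method would retrieve an audio embedding for `span`
        # and inject it into BERT (injection mechanism elided).
    inputs = tokenizer(prompt, return_tensors="pt")
    logits = model(**inputs).logits
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0]
    return tokenizer.decode(logits[0, mask_pos].argmax(dim=-1))

print(fill_mask("The sound a dog makes is called a [MASK]."))
```

The design point the abstract emphasizes is that the adapters are switched on only for prompts that need auditory knowledge, so the model's behavior on other prompts is left unchanged.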