Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion
February 3, 2026
Authors: Oscar Ovanger, Levi Harris, Timothy H. Keitt
cs.AI
Abstract
Many machine learning systems have access to multiple sources of evidence for the same prediction target, yet these sources often differ in reliability and informativeness across inputs. In bioacoustic classification, species identity may be inferred both from the acoustic signal and from spatiotemporal context such as location and season; while Bayesian inference motivates multiplicative evidence combination, in practice we typically only have access to discriminative predictors rather than calibrated generative models. We introduce Fusion under INdependent Conditional Hypotheses (FINCH), an adaptive log-linear evidence fusion framework that integrates a pre-trained audio classifier with a structured spatiotemporal predictor. FINCH learns a per-sample gating function that estimates the reliability of contextual information from uncertainty and informativeness statistics. The resulting fusion family contains the audio-only classifier as a special case and explicitly bounds the influence of contextual evidence, yielding a risk-contained hypothesis class with an interpretable audio-only fallback. Across benchmarks, FINCH consistently outperforms fixed-weight fusion and audio-only baselines, improving robustness and error trade-offs even when contextual information is weak in isolation. We achieve state-of-the-art performance on CBI and competitive or improved performance on several subsets of BirdSet using a lightweight, interpretable, evidence-based approach. Code is available: \href{https://anonymous.4open.science/r/birdnoise-85CD/README.md}{anonymous-repository}
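The multiplicative (log-linear) combination with a bounded per-sample gate described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the entropy-based `gate` heuristic and the function names are hypothetical stand-ins for the learned gating function, shown only to make the structure concrete. Note that `alpha = 0` recovers the audio-only classifier, and capping `alpha` at `alpha_max` bounds the influence of contextual evidence.

```python
import numpy as np

def log_linear_fuse(audio_logprobs, context_logprobs, alpha):
    """Multiplicative evidence combination in log space:
    log p_fused(y) ∝ log p_audio(y) + alpha * log p_context(y).
    alpha = 0 falls back to the audio-only classifier."""
    fused = audio_logprobs + alpha * context_logprobs
    fused -= np.max(fused)            # numerical stability before exponentiation
    probs = np.exp(fused)
    return probs / probs.sum()

def gate(context_probs, alpha_max=1.0):
    """Hypothetical gating heuristic (the paper learns this mapping):
    weight context by its informativeness, measured here as one minus
    the normalized entropy of the context distribution, capped at alpha_max."""
    k = len(context_probs)
    entropy = -np.sum(context_probs * np.log(context_probs + 1e-12))
    informativeness = 1.0 - entropy / np.log(k)
    return alpha_max * informativeness
```

With a near-uniform context distribution the gate stays near zero and the fused prediction matches the audio classifier; with a peaked context distribution the gate opens and contextual evidence sharpens the fused posterior.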