Credal Concept Bottleneck Models for Epistemic-Aleatoric Uncertainty Decomposition

Abstract

Concept Bottleneck Models (CBMs) predict through human-interpretable concepts, but they typically output point concept probabilities that conflate epistemic uncertainty (reducible model underspecification) with aleatoric uncertainty (irreducible input ambiguity). This makes concept-level uncertainty hard to interpret and, more importantly, hard to act upon. We introduce CREDENCE (Credal Ensemble Concept Estimation), a CBM framework that decomposes concept uncertainty by construction. CREDENCE represents each concept as a credal prediction (a probability interval), derives epistemic uncertainty from disagreement across diverse concept heads, and estimates aleatoric uncertainty via a dedicated ambiguity output trained to match annotator disagreement when available. The resulting signals support prescriptive decisions: automate low-uncertainty cases, prioritize data collection for high-epistemic cases, route high-aleatoric cases to human review, and abstain when both are high. Across several tasks, we show that epistemic uncertainty is positively associated with prediction errors, whereas aleatoric uncertainty closely tracks annotator disagreement, providing guidance beyond error correlation. Our implementation is available at the following link: https://github.com/Tankiit/Credal_Sets/tree/ensemble-credal-cbm
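
The decomposition and the four prescriptive decisions described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CREDENCE implementation: the credal interval is taken as the minimum and maximum probability across concept heads, epistemic uncertainty as the width of that interval, and the aleatoric score as a number in [0, 1] supplied by a separate ambiguity output. The function names and both thresholds are hypothetical and do not come from the paper or its repository.

```python
from typing import Sequence, Tuple


def credal_interval(head_probs: Sequence[float]) -> Tuple[float, float]:
    """Return (lower, upper) concept probability across ensemble heads.

    Assumption: the interval is the min/max envelope of the head outputs.
    """
    if not head_probs:
        raise ValueError("at least one concept head is required")
    return min(head_probs), max(head_probs)


def epistemic_uncertainty(head_probs: Sequence[float]) -> float:
    """Width of the credal interval, used here as a proxy for head disagreement."""
    lower, upper = credal_interval(head_probs)
    return upper - lower


def route(
    epistemic: float,
    aleatoric: float,
    eps_threshold: float = 0.2,  # hypothetical cut-off, not from the paper
    ale_threshold: float = 0.3,  # hypothetical cut-off, not from the paper
) -> str:
    """Map the two uncertainty signals to one of the four actions."""
    high_epistemic = epistemic > eps_threshold
    high_aleatoric = aleatoric > ale_threshold
    if high_epistemic and high_aleatoric:
        return "abstain"
    if high_epistemic:
        return "collect_data"
    if high_aleatoric:
        return "human_review"
    return "automate"


if __name__ == "__main__":
    heads = [0.25, 0.5, 0.75]  # one concept, three heads that disagree
    eps = epistemic_uncertainty(heads)
    print(credal_interval(heads), eps, route(eps, aleatoric=0.125))
```

In this sketch the heads disagree widely while the ambiguity score is low, so the case falls into the data-collection branch; a case with tight head agreement but a high ambiguity score would instead be sent to human review.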