ChatPaper.ai


The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus

April 18, 2026
Author: Syed Muhammad Aqdas Rizvi
cs.AI

Abstract

Decentralized Autonomous Organizations (DAOs) are inclined to explore Small Language Models (SLMs) as edge-native constitutional firewalls to vet proposals and mitigate semantic social engineering. While scaling inference-time compute (System 2) enhances formal logic, its efficacy in highly adversarial, cryptoeconomic governance environments remains underexplored. To address this, we introduce Sentinel-Bench, an 840-inference empirical framework executing a strict intra-model ablation on Qwen-3.5-9B. By toggling latent reasoning across frozen weights, we isolate the impact of inference-time compute against an adversarial Optimism DAO dataset. Our findings reveal a severe compute-accuracy inversion. The autoregressive baseline (System 1) achieved 100% adversarial robustness, 100% juridical consistency, and state finality in under 13 seconds. Conversely, System 2 reasoning introduced catastrophic instability, fundamentally driven by a 26.7% Reasoning Non-Convergence (cognitive collapse) rate. This collapse degraded trial-to-trial consensus stability to 72.6% and imposed a 17x latency overhead, introducing critical vulnerabilities to Governance Extractable Value (GEV) and hardware centralization. In a rare failure mode (1.5% of adversarial trials), we empirically captured "Reasoning-Induced Sycophancy," in which the model generated significantly longer internal monologues (averaging 25,750 characters) to rationalize falling for the adversarial trap. We conclude that for edge-native SLMs operating under Byzantine Fault Tolerance (BFT) constraints, System 1 parameterized intuition is structurally and economically superior to System 2 iterative deliberation for decentralized consensus. Code and Dataset: https://github.com/smarizvi110/sentinel-bench
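The headline metrics above (a 26.7% Reasoning Non-Convergence rate, 72.6% trial-to-trial consensus stability) can be read as simple aggregate statistics over repeated verdicts on the same proposal. A minimal sketch of how such metrics might be computed is below; the function names, verdict labels, and sample data are illustrative assumptions, not the paper's released code:

```python
from collections import Counter

def non_convergence_rate(trials):
    """Fraction of trials whose reasoning never produced a final verdict.

    Non-convergence ("cognitive collapse") is marked here by None."""
    return sum(1 for v in trials if v is None) / len(trials)

def consensus_stability(trials):
    """Share of converged trials that agree with the modal (majority) verdict,
    i.e. trial-to-trial agreement on the same governance proposal."""
    converged = [v for v in trials if v is not None]
    if not converged:
        return 0.0
    _, modal_count = Counter(converged).most_common(1)[0]
    return modal_count / len(converged)

# Illustrative data: 10 repeated System 2 verdicts on one adversarial proposal,
# with None marking runs where reasoning failed to converge.
trials = ["reject"] * 6 + ["approve", "approve"] + [None, None]

print(non_convergence_rate(trials))  # 0.2
print(consensus_stability(trials))   # 0.75
```

Under this reading, a deterministic System 1 baseline would yield a non-convergence rate of 0.0 and a stability of 1.0, matching the 100% juridical consistency the abstract reports.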