The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus

Abstract

Decentralized Autonomous Organizations (DAOs) are inclined to explore Small Language Models (SLMs) as edge-native constitutional firewalls to vet proposals and mitigate semantic social engineering. While scaling inference-time compute (System 2) enhances formal logic, its efficacy in highly adversarial, cryptoeconomic governance environments remains underexplored. To address this, we introduce Sentinel-Bench, an 840-inference empirical framework that executes a strict intra-model ablation on Qwen-3.5-9B. By toggling latent reasoning across frozen weights, we isolate the impact of inference-time compute against an adversarial Optimism DAO dataset. Our findings reveal a severe compute-accuracy inversion. The autoregressive baseline (System 1) achieved 100% adversarial robustness and 100% juridical consistency, and reached state finality in under 13 seconds. Conversely, System 2 reasoning introduced catastrophic instability, fundamentally driven by a 26.7% Reasoning Non-Convergence (cognitive collapse) rate. This collapse degraded trial-to-trial consensus stability to 72.6% and imposed a 17x latency overhead, introducing critical vulnerabilities to Governance Extractable Value (GEV) and hardware centralization. We also empirically captured "Reasoning-Induced Sycophancy," a rare failure mode (1.5% of adversarial trials) in which the model generated significantly longer internal monologues (averaging 25,750 characters) to rationalize failing the adversarial trap. We conclude that for edge-native SLMs operating under Byzantine Fault Tolerance (BFT) constraints, System 1 parameterized intuition is structurally and economically superior to System 2 iterative deliberation for decentralized consensus. Code and Dataset: https://github.com/smarizvi110/sentinel-bench
