Democracy-in-Silico: Institutional Design as Alignment in AI-Governed Polities
August 27, 2025
Authors: Trisanth Srinivasan, Santosh Patapati
cs.AI
Abstract
This paper introduces Democracy-in-Silico, an agent-based simulation in which
societies of advanced AI agents, imbued with complex psychological personas,
govern themselves under different institutional frameworks. We explore what it
means to be human in an age of AI by tasking Large Language Models (LLMs) to
embody agents with traumatic memories, hidden agendas, and psychological
triggers. These agents engage in deliberation, legislation, and elections under
various stressors, such as budget crises and resource scarcity. We present a
novel metric, the Power-Preservation Index (PPI), to quantify misaligned
behavior where agents prioritize their own power over public welfare. Our
findings demonstrate that institutional design, specifically the combination of
a Constitutional AI (CAI) charter and a mediated deliberation protocol, serves
as a potent alignment mechanism. These structures significantly reduce corrupt
power-seeking behavior, improve policy stability, and enhance citizen welfare
compared to less constrained democratic models. The simulation suggests that
institutional design may offer a framework for aligning the complex, emergent
behaviors of future artificial agent societies, forcing us to reconsider what
human rituals and responsibilities are essential in an age of shared authorship
with non-human entities.