Frontier AI Regulation: Managing Emerging Risks to Public Safety
July 6, 2023
Authors: Markus Anderljung, Joslyn Barnhart, Jade Leung, Anton Korinek, Cullen O'Keefe, Jess Whittlestone, Shahar Avin, Miles Brundage, Justin Bullock, Duncan Cass-Beggs, Ben Chang, Tantum Collins, Tim Fist, Gillian Hadfield, Alan Hayes, Lewis Ho, Sara Hooker, Eric Horvitz, Noam Kolt, Jonas Schuett, Yonadav Shavit, Divya Siddarth, Robert Trager, Kevin Wolf
cs.AI
Abstract
Advanced AI models hold the promise of tremendous benefits for humanity, but
society needs to proactively manage the accompanying risks. In this paper, we
focus on what we term "frontier AI" models: highly capable foundation models
that could possess dangerous capabilities sufficient to pose severe risks to
public safety. Frontier AI models pose a distinct regulatory challenge:
dangerous capabilities can arise unexpectedly; it is difficult to robustly
prevent a deployed model from being misused; and it is difficult to stop a
model's capabilities from proliferating broadly. To address these challenges,
at least three building blocks for the regulation of frontier models are
needed: (1) standard-setting processes to identify appropriate requirements for
frontier AI developers, (2) registration and reporting requirements to provide
regulators with visibility into frontier AI development processes, and (3)
mechanisms to ensure compliance with safety standards for the development and
deployment of frontier AI models. Industry self-regulation is an important
first step. However, wider societal discussions and government intervention
will be needed to create standards and to ensure compliance with them. We
consider several options to this end, including granting enforcement powers to
supervisory authorities and establishing licensure regimes for frontier AI models. Finally,
we propose an initial set of safety standards. These include conducting
pre-deployment risk assessments; external scrutiny of model behavior; using
risk assessments to inform deployment decisions; and monitoring and responding
to new information about model capabilities and uses post-deployment. We hope
this discussion contributes to the broader conversation on how to balance
public safety risks and innovation benefits from advances at the frontier of AI
development.