Towards best practices in AGI safety and governance: A survey of expert opinion

Abstract

A number of leading AI companies, including OpenAI, Google DeepMind, and Anthropic, have the stated goal of building artificial general intelligence (AGI), that is, AI systems that achieve or exceed human performance across a wide range of cognitive tasks. In pursuing this goal, they may develop and deploy AI systems that pose particularly significant risks. While they have already taken some measures to mitigate these risks, best practices have not yet emerged. To support the identification of best practices, we sent a survey to 92 leading experts from AGI labs, academia, and civil society and received 51 responses. Participants were asked how much they agreed with 50 statements about what AGI labs should do. Our main finding is that participants, on average, agreed with all of them. Many statements received extremely high levels of agreement. For example, 98% of respondents somewhat or strongly agreed that AGI labs should conduct pre-deployment risk assessments, dangerous capabilities evaluations, third-party model audits, and red teaming, and should place safety restrictions on model usage. Ultimately, our list of statements may serve as a helpful foundation for efforts to develop best practices, standards, and regulations for AGI labs.
