Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework

July 7, 2025
Authors: Satyapriya Krishna, Ninareh Mehrabi, Abhinav Mohanty, Matteo Memelli, Vincent Ponzo, Payal Motwani, Rahul Gupta
cs.AI

Abstract

Nova Premier is Amazon's most capable multimodal foundation model and teacher for model distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework. Evaluations target three high-risk domains -- Chemical, Biological, Radiological & Nuclear (CBRN), Offensive Cyber Operations, and Automated AI R&D -- and combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report core findings. Based on this evaluation, we find that Nova Premier is safe for public release as per our commitments made at the 2025 Paris AI Safety Summit. We will continue to enhance our safety evaluation and mitigation pipelines as new risks and capabilities associated with frontier models are identified.