Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework
July 7, 2025
Authors: Satyapriya Krishna, Ninareh Mehrabi, Abhinav Mohanty, Matteo Memelli, Vincent Ponzo, Payal Motwani, Rahul Gupta
cs.AI
Abstract
Nova Premier is Amazon's most capable multimodal foundation model and teacher
for model distillation. It processes text, images, and video with a
one-million-token context window, enabling analysis of large codebases,
400-page documents, and 90-minute videos in a single prompt. We present the
first comprehensive evaluation of Nova Premier's critical risk profile under
the Frontier Model Safety Framework. Evaluations target three high-risk domains
-- Chemical, Biological, Radiological & Nuclear (CBRN), Offensive Cyber
Operations, and Automated AI R&D -- and combine automated benchmarks, expert
red-teaming, and uplift studies to determine whether the model exceeds release
thresholds. We summarize our methodology and report core findings. Based on
this evaluation, we find that Nova Premier is safe for public release, in line
with the commitments we made at the 2025 Paris AI Safety Summit. We will continue to
enhance our safety evaluation and mitigation pipelines as new risks and
capabilities associated with frontier models are identified.