Music Arena: Live Evaluation for Text-to-Music
July 28, 2025
Authors: Yonghyun Kim, Wayne Chi, Anastasios N. Angelopoulos, Wei-Lin Chiang, Koichi Saito, Shinji Watanabe, Yuki Mitsufuji, Chris Donahue
cs.AI
Abstract
We present Music Arena, an open platform for scalable human preference
evaluation of text-to-music (TTM) models. Soliciting human preferences via
listening studies is the gold standard for evaluation in TTM, but these studies
are expensive to conduct and difficult to compare, as study protocols may
differ across systems. Moreover, human preferences might help researchers align
their TTM systems or improve automatic evaluation metrics, but an open and
renewable source of preferences does not currently exist. We aim to fill these
gaps by offering *live* evaluation for TTM. In Music Arena, real-world users
input text prompts of their choosing and compare outputs from two TTM systems,
and their preferences are used to compile a leaderboard. While Music Arena
follows recent evaluation trends in other AI domains, we also design it with
key features tailored to music: an LLM-based routing system to navigate the
heterogeneous type signatures of TTM systems, and the collection of *detailed*
preferences including listening data and natural language feedback. We also
propose a rolling data release policy with user privacy guarantees, providing a
renewable source of preference data and increasing platform transparency.
Through its standardized evaluation protocol, transparent data access policies,
and music-specific features, Music Arena not only addresses key challenges in
the TTM ecosystem but also demonstrates how live evaluation can be thoughtfully
adapted to unique characteristics of specific AI domains.
Music Arena is available at: https://music-arena.org
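
The abstract does not say how pairwise preferences are turned into a leaderboard. As one illustration only, the sketch below assumes a Bradley-Terry model, a common choice for arena-style rankings, fitted with a simple iterative update. The function name `bradley_terry`, the system names, and the vote counts are all hypothetical and are not taken from Music Arena.

```python
def bradley_terry(battles, iters=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Assumption: every vote has a strict winner (no ties) and the two
    systems in a battle are distinct. Returns {system: strength},
    with strengths normalized to sum to 1.
    """
    wins = {}
    pair_counts = {}
    for winner, loser in battles:
        wins[winner] = wins.get(winner, 0.0) + 1.0
        wins.setdefault(loser, 0.0)
        key = frozenset((winner, loser))
        pair_counts[key] = pair_counts.get(key, 0.0) + 1.0

    systems = sorted(wins)
    strength = {s: 1.0 / len(systems) for s in systems}

    for _ in range(iters):
        updated = {}
        for s in systems:
            denom = 0.0
            for o in systems:
                n = pair_counts.get(frozenset((s, o)), 0.0) if o != s else 0.0
                if n > 0:
                    # Each comparison between s and o contributes 1 / (p_s + p_o).
                    denom += n / (strength[s] + strength[o])
            updated[s] = wins[s] / denom if denom > 0 else strength[s]
        total = sum(updated.values())
        strength = {s: v / total for s, v in updated.items()}
    return strength


# Hypothetical votes: each tuple is (preferred system, other system).
votes = (
    [("system_a", "system_b")] * 3 + [("system_b", "system_a")] * 1
    + [("system_b", "system_c")] * 3 + [("system_c", "system_b")] * 1
    + [("system_a", "system_c")] * 3 + [("system_c", "system_a")] * 1
)
scores = bradley_terry(votes)
leaderboard = sorted(scores, key=scores.get, reverse=True)

# Two-system check: 3 wins out of 4 gives a strength of 0.75.
two_way = bradley_terry([("x", "y")] * 3 + [("y", "x")])
```

A production leaderboard would also need to handle ties, systems with no wins, and uncertainty estimates, none of which this sketch attempts.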