MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

October 18, 2023
Authors: Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian
cs.AI

Abstract

AI-empowered music processing is a diverse field that encompasses dozens of tasks, ranging from generation tasks (e.g., timbre synthesis) to comprehension tasks (e.g., music classification). For developers and amateurs, it is very difficult to grasp all of these tasks well enough to meet their music processing needs, especially given the large differences across tasks in how music data is represented and in which platforms the models run on. Consequently, it is necessary to build a system that organizes and integrates these tasks, helping practitioners automatically analyze their demands and call suitable tools to fulfill them. Inspired by the recent success of large language models (LLMs) in task automation, we develop a system named MusicAgent, which integrates numerous music-related tools and an autonomous workflow to address user requirements. More specifically, we build 1) a toolset that collects tools from diverse sources, including Hugging Face, GitHub, and Web APIs; and 2) an autonomous workflow empowered by LLMs (e.g., ChatGPT) that organizes these tools, automatically decomposes user requests into multiple sub-tasks, and invokes the corresponding music tools. The primary goal of this system is to free users from the intricacies of AI music tools, enabling them to concentrate on the creative aspect. By granting users the freedom to effortlessly combine tools, the system offers a seamless and enriching music experience.
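The two components named in the abstract, a toolset and a workflow that decomposes a request into sub-tasks and invokes a tool for each, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tool names, the `Tool` record, and the functions `build_toolset`, `plan`, and `run_workflow` are all hypothetical, the tools are placeholders that return strings, and the planner is a simple keyword matcher standing in for the LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str                   # hypothetical tool identifier
    task: str                   # sub-task type this tool handles
    source: str                 # where the tool comes from, e.g. "Hugging Face"
    run: Callable[[str], str]   # placeholder for the real model or API call


def build_toolset() -> Dict[str, Tool]:
    """Register placeholder tools, keyed by the sub-task they serve."""
    tools = [
        Tool("demo-classifier", "music-classification", "Hugging Face",
             lambda x: f"genre-label({x})"),
        Tool("demo-synth", "timbre-synthesis", "GitHub",
             lambda x: f"audio({x})"),
    ]
    return {t.task: t for t in tools}


def plan(request: str) -> List[str]:
    """Stand-in for the LLM planner: keyword matching, not a real LLM call."""
    keywords = {
        "classify": "music-classification",
        "synthesize": "timbre-synthesis",
    }
    text = request.lower()
    return [task for word, task in keywords.items() if word in text]


def run_workflow(request: str, payload: str) -> List[str]:
    """Decompose the request into sub-tasks and invoke one tool per sub-task.

    For simplicity, every sub-task receives the same input payload.
    """
    toolset = build_toolset()
    results = []
    for task in plan(request):
        tool = toolset[task]
        results.append(f"{tool.name}: {tool.run(payload)}")
    return results


if __name__ == "__main__":
    for line in run_workflow(
        "Classify this clip, then synthesize a new timbre", "clip.wav"
    ):
        print(line)
```

In a real system, `plan` would presumably be replaced by a prompt to an LLM that returns a structured list of sub-tasks, and each `run` callable would wrap an actual model or Web API.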