HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization
June 1, 2025
Authors: Kunal Pai, Parth Shah, Harshil Patel
cs.AI
Abstract
Rapid advances in Large Language Models (LLMs) are fueling the development of
autonomous Multi-Agent Systems (MAS). However, current frameworks often lack
flexibility, resource awareness, model diversity, and autonomous tool creation.
This paper introduces HASHIRU (Hierarchical Agent System for Hybrid Intelligent
Resource Utilization), a novel MAS framework enhancing flexibility, resource
efficiency, and adaptability. HASHIRU features a "CEO" agent dynamically
managing specialized "employee" agents, instantiated based on task needs and
resource constraints (cost, memory). Its hybrid intelligence prioritizes
smaller, local LLMs (via Ollama) while flexibly using external APIs and larger
models when necessary. An economic model with hiring/firing costs promotes team
stability and efficient resource allocation. The system also includes
autonomous API tool creation and a memory function. Evaluations on tasks like
academic paper review (58% success), safety assessments (100% on a
JailbreakBench subset), and complex reasoning (outperforming Gemini 2.0 Flash
on GSM8K: 96% vs. 61%; JEEBench: 80% vs. 68.3%; SVAMP: 92% vs. 84%) demonstrate
HASHIRU's capabilities. Case studies illustrate its self-improvement via
autonomous cost model generation, tool integration, and budget management.
HASHIRU offers a promising approach for more robust, efficient, and adaptable
MAS through dynamic hierarchical control, resource-aware hybrid intelligence,
and autonomous functional extension. Source code and benchmarks are available
at https://github.com/HASHIRU-AI/HASHIRU and
https://github.com/HASHIRU-AI/HASHIRUBench respectively, and a live demo is
available at https://hashiruagentx-hashiruai.hf.space upon request.
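
The abstract describes a "CEO" agent that instantiates "employee" agents under cost and memory constraints, prefers local models, and pays hiring/firing costs. The following is a minimal sketch of that idea, not HASHIRU's actual implementation: the class names `ModelSpec` and `CEO`, the `capability` score, the model labels, and all numeric costs are hypothetical, and the abstract does not specify the real cost formula.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical description of a model an employee agent could run on."""
    name: str
    is_local: bool      # True for a locally hosted model, False for an external API
    hire_cost: float    # assumed one-time charge when the agent is instantiated
    fire_cost: float    # assumed one-time charge when the agent is dismissed
    memory_gb: float    # memory held while the agent is alive
    capability: int     # assumed coarse capability score (higher is stronger)


@dataclass
class CEO:
    """Toy manager that hires employees under budget and memory limits."""
    budget: float
    memory_limit_gb: float
    team: list[ModelSpec] = field(default_factory=list)

    def memory_in_use(self) -> float:
        return sum(m.memory_gb for m in self.team)

    def can_afford(self, spec: ModelSpec) -> bool:
        fits_budget = spec.hire_cost <= self.budget
        fits_memory = self.memory_in_use() + spec.memory_gb <= self.memory_limit_gb
        return fits_budget and fits_memory

    def assign(self, required_capability: int, catalog: list[ModelSpec]) -> ModelSpec | None:
        # Reuse an existing employee when possible: no hiring cost, no team churn.
        for spec in self.team:
            if spec.capability >= required_capability:
                return spec
        # Otherwise consider capable models, local ones first, then cheaper ones.
        candidates = sorted(
            (m for m in catalog if m.capability >= required_capability),
            key=lambda m: (not m.is_local, m.hire_cost),
        )
        for spec in candidates:
            if self.can_afford(spec):
                self.budget -= spec.hire_cost
                self.team.append(spec)
                return spec
        return None  # nothing capable fits the remaining budget and memory

    def fire(self, spec: ModelSpec) -> None:
        # Dismissing an employee frees memory but is charged against the budget.
        self.team.remove(spec)
        self.budget -= spec.fire_cost


if __name__ == "__main__":
    catalog = [
        ModelSpec("small-local", True, 1.0, 0.5, 4.0, 1),
        ModelSpec("large-local", True, 3.0, 1.0, 16.0, 2),
        ModelSpec("external-api", False, 5.0, 0.0, 0.0, 3),
    ]
    ceo = CEO(budget=10.0, memory_limit_gb=8.0)
    print(ceo.assign(1, catalog))
    print(ceo.assign(2, catalog))
    print(ceo.budget)
```

In this sketch the second request falls back to the external model only because the larger local model would exceed the memory limit, which mirrors the local-first, escalate-when-necessary policy described above. Charging a cost on both hiring and firing makes reusing an existing employee the cheapest option, which is one simple way such costs could discourage team churn.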