HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization

Abstract

Rapid Large Language Model (LLM) advancements are fueling autonomous Multi-Agent System (MAS) development. However, current frameworks often lack flexibility, resource awareness, model diversity, and autonomous tool creation. This paper introduces HASHIRU (Hierarchical Agent System for Hybrid Intelligent Resource Utilization), a novel MAS framework enhancing flexibility, resource efficiency, and adaptability. HASHIRU features a "CEO" agent dynamically managing specialized "employee" agents, instantiated based on task needs and resource constraints (cost, memory). Its hybrid intelligence prioritizes smaller, local LLMs (via Ollama) while flexibly using external APIs and larger models when necessary. An economic model with hiring/firing costs promotes team stability and efficient resource allocation. The system also includes autonomous API tool creation and a memory function. Evaluations on tasks like academic paper review (58% success), safety assessments (100% on a JailbreakBench subset), and complex reasoning (outperforming Gemini 2.0 Flash on GSM8K: 96% vs. 61%; JEEBench: 80% vs. 68.3%; SVAMP: 92% vs. 84%) demonstrate HASHIRU's capabilities. Case studies illustrate its self-improvement via autonomous cost model generation, tool integration, and budget management. HASHIRU offers a promising approach for more robust, efficient, and adaptable MAS through dynamic hierarchical control, resource-aware hybrid intelligence, and autonomous functional extension. Source code and benchmarks are available at https://github.com/HASHIRU-AI/HASHIRU and https://github.com/HASHIRU-AI/HASHIRUBench respectively, and a live demo is available at https://hashiruagentx-hashiruai.hf.space upon request.
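
The resource-aware hiring and firing described in the abstract can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not HASHIRU's actual implementation: the names `AgentSpec`, `Ceo`, `hire`, `fire`, and `pick`, the cost fields, and the "local first, then cheapest" ordering are all hypothetical and are not taken from the HASHIRU source code.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    """Hypothetical description of an employee agent (not HASHIRU's schema)."""
    name: str
    hire_cost: float   # one-time cost charged when the agent is instantiated
    fire_cost: float   # one-time cost charged when the agent is removed
    memory_gb: float   # memory held while the agent is active
    is_local: bool     # True for a local model, False for an external API


@dataclass
class Ceo:
    """Hypothetical manager that tracks a cost budget and a memory budget."""
    budget: float
    memory_gb: float
    team: dict = field(default_factory=dict)

    def hire(self, spec: AgentSpec) -> bool:
        # Refuse duplicates and anything that exceeds either constraint.
        if spec.name in self.team:
            return False
        if spec.hire_cost > self.budget or spec.memory_gb > self.memory_gb:
            return False
        self.budget -= spec.hire_cost
        self.memory_gb -= spec.memory_gb
        self.team[spec.name] = spec
        return True

    def fire(self, name: str) -> bool:
        # Firing also costs budget, which discourages needless team churn.
        spec = self.team.get(name)
        if spec is None or spec.fire_cost > self.budget:
            return False
        del self.team[name]
        self.budget -= spec.fire_cost
        self.memory_gb += spec.memory_gb
        return True

    def pick(self, candidates: list) -> "AgentSpec | None":
        # Assumed policy: try local models first, then cheaper hires first.
        ordered = sorted(candidates, key=lambda s: (not s.is_local, s.hire_cost))
        for spec in ordered:
            if self.hire(spec):
                return spec
        return None
```

In this sketch, a `Ceo` with a budget of 10 and 8 GB of memory that is offered one local and one external candidate hires the local one first, and falls back to the external candidate only when a local model no longer fits in memory. Because `fire` deducts from the same budget as `hire`, replacing an agent is never free, which is one simple way a cost model can favor a stable team.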
