HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization

Abstract

Rapid advances in Large Language Models (LLMs) are fueling the development of autonomous Multi-Agent Systems (MAS). However, current frameworks often lack flexibility, resource awareness, model diversity, and autonomous tool creation. This paper introduces HASHIRU (Hierarchical Agent System for Hybrid Intelligent Resource Utilization), a novel MAS framework designed to improve flexibility, resource efficiency, and adaptability. HASHIRU features a "CEO" agent that dynamically manages specialized "employee" agents, which are instantiated according to task needs and resource constraints (cost, memory). Its hybrid intelligence prioritizes smaller, local LLMs (via Ollama) while flexibly using external APIs and larger models when necessary. An economic model with hiring and firing costs promotes team stability and efficient resource allocation. The system also includes autonomous API tool creation and a memory function. Evaluations on academic paper review (58% success), safety assessment (100% on a JailbreakBench subset), and complex reasoning, where HASHIRU outperforms Gemini 2.0 Flash (GSM8K: 96% vs. 61%; JEEBench: 80% vs. 68.3%; SVAMP: 92% vs. 84%), demonstrate its capabilities. Case studies illustrate self-improvement through autonomous cost model generation, tool integration, and budget management. HASHIRU offers a promising approach to more robust, efficient, and adaptable MAS through dynamic hierarchical control, resource-aware hybrid intelligence, and autonomous functional extension. Source code and benchmarks are available at https://github.com/HASHIRU-AI/HASHIRU and https://github.com/HASHIRU-AI/HASHIRUBench respectively, and a live demo is available at https://hashiruagentx-hashiruai.hf.space upon request.
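
The resource-aware hiring described in the abstract can be illustrated with a small sketch. This is not HASHIRU's implementation: the class names (`ModelSpec`, `Ceo`), the `skill` score, and all cost and memory figures are hypothetical, chosen only to show how a manager might prefer local models, respect cost and memory limits, and charge hiring and firing costs to discourage churn.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical description of a model an employee agent could run on."""
    name: str
    is_local: bool     # True for a locally served model, False for an external API
    memory_gb: float   # memory needed to host the model (0 for API models)
    call_cost: float   # illustrative usage cost (0 for local models)
    skill: int         # crude capability score, an assumption of this sketch


@dataclass
class Ceo:
    """Hypothetical manager that tracks a cost budget and a memory budget."""
    budget: float
    memory_gb: float
    hire_cost: float = 1.0   # assumed one-time cost charged per hire
    fire_cost: float = 0.5   # assumed one-time cost charged per firing
    team: list = field(default_factory=list)

    def hire(self, candidates, difficulty):
        """Hire the cheapest adequate model, trying local models first."""
        adequate = [m for m in candidates if m.skill >= difficulty]
        # Local models sort first (False < True), then by usage cost.
        for m in sorted(adequate, key=lambda m: (not m.is_local, m.call_cost)):
            total = self.hire_cost + m.call_cost
            if total <= self.budget and m.memory_gb <= self.memory_gb:
                self.budget -= total
                self.memory_gb -= m.memory_gb
                self.team.append(m)
                return m
        return None  # nothing fits within the remaining resources

    def fire(self, employee):
        """Release an employee; the firing cost penalizes needless churn."""
        if employee not in self.team or self.fire_cost > self.budget:
            return False
        self.team.remove(employee)
        self.budget -= self.fire_cost
        self.memory_gb += employee.memory_gb  # hosting memory is freed
        return True


candidates = [
    ModelSpec("small-local", True, 4.0, 0.0, 1),
    ModelSpec("large-api", False, 0.0, 2.0, 3),
]
ceo = Ceo(budget=5.0, memory_gb=8.0)
easy = ceo.hire(candidates, difficulty=1)  # the local model is adequate
hard = ceo.hire(candidates, difficulty=3)  # only the API model is adequate
```

In this toy setup the easy task is served by the local model and the harder task falls back to the external API. Because each hire and each firing draws from the same budget, repeatedly replacing agents becomes expensive, which is the stabilizing effect the economic model is meant to provide.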
