Learning to Act and Cooperate for Distributed Black-Box Consensus Optimization

Abstract

Distributed black-box consensus optimization is a fundamental problem in multi-agent systems, where agents must improve a global objective using only local objective queries and limited neighbor communication. Existing methods largely rely on handcrafted update rules and static cooperation patterns, which often struggle to balance local adaptation, global coordination, and communication efficiency in heterogeneous nonconvex environments. In this paper, we take an initial step toward trajectory-driven self-design for distributed black-box consensus optimization. We first redesign the agent-level swarm dynamics with an adaptive internal mechanism tailored to decentralized consensus settings, improving the balance between exploration, convergence, and local escape. Built on top of this adaptive execution layer, we propose Learning to Act and Cooperate (LAC-MAS), a trajectory-driven framework in which large language models provide sparse high-level guidance for shaping both agent-internal action behaviors and agent-external cooperation patterns from historical optimization trajectories. We further introduce a phased cognitive scheduling strategy to activate different forms of adaptation in a resource-aware manner. Experiments on standard distributed black-box benchmarks and real-world distributed tasks show that LAC-MAS consistently improves solution quality, convergence efficiency, and communication efficiency over strong baselines, suggesting a practical route from handcrafted distributed coordination toward self-designing multi-agent optimization systems.
