HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel

Abstract

Sequential LLM agents struggle with long-horizon planning under hard constraints such as budgets and diversity requirements. As planning progresses and the context grows, these agents drift away from global constraints. We propose HiMAP-Travel, a hierarchical multi-agent framework that splits planning into strategic coordination and parallel day-level execution. A Coordinator allocates resources across days, while Day Executors plan their days independently and in parallel. Three key mechanisms enable this: a transactional monitor that enforces budget and uniqueness constraints across parallel agents, a bargaining protocol that lets agents reject infeasible sub-goals and trigger re-planning, and a single policy trained with GRPO that drives all agents through role conditioning. On TravelPlanner, HiMAP-Travel with Qwen3-8B achieves a Final Pass Rate (FPR) of 52.78% on validation and 52.65% on test. In a controlled comparison with identical model, training, and tools, it outperforms the sequential DeepTravel baseline by +8.67 pp. It also surpasses ATLAS by +17.65 pp and MTP by +10.0 pp. On FlexTravelBench multi-turn scenarios, it achieves 44.34% (2-turn) and 37.42% (3-turn) FPR while reducing latency by 2.5× through parallelization.
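To make the transactional monitor idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the monitor is a lock-protected shared object, that each Day Executor proposes (venue, cost) pairs, and that a proposal is committed only if it keeps the trip within the global budget and does not reuse a venue. The names TransactionalMonitor, try_commit, plan_day, and run_demo are hypothetical, as are the venues and prices.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class TransactionalMonitor:
    """Hypothetical shared monitor: checks and commits a booking atomically."""
    total_budget: float
    spent: float = 0.0
    used_venues: set = field(default_factory=set)
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def try_commit(self, venue: str, cost: float) -> bool:
        # The check and the update happen under one lock, so two parallel
        # executors can never both book the same venue or jointly overspend.
        with self._lock:
            if venue in self.used_venues:
                return False  # uniqueness constraint violated
            if self.spent + cost > self.total_budget:
                return False  # global budget constraint violated
            self.used_venues.add(venue)
            self.spent += cost
            return True


def plan_day(monitor, candidates):
    """Stand-in for a Day Executor: commit the first candidate the monitor accepts."""
    for venue, cost in candidates:
        if monitor.try_commit(venue, cost):
            return venue
    # No feasible option: in the full system this is where an executor
    # would reject its sub-goal and ask the Coordinator to re-plan.
    return None


def run_demo():
    monitor = TransactionalMonitor(total_budget=100.0)
    # Illustrative data only: every day prefers the same venue.
    day_candidates = [
        [("Cafe A", 40.0), ("Diner B", 30.0)],
        [("Cafe A", 40.0), ("Bistro C", 50.0)],
        [("Cafe A", 40.0), ("Grill D", 45.0)],
    ]
    with ThreadPoolExecutor(max_workers=3) as pool:
        picks = list(pool.map(lambda c: plan_day(monitor, c), day_candidates))
    return monitor, picks


if __name__ == "__main__":
    monitor, picks = run_demo()
    print(picks, monitor.spent)
```

Which executor wins the contested venue depends on thread scheduling, so the exact picks vary between runs. What the sketch guarantees is the invariant: no venue is committed twice and the total committed cost never exceeds the budget.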
