Prediction and Learning: Leveraging Idle-Time Compute in Proactive Agents

Abstract

AI agents demonstrate remarkable capabilities in reasoning and tool use, yet they remain fundamentally reactive: they compute responses only after an explicit user prompt. This paradigm overlooks a critical opportunity. The idle time between interactions is largely wasted, leaving agents unable to prepare for future user needs. To bridge this gap, we introduce ProAct, a proactive agent architecture that leverages idle-time compute to anticipate and fulfill likely upcoming user needs. By analyzing the evolving dialogue history together with persistent memory, ProAct predicts upcoming needs and iteratively acquires information, allowing the agent to resolve knowledge gaps and prepare evidence before the user initiates a query. To rigorously evaluate proactive capabilities, we also introduce ProActEval, a comprehensive benchmark comprising 200 scenarios across 40 domains, featuring predictable need chains and diverse user cognitive profiles. Empirical results demonstrate significant advantages over reactive baselines. On ProActEval, ProAct accelerates task completion by reducing the number of required turns by 14.8%, decreases user effort by 11.7%, and cuts hallucination rates by 28.1%. Furthermore, MemBench evaluations confirm that ProAct achieves state-of-the-art reflective accuracy, underscoring its sustained and robust performance.
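The idle-time loop described above can be illustrated with a minimal sketch. This is not the ProAct implementation: the class `ProactiveAgentSketch`, the callables `predict_needs` and `retrieve`, the toy need chain, and the exact-match lookup of prepared evidence are all hypothetical simplifications introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProactiveAgentSketch:
    """Toy illustration of idle-time preparation (hypothetical, not ProAct itself)."""

    # Hypothetical predictor: maps (dialogue history, memory) to likely next needs.
    predict_needs: Callable[[List[str], Dict[str, str]], List[str]]
    # Hypothetical tool or retrieval call that gathers evidence for one need.
    retrieve: Callable[[str], str]
    memory: Dict[str, str] = field(default_factory=dict)    # persistent memory
    history: List[str] = field(default_factory=list)        # dialogue history
    prepared: Dict[str, str] = field(default_factory=dict)  # evidence gathered while idle

    def on_idle(self, max_steps: int = 3) -> None:
        """Between turns: predict likely needs and fetch evidence for uncovered ones."""
        for _ in range(max_steps):
            needs = self.predict_needs(self.history, self.memory)
            gaps = [need for need in needs if need not in self.prepared]
            if not gaps:
                break  # every predicted need already has prepared evidence
            for need in gaps:
                self.prepared[need] = self.retrieve(need)

    def on_user_turn(self, query: str) -> str:
        """On a user turn: reuse prepared evidence if present, else retrieve reactively."""
        self.history.append(query)
        if query in self.prepared:  # assumption: exact-match lookup, for simplicity
            return f"[prepared] {self.prepared[query]}"
        return f"[reactive] {self.retrieve(query)}"


def toy_predictor(history: List[str], memory: Dict[str, str]) -> List[str]:
    # Assumed fixed need chain: a flight booking is followed by a baggage question.
    chain = {"book flight": "baggage allowance"}
    return [chain[turn] for turn in history if turn in chain]


agent = ProactiveAgentSketch(
    predict_needs=toy_predictor,
    retrieve=lambda need: f"evidence for {need}",
)
first = agent.on_user_turn("book flight")         # nothing prepared yet: reactive path
agent.on_idle()                                   # idle time: prefetch the predicted need
second = agent.on_user_turn("baggage allowance")  # served from prepared evidence
```

In this toy run, the first turn takes the reactive path and the second is answered from evidence gathered during the idle step. A real system would presumably match queries to prepared needs semantically rather than by exact string, and would bound idle-time work by a compute budget.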