Deep Researcher with Test-Time Diffusion

Abstract

Deep research agents, powered by Large Language Models (LLMs), are rapidly advancing; yet, their performance often plateaus when generating complex, long-form research reports using generic test-time scaling algorithms. Drawing inspiration from the iterative nature of human research, which involves cycles of searching, reasoning, and revision, we propose the Test-Time Diffusion Deep Researcher (TTD-DR). This novel framework conceptualizes research report generation as a diffusion process. TTD-DR initiates this process with a preliminary draft, an updatable skeleton that serves as an evolving foundation to guide the research direction. The draft is then iteratively refined through a "denoising" process, which is dynamically informed by a retrieval mechanism that incorporates external information at each step. The core process is further enhanced by a self-evolutionary algorithm applied to each component of the agentic workflow, ensuring the generation of high-quality context for the diffusion process. This draft-centric design makes the report writing process more timely and coherent while reducing information loss during the iterative search process. We demonstrate that our TTD-DR achieves state-of-the-art results on a wide array of benchmarks that require intensive search and multi-hop reasoning, significantly outperforming existing deep research agents.
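The draft-then-refine loop described above can be illustrated with a minimal sketch. This is not the TTD-DR implementation: `refine_report`, the `TODO:` gap marker, the two-section skeleton, and the dictionary-backed `retrieve` callable are all hypothetical stand-ins chosen for illustration, and the self-evolutionary component is omitted. The sketch only shows the control flow in which the current draft decides what to search for next and each retrieval result is written back into the draft.

```python
from typing import Callable, List


def refine_report(
    question: str,
    retrieve: Callable[[str], str],
    steps: int = 3,
) -> List[str]:
    """Toy draft-centric loop: each step fills the first unresolved section."""
    # Preliminary "noisy" draft: a skeleton whose sections are still unresolved.
    # (Hypothetical two-section layout, assumed for illustration only.)
    draft: List[str] = [
        f"TODO: {question} / background",
        f"TODO: {question} / findings",
    ]
    for _ in range(steps):
        gaps = [i for i, s in enumerate(draft) if s.startswith("TODO: ")]
        if not gaps:
            break  # nothing left to refine
        i = gaps[0]
        query = draft[i][len("TODO: "):]   # the draft guides the next search
        evidence = retrieve(query)          # external information for this step
        draft[i] = f"{query}: {evidence}"   # revise the draft in place
    return draft


# Hypothetical in-memory "search engine" standing in for real retrieval.
corpus = {
    "solar cells / background": "note A",
    "solar cells / findings": "note B",
}
report = refine_report("solar cells", lambda q: corpus.get(q, "no evidence found"))
```

In this toy version a "denoising" step is simply replacing a placeholder with retrieved text. In the framework the abstract describes, an LLM would perform both the query generation and the revision.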
