TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

Abstract

While Large Language Models (LLMs) have enabled AI research agents to perform isolated scientific tasks, automating complex real-world workflows such as LLM training remains a significant challenge. In this paper, we introduce TREX, a multi-agent system that automates the entire LLM training lifecycle. By orchestrating collaboration between two core modules, the Researcher and the Executor, the system performs requirement analysis, open-domain literature and data research, formulation of training strategies, preparation of data recipes, and model training and evaluation. The multi-round experimental process is modeled as a search tree, enabling the system to efficiently plan exploration paths, reuse historical results, and distill high-level insights from iterative trials. To evaluate the capability of automated LLM training, we construct FT-Bench, a benchmark comprising 10 tasks derived from real-world scenarios, ranging from optimizing fundamental model capabilities to enhancing performance on domain-specific tasks. Experimental results demonstrate that the TREX agent consistently optimizes model performance on target tasks.
