Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective

Abstract

We propose a novel framework for understanding the reasoning capabilities of large language models (LLMs) from the perspective of meta-learning. By conceptualizing reasoning trajectories as pseudo-gradient descent updates to the LLM's parameters, we identify parallels between LLM reasoning and various meta-learning paradigms. We formalize the training process for reasoning tasks as a meta-learning setup in which each question is treated as an individual task and its reasoning trajectories serve as the inner-loop optimization that adapts the model parameters. Once trained on a diverse set of questions, the LLM develops fundamental reasoning capabilities that generalize to previously unseen questions. Extensive empirical evaluations substantiate the strong connection between LLM reasoning and meta-learning, and we explore several questions of significant interest from a meta-learning standpoint. Our work not only enhances the understanding of LLM reasoning but also provides practical insights for improving these models through established meta-learning techniques.
