Divide-or-Conquer? Which Part Should You Distill Your LLM?

Abstract

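Recent methods have demonstrated that Large Language Models (LLMs) solve reasoning tasks better when they are encouraged to solve subtasks of the main task first. In this paper, we devise a similar strategy that breaks reasoning tasks down into a problem decomposition phase and a problem solving phase, and we show that this strategy can outperform a single-stage solution. Further, we hypothesize that decomposition should be easier to distill into a smaller model than problem solving, because the latter requires large amounts of domain knowledge while the former only requires learning general problem solving strategies. We propose methods to distill these two capabilities and evaluate their impact on reasoning outcomes and inference cost. We find that we can distill the problem decomposition phase and at the same time achieve good generalization across tasks, datasets, and models. However, it is harder to distill the problem solving capability without losing performance, and the resulting distilled model struggles with generalization. These results indicate that by using smaller, distilled problem decomposition models in combination with problem solving LLMs, we can achieve reasoning with cost-efficient inference and local adaptation.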

