System Prompt Optimization with Meta-Learning

Abstract

Large Language Models (LLMs) have shown remarkable capabilities, and optimizing their input prompts plays a pivotal role in maximizing their performance. LLM prompts consist of both task-agnostic system prompts and task-specific user prompts. However, existing work on prompt optimization has focused on user prompts specific to individual queries or tasks, and has largely overlooked the system prompt, which, once optimized, is applicable across different tasks and domains. Motivated by this, we introduce the novel problem of bilevel system prompt optimization, whose objective is to design system prompts that are robust to diverse user prompts and transferable to unseen tasks. To tackle this problem, we propose a meta-learning framework that meta-learns the system prompt by optimizing it over various user prompts across multiple datasets, while simultaneously updating the user prompts iteratively to ensure synergy between the two. We conduct experiments on 14 unseen datasets spanning 5 different domains and show that our approach produces system prompts that generalize effectively to diverse user prompts. Our findings also reveal that the optimized system prompt enables rapid adaptation even to unseen tasks, requiring fewer optimization steps for test-time user prompts while achieving improved performance.
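
The alternating structure described in the abstract (update each task's user prompt for the current system prompt, then update the shared system prompt across all tasks) can be illustrated with a toy sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the `score` function is a keyword-overlap stand-in for running an LLM and grading its answers, the candidate prompts are fixed lists rather than LLM-generated revisions, and the names `inner_step`, `outer_step`, and `meta_optimize` are hypothetical.

```python
from statistics import mean

# Toy tasks (assumption): each task lists keywords that the combined
# system + user prompt must contain to count as "solving" the task.
TASKS = {
    "math": {"step", "verify", "number"},
    "law": {"step", "verify", "cite"},
}

# Fixed candidate pools (assumption): a real system would generate
# candidates with an LLM instead of choosing from hard-coded lists.
SYSTEM_CANDIDATES = [
    "You are helpful.",
    "Think step by step and verify.",
    "Always cite and use a number.",
]
USER_CANDIDATES = {
    "math": ["Solve it.", "Report the final number."],
    "law": ["Answer it.", "Please cite the statute."],
}


def score(system_prompt, user_prompt, required):
    """Stand-in for LLM evaluation: fraction of required keywords present."""
    text = (system_prompt + " " + user_prompt).lower().replace(".", "")
    return len(required & set(text.split())) / len(required)


def inner_step(system_prompt, task):
    """Inner level: pick the best user prompt for one task, system prompt fixed."""
    return max(
        USER_CANDIDATES[task],
        key=lambda u: score(system_prompt, u, TASKS[task]),
    )


def outer_step(user_prompts):
    """Outer level: pick the system prompt with the best average score over tasks."""
    return max(
        SYSTEM_CANDIDATES,
        key=lambda s: mean(score(s, user_prompts[t], TASKS[t]) for t in TASKS),
    )


def meta_optimize(rounds=3):
    """Alternate inner and outer updates for a fixed number of rounds."""
    system_prompt = SYSTEM_CANDIDATES[0]
    user_prompts = {t: USER_CANDIDATES[t][0] for t in TASKS}
    for _ in range(rounds):
        user_prompts = {t: inner_step(system_prompt, t) for t in TASKS}
        system_prompt = outer_step(user_prompts)
    return system_prompt, user_prompts


if __name__ == "__main__":
    best_system, best_users = meta_optimize()
    print(best_system)
    print(best_users)
```

In this toy setup the selected system prompt carries the instructions shared by both tasks, while each user prompt supplies only the task-specific piece, which mirrors the division of labor between task-agnostic and task-specific prompts that the abstract describes.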
