Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

Abstract

With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative. For example, a company deploying a client-facing chatbot must ensure that the model will not respond to client requests with profanity. Current evaluations approach this problem using small, domain-specific datasets with human-curated labels. These evaluation sets are often sampled from a narrow and simplified distribution, and data sources can unknowingly be leaked into the training set, which can lead to misleading evaluations. To bypass these drawbacks, we propose a framework for self-supervised evaluation of LLMs by analyzing their sensitivity or invariance to transformations on the input text. Self-supervised evaluation can directly monitor LLM behavior on datasets collected in the wild or streamed during live model deployment. We demonstrate self-supervised evaluation strategies for measuring closed-book knowledge, toxicity, and long-range context dependence, in addition to sensitivity to grammatical structure and tokenization errors. When comparisons to similar human-labeled benchmarks are available, we find strong correlations between self-supervised and human-supervised evaluations. The self-supervised paradigm complements current evaluation strategies that rely on labeled data.
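The idea of measuring sensitivity or invariance to an input transformation can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: `sensitivity_score`, the toy scorer `toy_model_score` (a stand-in for a real model quantity such as perplexity), and the transformations `insert_negation` and `to_upper`. The paper's actual metrics and transformations may differ.

```python
from statistics import mean
from typing import Callable, Iterable


def sensitivity_score(
    texts: Iterable[str],
    transform: Callable[[str], str],
    model_score: Callable[[str], float],
) -> float:
    """Mean absolute change in a model score when each text is transformed.

    No labels are needed: each text is compared only with its own
    transformed version. A value near 0 suggests invariance to the
    transformation; a larger value suggests sensitivity.
    """
    diffs = [abs(model_score(t) - model_score(transform(t))) for t in texts]
    if not diffs:
        raise ValueError("texts must contain at least one example")
    return mean(diffs)


def toy_model_score(text: str) -> float:
    # Hypothetical stand-in for a model output: the fraction of words
    # that are "not". A real setup would query an LLM instead.
    words = text.split()
    return sum(w.lower() == "not" for w in words) / max(len(words), 1)


def insert_negation(text: str) -> str:
    # Hypothetical transformation: negate the first " is " in the text.
    return text.replace(" is ", " is not ", 1)


def to_upper(text: str) -> str:
    # Hypothetical transformation that the toy scorer should ignore.
    return text.upper()


corpus = ["the sky is blue", "water is wet"]
print("negation sensitivity:", sensitivity_score(corpus, insert_negation, toy_model_score))
print("uppercase sensitivity:", sensitivity_score(corpus, to_upper, toy_model_score))
```

Under these assumptions, the toy scorer reacts to inserted negations and is unaffected by casing. This mirrors the sensitivity versus invariance distinction drawn in the abstract, with unlabeled text as the only input.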
