Dissecting In-Context Learning of Translations in GPTs

Abstract

Most of the recent work in leveraging Large Language Models (LLMs) such as GPT-3 for Machine Translation (MT) has focused on selecting the few-shot samples for prompting. In this work, we try to better understand the role of demonstration attributes for the in-context learning of translations through perturbations of high-quality, in-domain demonstrations. We find that asymmetric perturbation of the source-target mappings yields vastly different results. Perturbing the source side has surprisingly little impact, while perturbing the target side can drastically reduce translation quality, suggesting that it is the output text distribution that provides the most important learning signal during in-context learning of translations. We propose a method named Zero-Shot-Context to add this signal automatically in zero-shot prompting. We demonstrate that it improves upon the zero-shot translation performance of GPT-3, even making it competitive with few-shot prompted translations.
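The asymmetric perturbation setup described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the perturbations or the prompt template, so everything here is an assumption made for illustration: word-order shuffling as the perturbation, the `English:`/`German:` template, and the hypothetical helper names `perturb_word_order` and `build_prompt`.

```python
import random


def perturb_word_order(sentence, rng):
    """Shuffle the words of a sentence (an illustrative perturbation only)."""
    words = sentence.split()
    rng.shuffle(words)
    return " ".join(words)


def build_prompt(demos, test_source, perturb=None, seed=0):
    """Build a few-shot translation prompt from (source, target) demonstrations.

    perturb: None leaves demonstrations intact, "source" perturbs only the
    source side, "target" perturbs only the target side.
    The English/German labels are an assumed template, not the paper's.
    """
    rng = random.Random(seed)
    blocks = []
    for src, tgt in demos:
        if perturb == "source":
            src = perturb_word_order(src, rng)
        elif perturb == "target":
            tgt = perturb_word_order(tgt, rng)
        blocks.append(f"English: {src}\nGerman: {tgt}")
    # The test input is never perturbed; the model completes the last line.
    blocks.append(f"English: {test_source}\nGerman:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demos = [("the cat sat on the mat", "die Katze saß auf der Matte")]
    for mode in (None, "source", "target"):
        print(f"--- perturb={mode} ---")
        print(build_prompt(demos, "hello world", perturb=mode))
```

Comparing translation quality across the three prompt variants is the kind of experiment the abstract summarizes: under its findings, the `"source"` variant would be expected to hurt little, and the `"target"` variant to hurt much more.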
