Chain-of-Dictionary Prompting Elicits Translation in Large Language Models

Abstract

Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT), even when trained without parallel data. Yet despite the enormous amount of training data, they still struggle to translate rare words, particularly for low-resource languages. Worse, it is usually unrealistic to retrieve relevant demonstrations for in-context learning with low-resource languages, which restricts the practical use of LLMs for translation. How should we mitigate this problem? To this end, we present a novel method, Chain-of-Dictionary prompting (CoD), which augments LLMs with prior knowledge in the form of chains of multilingual dictionaries for a subset of the input words, in order to elicit their translation abilities. Extensive experiments indicate that augmenting ChatGPT with CoD elicits large gains, up to 13x in ChrF++ for MNMT (3.08 to 42.63 for English to Serbian written in Cyrillic script), on the full FLORES-200 devtest set. We further demonstrate the importance of chaining the multilingual dictionaries, as well as the superiority of CoD over few-shot demonstrations for low-resource languages.
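
To make the idea of chaining dictionary entries in a prompt concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the prompt template, so the "means" wording, the helper names (build_chain, build_cod_prompt), the choice of auxiliary languages, and the toy dictionary TOY_DICT are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy dictionary: source word -> {language name -> translation}.
# The entries are illustrative only and are not taken from the paper.
TOY_DICT: Dict[str, Dict[str, str]] = {
    "cat": {"Serbian": "мачка", "French": "chat", "German": "Katze"},
    "water": {"Serbian": "вода", "French": "eau", "German": "Wasser"},
}


def build_chain(word: str, entries: Dict[str, str],
                target: str, auxiliaries: List[str]) -> str:
    """Chain one word's translations: target language first, then auxiliaries."""
    parts = [f'"{word}"']
    for lang in [target, *auxiliaries]:
        if lang in entries:  # skip languages the dictionary does not cover
            parts.append(f'"{entries[lang]}" in {lang}')
    return " means ".join(parts) + "."


def build_cod_prompt(sentence: str, dictionary: Dict[str, Dict[str, str]],
                     target: str, auxiliaries: List[str],
                     source: str = "English") -> str:
    """Prepend chained dictionary hints for the covered subset of input words."""
    hints: List[str] = []
    seen = set()
    for token in sentence.lower().split():
        word = token.strip(".,;:!?\"'")  # naive punctuation stripping
        if word in dictionary and word not in seen:
            seen.add(word)
            hints.append(build_chain(word, dictionary[word], target, auxiliaries))
    # Assumed instruction wording; the real template may differ.
    instruction = (f"Translate the following text from {source} "
                   f"into {target}: {sentence}")
    return "\n".join(hints + [instruction])


if __name__ == "__main__":
    print(build_cod_prompt("The cat drinks water.", TOY_DICT,
                           "Serbian", ["French", "German"]))
```

Only words found in the dictionary receive a chain, which mirrors the abstract's "subset of input words"; the remaining words are left for the model to translate unaided.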
