Do GPTs Produce Less Literal Translations?

Abstract

Large Language Models (LLMs) such as GPT-3 have emerged as general-purpose language models capable of addressing many natural language generation or understanding tasks. On the task of Machine Translation (MT), multiple works have investigated few-shot prompting mechanisms to elicit better translations from LLMs. However, there has been relatively little investigation into how such translations differ qualitatively from those generated by standard Neural Machine Translation (NMT) models. In this work, we investigate these differences in terms of the literalness of the translations produced by the two systems. Using literalness measures involving word alignment and monotonicity, we find that translations out of English (E-X) from GPTs tend to be less literal, while exhibiting similar or better scores on MT quality metrics. We demonstrate that this finding is borne out in human evaluations as well. We then show that these differences are especially pronounced when translating sentences that contain idiomatic expressions.
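
To make the idea of alignment-based literalness measures concrete, the following is a minimal sketch and not the paper's exact definitions. It assumes a word alignment is already available as a list of (source index, target index) pairs from some external aligner. The function names `unaligned_source_count` and `crossing_pairs` are hypothetical and chosen for illustration only. The first counts source words with no aligned target word; the second counts pairs of alignment links that cross, which is one simple proxy for non-monotonic word order.

```python
from typing import List, Set, Tuple

# An alignment is a list of (source_index, target_index) links.
# Assumption: these links come from an external word aligner.
Alignment = List[Tuple[int, int]]


def unaligned_source_count(num_source_words: int, alignment: Alignment) -> int:
    """Count source words that have no link to any target word."""
    aligned: Set[int] = {s for s, _ in alignment}
    return num_source_words - len(aligned)


def crossing_pairs(alignment: Alignment) -> int:
    """Count pairs of links that cross: source order and target order disagree."""
    links = sorted(set(alignment))
    crossings = 0
    for i in range(len(links)):
        for j in range(i + 1, len(links)):
            (s1, t1), (s2, t2) = links[i], links[j]
            if s1 < s2 and t1 > t2:
                crossings += 1
    return crossings


if __name__ == "__main__":
    # Hypothetical example: 4 source words, the last one left unaligned,
    # and source words 1 and 2 swapped in the target.
    example = [(0, 0), (1, 2), (2, 1)]
    print(unaligned_source_count(4, example))
    print(crossing_pairs(example))
```

Under these illustrative definitions, a translation with more unaligned source words or more crossing links would be read as less literal; a fully monotonic, fully aligned translation scores zero on both.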
