REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation
August 7, 2025
Authors: Nameer Hirschkind, Joseph Liu, Mahesh Kumar Nandwana, Xiao Yu
cs.AI
Abstract
Simultaneous Speech Translation (SimulST) systems stream in audio while
simultaneously emitting translated text or speech. Such systems face the
significant challenge of balancing translation quality and latency. We
introduce a strategy to optimize this tradeoff: wait for more input only if you
gain information by doing so. Based on this strategy, we present Regularized
Entropy INformation Adaptation (REINA), a novel loss to train an adaptive
policy using an existing non-streaming translation model. We derive REINA from
information theory principles and show that REINA helps push the reported
Pareto frontier of the latency/quality tradeoff over prior works. Utilizing
REINA, we train a SimulST model on French, Spanish and German, both from and
into English. Training on only open source or synthetically generated data, we
achieve state-of-the-art (SOTA) streaming results for models of comparable
size. We also introduce a metric for streaming efficiency, quantitatively
showing REINA improves the latency/quality trade-off by as much as 21% compared
to prior approaches, normalized against non-streaming baseline BLEU scores.
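The core strategy stated above — emit output now unless waiting for more audio would yield an information gain — can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `entropy` threshold, the READ/WRITE action names, and the shape of the next-token distribution are hypothetical and not taken from the paper, which trains the policy with the REINA loss rather than a fixed threshold.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def decide(next_token_probs, threshold=1.0):
    """Hypothetical adaptive policy: if the translation model is uncertain
    (high entropy) about its next token given the audio received so far,
    READ more audio; otherwise WRITE the token now.
    `threshold` is an illustrative hyperparameter, not from the paper."""
    if entropy(next_token_probs) < threshold:
        return "WRITE"  # model is confident: emit without added latency
    return "READ"       # waiting for more input should reduce uncertainty

# A peaked distribution (confident model) triggers an immediate emission,
# while a near-uniform one (uncertain model) delays for more audio.
print(decide([0.9, 0.05, 0.05]))        # confident -> WRITE
print(decide([0.25, 0.25, 0.25, 0.25])) # uncertain -> READ
```

In the paper itself no hand-set threshold is needed: the REINA loss trains the policy end-to-end on top of an existing non-streaming translation model, so the wait/emit decision is learned rather than tuned.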