Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model

August 20, 2024
作者: Chenhan Yuan, Fei Huang, Ru Peng, Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou
cs.AI

Abstract

Transformer-based large language models (LLMs) exhibit limitations such as generating unsafe responses, unreliable reasoning, etc. Existing inference intervention approaches attempt to mitigate these issues by fine-tuning additional models to produce calibration signals (such as rewards) that guide the LLM's decoding process. However, this solution introduces substantial time and space overhead due to the separate models required. This work proposes Non-disruptive parameter insertion (Otter), inserting extra parameters into the transformer architecture to predict calibration signals along with the original LLM output. Otter offers state-of-the-art performance on multiple demanding tasks while saving up to 86.5% extra space and 98.5% extra time. Furthermore, Otter seamlessly integrates with existing inference engines, requiring only a one-line code change, and the original model response remains accessible after the parameter insertion. Our code is publicly available at https://github.com/chenhan97/Otter
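The abstract does not give implementation details, but the core idea of predicting a calibration signal alongside tokens can be illustrated with a minimal sketch. Note the assumptions: Otter inserts parameters inside the transformer itself, whereas this sketch approximates the idea with a single added reward head on top of frozen hidden states; the class name, `reward_head`, and the Hugging Face-style interface (a base LM that returns `logits` and `hidden_states`) are illustrative choices, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class OtterStyleLM(nn.Module):
    """Illustrative sketch (not the official Otter code): wrap a frozen causal LM
    and add a small, separately trained head that predicts a calibration signal
    (e.g., a reward) from the same hidden states used for next-token prediction."""

    def __init__(self, base_lm: nn.Module, hidden_size: int):
        super().__init__()
        self.base_lm = base_lm                         # original LLM, left untouched
        for p in self.base_lm.parameters():
            p.requires_grad = False                    # original responses stay accessible
        self.reward_head = nn.Linear(hidden_size, 1)   # inserted parameters (assumed form)

    def forward(self, input_ids, attention_mask=None):
        # Assumes a Hugging Face-style causal LM output with .logits and .hidden_states
        out = self.base_lm(input_ids,
                           attention_mask=attention_mask,
                           output_hidden_states=True)
        hidden = out.hidden_states[-1]                   # (batch, seq_len, hidden_size)
        reward = self.reward_head(hidden).squeeze(-1)    # per-token calibration signal
        return out.logits, reward                        # tokens and rewards, side by side
```

At decoding time, such a per-token reward could be used to rescore or steer candidate continuations, while the frozen base model's logits still yield the unmodified response.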
