Instruction-tuned Language Models are Better Knowledge Learners
February 20, 2024
Authors: Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodriguez, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srinivasan Iyer
cs.AI
Abstract
In order for large language model (LLM)-based assistants to effectively adapt
to evolving information needs, it must be possible to update their factual
knowledge through continued training on new data. The standard recipe for doing
so involves continued pre-training on new documents followed by
instruction-tuning on question-answer (QA) pairs. However, we find that LLMs
trained with this recipe struggle to answer questions, even though the
perplexity of documents is minimized. We find that QA pairs are generally
straightforward, while documents are more complex, weaving many factual
statements together in an intricate manner. Therefore, we hypothesize that it
is beneficial to expose LLMs to QA pairs before continued pre-training on
documents so that the process of encoding knowledge from complex documents
takes into account how this knowledge is accessed through questions. Based on
this, we propose pre-instruction-tuning (PIT), a method that instruction-tunes
on questions prior to training on documents. This contrasts with standard
instruction-tuning, which learns how to extract knowledge after training on
documents. Extensive experiments and ablation studies demonstrate that PIT
significantly enhances the ability of LLMs to absorb knowledge from new
documents, outperforming standard instruction-tuning by 17.8%.
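The core difference between the standard recipe and PIT is simply the order of the two training phases. The sketch below illustrates that ordering in plain Python; the model representation and the `train()` step are hypothetical stand-ins for an actual optimization loop, not the paper's implementation.

```python
# Minimal sketch contrasting the two training orders described in the
# abstract. The model dict, corpora, and train() step are hypothetical
# stand-ins, not the paper's actual training code.

def train(model, corpus, phase):
    """Record one training phase (stand-in for a real optimization loop)."""
    model["phases"].append(phase)
    return model

def standard_recipe(model, documents, qa_pairs):
    # Continued pre-training on new documents, THEN instruction-tuning on QA.
    model = train(model, documents, "continued-pretraining")
    model = train(model, qa_pairs, "instruction-tuning")
    return model

def pre_instruction_tuning(model, documents, qa_pairs):
    # PIT: expose the model to QA pairs BEFORE the documents, so that
    # encoding knowledge from documents can account for how that
    # knowledge is later accessed through questions.
    model = train(model, qa_pairs, "pre-instruction-tuning")
    model = train(model, documents, "continued-pretraining")
    return model

docs = ["new document 1", "new document 2"]
qas = [("question", "answer")]

m1 = standard_recipe({"phases": []}, docs, qas)
m2 = pre_instruction_tuning({"phases": []}, docs, qas)
print(m1["phases"])  # ['continued-pretraining', 'instruction-tuning']
print(m2["phases"])  # ['pre-instruction-tuning', 'continued-pretraining']
```

In a real setup each `train()` call would be a full fine-tuning run over the respective corpus; the point is only that PIT swaps which corpus the model sees first.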