Instruction-tuned Language Models are Better Knowledge Learners
February 20, 2024
Authors: Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodriguez, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srinivasan Iyer
cs.AI
Abstract
In order for large language model (LLM)-based assistants to effectively adapt
to evolving information needs, it must be possible to update their factual
knowledge through continued training on new data. The standard recipe for doing
so involves continued pre-training on new documents followed by
instruction-tuning on question-answer (QA) pairs. However, we find that LLMs
trained with this recipe struggle to answer questions, even though the
perplexity of documents is minimized. We find that QA pairs are generally
straightforward, while documents are more complex, weaving many factual
statements together in an intricate manner. Therefore, we hypothesize that it
is beneficial to expose LLMs to QA pairs before continued pre-training on
documents so that the process of encoding knowledge from complex documents
takes into account how this knowledge is accessed through questions. Based on
this, we propose pre-instruction-tuning (PIT), a method that instruction-tunes
on questions prior to training on documents. This contrasts with standard
instruction-tuning, which learns how to extract knowledge after training on
documents. Extensive experiments and ablation studies demonstrate that PIT
significantly enhances the ability of LLMs to absorb knowledge from new
documents, outperforming standard instruction-tuning by 17.8%.
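The core difference between the standard recipe and PIT is simply the order of the two training phases. The sketch below illustrates that ordering in plain Python; the model representation and the `train()` step are hypothetical stand-ins for an actual optimization loop, not the paper's implementation.

```python
# Minimal sketch contrasting the two training orders described in the
# abstract. The model dict, corpora, and train() step are hypothetical
# stand-ins, not the paper's actual training code.

def train(model, corpus, phase):
    """Record one training phase (stand-in for a real optimization loop)."""
    model["phases"].append(phase)
    return model

def standard_recipe(model, documents, qa_pairs):
    # Continued pre-training on new documents, THEN instruction-tuning on QA.
    model = train(model, documents, "continued-pretraining")
    model = train(model, qa_pairs, "instruction-tuning")
    return model

def pre_instruction_tuning(model, documents, qa_pairs):
    # PIT: expose the model to QA pairs BEFORE the documents, so that
    # encoding knowledge from documents can account for how that
    # knowledge is later accessed through questions.
    model = train(model, qa_pairs, "pre-instruction-tuning")
    model = train(model, documents, "continued-pretraining")
    return model

docs = ["new document 1", "new document 2"]
qas = [("question", "answer")]

m1 = standard_recipe({"phases": []}, docs, qas)
m2 = pre_instruction_tuning({"phases": []}, docs, qas)
print(m1["phases"])  # ['continued-pretraining', 'instruction-tuning']
print(m2["phases"])  # ['pre-instruction-tuning', 'continued-pretraining']
```

In a real setup each `train()` call would be a full fine-tuning run over the respective corpus; the point is only that PIT swaps which corpus the model sees first.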