ChatQA: Building GPT-4 Level Conversational QA Models
January 18, 2024
Authors: Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Mohammad Shoeybi, Bryan Catanzaro
cs.AI
Abstract
In this work, we introduce ChatQA, a family of conversational question
answering (QA) models that obtain GPT-4 level accuracies. Specifically, we
propose a two-stage instruction tuning method that can significantly improve
the zero-shot conversational QA results from large language models (LLMs). To
handle retrieval in conversational QA, we fine-tune a dense retriever on a
multi-turn QA dataset, which provides comparable results to using the
state-of-the-art query rewriting model while largely reducing deployment cost.
Notably, our ChatQA-70B can outperform GPT-4 in terms of average score on 10
conversational QA datasets (54.14 vs. 53.90), without relying on any synthetic
data from OpenAI GPT models.
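The retriever described above is fine-tuned on multi-turn QA data instead of relying on a query-rewriting model, so the dialogue history plus the current question serves directly as the retrieval query. Dense-retriever fine-tuning of this kind typically optimizes a contrastive (InfoNCE-style) objective over passage embeddings. The sketch below illustrates that objective with made-up toy embeddings; the function name, dimensions, and temperature are illustrative assumptions, not details from the paper:

```python
import numpy as np

def info_nce_loss(query_emb, pos_emb, neg_embs, temperature=0.05):
    """Contrastive (InfoNCE) loss for one query: pull the positive
    passage embedding toward the query, push negatives away."""
    def l2_normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    q = l2_normalize(query_emb)
    # Stack candidates with the positive passage at index 0.
    candidates = l2_normalize(np.vstack([pos_emb, neg_embs]))
    logits = candidates @ q / temperature  # scaled cosine similarities
    # Softmax cross-entropy where the correct label is index 0.
    logits -= logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

# Toy example: the "query" stands in for an encoding of the dialogue
# history concatenated with the current question (embeddings are random).
rng = np.random.default_rng(0)
query = rng.normal(size=256)
positive = query + 0.1 * rng.normal(size=256)   # passage near the query
negatives = rng.normal(size=(4, 256))           # unrelated passages

loss = info_nce_loss(query, positive, negatives)
print(f"loss = {loss:.4f}")  # small, since the positive is closest
```

In actual training the embeddings would come from the dense retriever's encoders, and the loss gradient would update the encoder weights; minimizing it ranks the gold passage above the negatives for each multi-turn query.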