ChatQA: Building GPT-4 Level Conversational QA Models
January 18, 2024
Authors: Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Mohammad Shoeybi, Bryan Catanzaro
cs.AI
Abstract
In this work, we introduce ChatQA, a family of conversational question
answering (QA) models that obtain GPT-4 level accuracies. Specifically, we
propose a two-stage instruction tuning method that can significantly improve
the zero-shot conversational QA results from large language models (LLMs). To
handle retrieval in conversational QA, we fine-tune a dense retriever on a
multi-turn QA dataset, which provides comparable results to using the
state-of-the-art query rewriting model while largely reducing deployment cost.
Notably, our ChatQA-70B can outperform GPT-4 in terms of average score on 10
conversational QA datasets (54.14 vs. 53.90), without relying on any synthetic
data from OpenAI GPT models.
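The retriever described above is fine-tuned on multi-turn QA data instead of relying on a query-rewriting model, so the dialogue history plus the current question serves directly as the retrieval query. Dense-retriever fine-tuning of this kind typically optimizes a contrastive (InfoNCE-style) objective over passage embeddings. The sketch below illustrates that objective with made-up toy embeddings; the function name, dimensions, and temperature are illustrative assumptions, not details from the paper:

```python
import numpy as np

def info_nce_loss(query_emb, pos_emb, neg_embs, temperature=0.05):
    """Contrastive (InfoNCE) loss for one query: pull the positive
    passage embedding toward the query, push negatives away."""
    def l2_normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    q = l2_normalize(query_emb)
    # Stack candidates with the positive passage at index 0.
    candidates = l2_normalize(np.vstack([pos_emb, neg_embs]))
    logits = candidates @ q / temperature  # scaled cosine similarities
    # Softmax cross-entropy where the correct label is index 0.
    logits -= logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

# Toy example: the "query" stands in for an encoding of the dialogue
# history concatenated with the current question (embeddings are random).
rng = np.random.default_rng(0)
query = rng.normal(size=256)
positive = query + 0.1 * rng.normal(size=256)   # passage near the query
negatives = rng.normal(size=(4, 256))           # unrelated passages

loss = info_nce_loss(query, positive, negatives)
print(f"loss = {loss:.4f}")  # small, since the positive is closest
```

In actual training the embeddings would come from the dense retriever's encoders, and the loss gradient would update the encoder weights; minimizing it ranks the gold passage above the negatives for each multi-turn query.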