

Generative Representational Instruction Tuning

February 15, 2024
Authors: Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela
cs.AI

Abstract

All text-based language problems can be reduced to either generation or embedding. Current models only perform well at one or the other. We introduce generative representational instruction tuning (GRIT), whereby a large language model is trained to handle both generative and embedding tasks by distinguishing between them through instructions. Compared to other open models, our resulting GritLM 7B sets a new state of the art on the Massive Text Embedding Benchmark (MTEB) and outperforms all models up to its size on a range of generative tasks. By scaling up further, GritLM 8x7B outperforms all open generative language models that we tried while still being among the best embedding models. Notably, we find that GRIT matches training on only generative or embedding data; thus, we can unify both at no performance loss. Among other benefits, the unification via GRIT speeds up Retrieval-Augmented Generation (RAG) by > 60% for long documents, by no longer requiring separate retrieval and generation models. Models, code, etc. are freely available at https://github.com/ContextualAI/gritlm.
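Because a single GRIT model handles both tasks, a retrieve-then-generate pipeline needs only one checkpoint. The sketch below illustrates this dual use; the `GritLM` class, its `encode`/`generate` methods, and the `<|user|>`/`<|embed|>` prompt format are assumptions taken from the linked repository's README and may differ between package versions.

```python
# Minimal sketch of dual-mode inference with one GRIT model, assuming the
# `gritlm` package API described in the repository README (pip install gritlm).
import numpy as np
from scipy.spatial.distance import cosine
from gritlm import GritLM

# One checkpoint serves as both the embedder and the generator.
model = GritLM("GritLM/GritLM-7B", torch_dtype="auto")

def embed_instruction(task: str) -> str:
    # Per the README, embedding prompts end in the <|embed|> token;
    # generation uses the ordinary chat template instead.
    return f"<|user|>\n{task}\n<|embed|>\n" if task else "<|embed|>\n"

# Embedding mode: encode documents (no instruction) and a query (with one).
docs = [
    "GRIT unifies generative and embedding training through instructions.",
    "Mean pooling averages token representations into one vector.",
]
task = "Retrieve passages relevant to the question"
d_rep = model.encode(docs, instruction=embed_instruction(""))
q_rep = model.encode(["What is GRIT?"], instruction=embed_instruction(task))

# Rank documents by cosine similarity and keep the best match.
sims = [1 - cosine(q_rep[0], d) for d in d_rep]
best_doc = docs[int(np.argmax(sims))]

# Generative mode: answer the question grounded in the retrieved document.
messages = [{"role": "user", "content": f"Context: {best_doc}\n\nWhat is GRIT?"}]
encoded = model.tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(encoded, max_new_tokens=128, do_sample=False)
print(model.tokenizer.decode(out[0]))
```

In a conventional RAG stack the retriever and the generator are separate models; here the same weights serve both roles, which is what enables the reported > 60% speedup for long documents.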