Platypus: Quick, Cheap, and Powerful Refinement of LLMs
August 14, 2023
Authors: Ariel N. Lee, Cole J. Hunter, Nataniel Ruiz
cs.AI
Abstract
We present Platypus, a family of fine-tuned and merged Large Language Models (LLMs) that achieves the strongest performance and stands at first place on HuggingFace's Open LLM Leaderboard as of this work's release date. In this work we describe (1) our curated dataset Open-Platypus, a subset of other open datasets, which we release to the public; (2) our process of fine-tuning and merging LoRA modules in order to conserve the strong prior of pretrained LLMs while bringing specific domain knowledge to the surface; and (3) our efforts in checking for test data leaks and contamination in the training data, which can inform future research. Specifically, the Platypus family achieves strong performance in quantitative LLM metrics across model sizes, topping the global Open LLM Leaderboard while using just a fraction of the fine-tuning data and overall compute required by other state-of-the-art fine-tuned LLMs. In particular, a 13B Platypus model can be trained on a single A100 GPU using 25k questions in 5 hours. This is a testament to the quality of our Open-Platypus dataset and opens opportunities for further improvements in the field. Project page: https://platypus-llm.github.io
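
The abstract describes fine-tuning LoRA modules on a base LLM and then merging them, so that the pretrained prior is preserved while domain knowledge is surfaced. The sketch below illustrates that general workflow with the Hugging Face `peft` library; the base checkpoint, LoRA rank, and target modules are illustrative assumptions, not the paper's reported settings.

```python
# Minimal sketch (assumptions, not the authors' exact pipeline): LoRA fine-tuning
# of a base LLM, then merging the adapter weights back into the base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-13b-hf"           # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA freezes the pretrained weights and trains small low-rank adapters,
# which is how the strong prior of the base model is conserved.
lora_cfg = LoraConfig(
    r=16,                                    # assumed rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],     # assumed target projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()           # only a small fraction is trainable

# ... fine-tune with a standard supervised training loop on Open-Platypus ...

# After training, the adapter can be merged back into the base weights so the
# result is a plain, standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("platypus-13b-merged")
```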
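
The abstract also mentions checking for test data leaks and contamination between the training set and benchmark data. The abstract does not specify the method used; the snippet below is one plausible, hypothetical approach based on n-gram overlap, with the n-gram size and threshold chosen only for illustration.

```python
# Illustrative contamination check (assumed method): flag training questions
# whose token n-grams overlap heavily with any benchmark question.
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(train_q: str, test_qs: list[str], threshold: float = 0.5) -> bool:
    train_grams = ngrams(train_q)
    if not train_grams:
        return False
    for test_q in test_qs:
        overlap = len(train_grams & ngrams(test_q)) / len(train_grams)
        if overlap >= threshold:
            return True
    return False

train_questions = ["..."]   # e.g., prompts from Open-Platypus
test_questions = ["..."]    # e.g., prompts from Open LLM Leaderboard benchmarks
flagged = [q for q in train_questions if is_contaminated(q, test_questions)]
```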