The Aloe Family Recipe for Open and Specialized Healthcare LLMs

May 7, 2025
作者: Dario Garcia-Gasulla, Jordi Bayarri-Planas, Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés
cs.AI

Abstract

Purpose: With advancements in Large Language Models (LLMs) for healthcare, the need arises for competitive open-source models to protect the public interest. This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training, while showing how to improve model safety (through DPO) and efficacy (through RAG). The evaluation methodology used, which includes four different types of tests, defines a new standard for the field. The resultant models, shown to be competitive with the best private alternatives, are released with a permissive license.

Methods: Building on top of strong base models like Llama 3.1 and Qwen 2.5, Aloe Beta uses a custom dataset to enhance public data with synthetic Chain of Thought examples. The models undergo alignment with Direct Preference Optimization, emphasizing ethical and policy-aligned performance in the presence of jailbreaking attacks. Evaluation includes closed-ended, open-ended, safety, and human assessments, to maximize the reliability of results.

Results: Recommendations are made across the entire pipeline, backed by the solid performance of the Aloe Family. These models deliver competitive performance across healthcare benchmarks and medical fields, and are often preferred by healthcare professionals. On bias and toxicity, the Aloe Beta models significantly improve safety, showing resilience to unseen jailbreaking attacks. For a responsible release, a detailed risk assessment specific to healthcare is attached to the Aloe Family models.

Conclusion: The Aloe Beta models, and the recipe that leads to them, are a significant contribution to the open-source medical LLM field, offering top-of-the-line performance while maintaining high ethical requirements. This work sets a new standard for developing and reporting aligned LLMs in healthcare.
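For reference, the Direct Preference Optimization alignment named in the Methods section optimizes, in its standard form, the following objective over preference pairs of a chosen response $y_w$ and a rejected response $y_l$ for a prompt $x$; this is the general DPO loss, not an Aloe-specific variant:

```latex
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}})
= -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
\left[ \log \sigma\!\left(
  \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)}
  - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}
\right) \right]
```

Here $\pi_\theta$ is the policy being trained, $\pi_{\text{ref}}$ a frozen reference model, $\sigma$ the logistic function, and $\beta$ a temperature controlling the strength of the implicit KL constraint.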
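The RAG component mentioned in the abstract can be illustrated with a minimal prompt-assembly sketch. This is a toy, not the Aloe pipeline: a real system would use a learned embedding model and a vector index rather than word-overlap scoring, and the corpus snippets here are invented for illustration.

```python
# Toy sketch of retrieval-augmented generation (RAG) prompt assembly:
# score passages against the query, keep the top-k, and prepend them
# as context before the question.
from collections import Counter
import math


def score(query: str, doc: str) -> float:
    """Cosine similarity over simple word-count vectors (toy retriever)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[w] * d[w] for w in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def build_rag_prompt(query: str, corpus: list[str], k: int = 2) -> str:
    """Retrieve the top-k passages and prepend them as context."""
    top = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]
    context = "\n".join(f"- {p}" for p in top)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


corpus = [
    "Metformin is a first-line treatment for type 2 diabetes.",
    "Aspirin inhibits platelet aggregation.",
    "Insulin therapy is used when oral agents fail in type 2 diabetes.",
]
prompt = build_rag_prompt("What is first-line therapy for type 2 diabetes?", corpus)
```

The generated prompt is then passed to the LLM, grounding its answer in the retrieved passages instead of parametric memory alone.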

