h2oGPT: Democratizing Large Language Models

June 13, 2023
Authors: Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Prithvi Prabhu, Jeff Gambera, Mark Landry, Shivam Bansal, Ryan Chesler, Chun Ming Lee, Marcos V. Conde, Pasha Stetsenko, Olivier Grellier, SriSatish Ambati
cs.AI

Abstract

Foundation Large Language Models (LLMs) such as GPT-4 represent a revolution in AI due to their real-world applications through natural language processing. However, they also pose many significant risks, such as the presence of biased, private, or harmful text and the unauthorized inclusion of copyrighted material. We introduce h2oGPT, a suite of open-source code repositories for the creation and use of LLMs based on Generative Pretrained Transformers (GPTs). The goal of this project is to create the world's best truly open-source alternative to closed-source GPTs. In collaboration with and as part of the incredible and unstoppable open-source community, we open-source several fine-tuned h2oGPT models from 7 to 40 billion parameters, ready for commercial use under fully permissive Apache 2.0 licenses. Included in our release is 100% private document search using natural language. Open-source language models help boost AI development and make it more accessible and trustworthy. They lower entry hurdles, allowing people and groups to tailor these models to their needs. This openness increases innovation, transparency, and fairness. An open-source strategy is needed to share AI benefits fairly, and H2O.ai will continue to democratize AI and LLMs.