OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Abstract

We introduce OpenFlamingo, a family of autoregressive vision-language models ranging from 3B to 9B parameters. OpenFlamingo is an ongoing effort to produce an open-source replication of DeepMind's Flamingo models. On seven vision-language datasets, OpenFlamingo models average between 80% and 89% of the corresponding Flamingo performance. This technical report describes our models, training data, hyperparameters, and evaluation suite. We share our models and code at https://github.com/mlfoundations/open_flamingo.
