OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Abstract

We introduce OpenFlamingo, a family of autoregressive vision-language models ranging from 3B to 9B parameters. OpenFlamingo is an ongoing effort to produce an open-source replication of DeepMind's Flamingo models. On seven vision-language datasets, OpenFlamingo models achieve, on average, between 80% and 89% of the performance of the corresponding Flamingo models. This technical report describes our models, training data, hyperparameters, and evaluation suite. We share our models and code at https://github.com/mlfoundations/open_flamingo.
