AutoTrain: No-code training for state-of-the-art models

Abstract

With the advancements in open-source models, training (or finetuning) models on custom datasets has become a crucial part of developing solutions tailored to specific industrial or open-source applications. Yet there is no single tool that simplifies the process of training across different types of modalities or tasks. We introduce AutoTrain (aka AutoTrain Advanced), an open-source, no-code tool/library that can be used to train (or finetune) models for different kinds of tasks, such as large language model (LLM) finetuning, text classification/regression, token classification, sequence-to-sequence tasks, finetuning of sentence transformers, visual language model (VLM) finetuning, image classification/regression, and even classification and regression tasks on tabular data. AutoTrain Advanced is an open-source library providing best practices for training models on custom datasets. The library is available at https://github.com/huggingface/autotrain-advanced. AutoTrain can be used in fully local mode or on cloud machines and works with tens of thousands of models shared on the Hugging Face Hub and their variations.

