Typhoon T1: An Open Thai Reasoning Model

Abstract

This paper introduces Typhoon T1, an open effort to develop an open Thai reasoning model. A reasoning model is a relatively new type of generative model built on top of large language models (LLMs). It generates a long chain of thought before arriving at a final answer, an approach that has been found to improve performance on complex tasks. However, details on developing such models are limited, especially for reasoning models that can generate traces in a low-resource language. Typhoon T1 is an open effort that details how to develop a reasoning model more cost-effectively by using supervised fine-tuning on open datasets instead of reinforcement learning. We share the details of our synthetic data generation and training, along with our dataset and model weights. We also provide insights gained from developing a reasoning model that generalizes across domains and can generate reasoning traces in a low-resource language, using Thai as an example. We hope this open effort provides a foundation for further research in this field.
