Advancing Open-source World Models

Abstract

We present LingBot-World, an open-source world simulator built on video generation. Positioned as a top-tier world model, LingBot-World offers the following features. (1) It maintains high fidelity and robust dynamics across a broad spectrum of environments, including realistic, scientific, and cartoon-style settings, among others. (2) It supports minute-level generation horizons while preserving contextual consistency over time, a property also known as "long-term memory". (3) It supports real-time interactivity, achieving a latency of under 1 second when producing 16 frames per second. We provide public access to the code and model in an effort to narrow the divide between open-source and closed-source technologies. We believe our release will empower the community with practical applications across areas like content creation, gaming, and robot learning.
