FIT: A Large-Scale Dataset for Fit-Aware Virtual Try-On

Abstract

Given a person image and a garment image, virtual try-on (VTO) aims to synthesize a realistic image of the person wearing the garment while preserving their original pose and identity. Although recent VTO methods excel at visualizing garment appearance, they largely overlook a crucial aspect of the try-on experience: the accuracy of garment fit, for example how an extra-large shirt looks on an extra-small person. A key obstacle is the absence of datasets that provide precise garment and body size information, particularly for "ill-fit" cases, where garments are significantly too large or too small. Consequently, current VTO methods default to generating well-fitted results regardless of garment or person size. In this paper, we take the first steps towards solving this open problem. We introduce FIT (Fit-Inclusive Try-on), a large-scale VTO dataset comprising over 1.13M try-on image triplets accompanied by precise body and garment measurements. We overcome the challenges of data collection with a scalable synthetic strategy: (1) we programmatically generate 3D garments using GarmentCode and drape them via physics simulation to capture realistic garment fit; (2) we employ a novel re-texturing framework to transform synthetic renderings into photorealistic images while strictly preserving geometry; (3) we introduce person identity preservation into our re-texturing model to generate paired person images (same person, different garments) for supervised training. Finally, we use the FIT dataset to train a baseline fit-aware virtual try-on model. Our data and results set a new state of the art for fit-aware virtual try-on and offer a robust benchmark for future research. We will make all data and code publicly available on our project page: https://johannakarras.github.io/FIT
