FitMe: Deep Photorealistic 3D Morphable Model Avatars

Abstract

In this paper, we introduce FitMe, a facial reflectance model and differentiable rendering optimization pipeline that can be used to acquire high-fidelity, renderable human avatars from single or multiple images. The model consists of a multi-modal style-based generator, which captures facial appearance in terms of diffuse and specular reflectance, and a PCA-based shape model. We employ a fast differentiable rendering process that can be used in an optimization pipeline while also achieving photorealistic facial shading. Our optimization process accurately captures both facial reflectance and shape in high detail by exploiting the expressivity of the style-based latent representation and of our shape model. FitMe achieves state-of-the-art reflectance acquisition and identity preservation on single "in-the-wild" facial images, and it produces impressive scan-like results when given multiple unconstrained facial images of the same identity. In contrast with recent implicit avatar reconstructions, FitMe requires only one minute and produces relightable mesh- and texture-based avatars that can be used by end-user applications.

