URHand: Universal Relightable Hands

Abstract

Existing photorealistic relightable hand models require extensive identity-specific observations across different views, poses, and illuminations, and they struggle to generalize to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows few-shot personalization using images captured with a mobile phone, and the personalized model can then be rendered photorealistically under novel illuminations. To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities. The key challenge is scaling up cross-identity training while maintaining personalized fidelity and sharp details, without compromising generalization under natural illuminations. To this end, we propose a spatially varying linear lighting model as the neural renderer, which takes physics-inspired shading as an input feature. By removing non-linear activations and biases, our specifically designed lighting model explicitly preserves the linearity of light transport. This enables single-stage training from light-stage data while generalizing to real-time rendering under arbitrary continuous illuminations across diverse identities. In addition, we introduce joint learning of a physically based model and our neural relighting model, which further improves fidelity and generalization. Extensive experiments show that our approach outperforms existing methods in both quality and generalizability. We also demonstrate quick personalization of URHand from a short phone scan of an unseen identity.
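The linearity argument in the abstract can be illustrated with a small sketch. This is not the URHand renderer; it is a toy example that assumes a single layer, made-up weights, and made-up per-pixel shading features (all names and values below are hypothetical). It checks the superposition property: a layer without bias or non-linear activation gives the same output for a weighted mix of two lights as the weighted mix of its outputs for each light alone, whereas a conventional layer with bias and ReLU does not.

```python
def linear_layer(weights, features):
    """Bias-free, activation-free layer: out[i] = sum_j weights[i][j] * features[j]."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def relu_layer_with_bias(weights, bias, features):
    """Conventional layer (bias + ReLU), included only for contrast."""
    return [max(0.0, sum(w * f for w, f in zip(row, features)) + b)
            for row, b in zip(weights, bias)]


def superposition_error(layer, feat_a, feat_b, scale_a, scale_b):
    """Max abs gap between layer(a*A + b*B) and a*layer(A) + b*layer(B)."""
    mixed = [scale_a * x + scale_b * y for x, y in zip(feat_a, feat_b)]
    combined = layer(mixed)
    separate = [scale_a * x + scale_b * y
                for x, y in zip(layer(feat_a), layer(feat_b))]
    return max(abs(c - s) for c, s in zip(combined, separate))


# Hypothetical sizes and values: 4 shading features in, 3 colour channels out.
weights = [
    [0.80, 0.10, 0.05, 0.00],
    [0.10, 0.70, 0.10, 0.05],
    [0.00, 0.10, 0.60, 0.20],
]
bias = [0.5, 0.5, 0.5]

# Toy shading features for one pixel under two different point lights.
shading_light_a = [0.9, 0.2, 0.1, 0.4]
shading_light_b = [0.1, 0.6, 0.3, 0.0]

linear_error = superposition_error(
    lambda f: linear_layer(weights, f),
    shading_light_a, shading_light_b, 0.3, 1.7)
nonlinear_error = superposition_error(
    lambda f: relu_layer_with_bias(weights, bias, f),
    shading_light_a, shading_light_b, 0.3, 1.7)

print(f"bias-free linear layer: {linear_error:.2e}")
print(f"layer with bias + ReLU: {nonlinear_error:.2e}")
```

The first error should sit at floating-point rounding level, while the second is clearly non-zero. Under this reading, a model that is linear in its lighting-dependent input can be trained on individual light-stage lights and then evaluated on a continuous environment treated as a weighted sum of such lights, which is the generalization the abstract describes.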
