Towards Practical Capture of High-Fidelity Relightable Avatars

Abstract

In this paper, we propose a novel framework, Tracking-free Relightable Avatar (TRAvatar), for capturing and reconstructing high-fidelity 3D avatars. Compared to previous methods, TRAvatar works in a more practical and efficient setting. Specifically, TRAvatar is trained on dynamic image sequences captured in a Light Stage under varying lighting conditions, enabling realistic relighting and real-time animation of avatars in diverse scenes. TRAvatar also allows tracking-free avatar capture, obviating the need for accurate surface tracking under varying illumination. Our contributions are twofold. First, we propose a novel network architecture that explicitly builds on the linear nature of lighting and ensures that it is satisfied. Trained on simple group light captures, TRAvatar can predict appearance in real time with a single forward pass, achieving high-quality relighting under arbitrary environment maps. Second, we jointly optimize the facial geometry and relightable appearance from scratch based on image sequences, with tracking learned implicitly. This tracking-free approach makes it more robust to establish temporal correspondences between frames under different lighting conditions. Extensive qualitative and quantitative experiments demonstrate that our framework achieves superior performance for photorealistic avatar animation and relighting.
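The "linear nature of lighting" that the architecture builds on can be illustrated with a minimal sketch. This is not TRAvatar's network. It only shows the general property of light transport: the image under a combination of lights equals the same weighted combination of the images under each light alone. The function name `relight`, the flat-list image layout, and the toy values are all assumptions made for illustration.

```python
# Illustrative sketch only: this is NOT the TRAvatar architecture, just the
# linearity-of-lighting property that the abstract says the network respects.
# All names and values here are hypothetical.


def relight(basis_images, light_weights):
    """Return the image under a weighted combination of lights.

    basis_images: one image per light, each a flat list of pixel intensities
                  (the appearance with only that light switched on).
    light_weights: intensity of each light in the target illumination,
                   e.g. sampled from an environment map.
    """
    if len(basis_images) != len(light_weights):
        raise ValueError("need exactly one weight per basis image")
    relit = [0.0] * len(basis_images[0])
    for image, weight in zip(basis_images, light_weights):
        for pixel_index, value in enumerate(image):
            relit[pixel_index] += weight * value
    return relit


# Toy data: two lights, three pixels.
basis_images = [
    [1.0, 0.0, 2.0],  # appearance under light 0 only
    [0.0, 4.0, 1.0],  # appearance under light 1 only
]

# Relighting with the combined illumination directly...
combined = relight(basis_images, [0.5, 2.0])

# ...matches scaling and summing the two single-light results.
only_light_0 = relight(basis_images, [1.0, 0.0])
only_light_1 = relight(basis_images, [0.0, 1.0])
summed = [0.5 * a + 2.0 * b for a, b in zip(only_light_0, only_light_1)]

print(combined == summed)
```

A model that is linear in its lighting input in this sense can be trained on a limited set of lighting patterns and still be evaluated under a new environment map, since that map is just another set of weights.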
