Relightable and Animatable Neural Avatar from Sparse-View Video

Abstract

This paper tackles the challenge of creating relightable and animatable neural avatars from sparse-view (or even monocular) videos of dynamic humans under unknown illumination. Compared to studio environments, this setting is more practical and accessible, but it poses an extremely challenging ill-posed problem. Previous neural human reconstruction methods can reconstruct animatable avatars from sparse views using deformed Signed Distance Fields (SDFs), but they cannot recover material parameters for relighting. Differentiable inverse rendering methods have succeeded in recovering the materials of static objects, but extending them to dynamic humans is not straightforward, because computing pixel-surface intersections and light visibility on deformed SDFs is computationally intensive. To address this, we propose a Hierarchical Distance Query (HDQ) algorithm that approximates world-space distances under arbitrary human poses. Specifically, we estimate coarse distances from a parametric human model and compute fine distances by exploiting the local deformation invariance of the SDF. Building on the HDQ algorithm, we use sphere tracing to efficiently estimate surface intersections and light visibility. This allows us to develop the first system to recover animatable and relightable neural avatars from sparse-view (or monocular) inputs. Experiments demonstrate that our approach produces superior results compared to state-of-the-art methods. Our code will be released for reproducibility.
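To make the coarse-to-fine distance query and its use in sphere tracing more concrete, the following is a minimal toy sketch in Python. It is not the authors' implementation. As an assumption for illustration, an analytic sphere of radius 1.2 stands in for the parametric human model (coarse query) and an analytic unit sphere stands in for the learned, deformed SDF (fine query). All function names and the threshold, bias, and step-limit values are hypothetical.

```python
import math


def norm(p):
    """Euclidean length of a 3D point given as a tuple."""
    return math.sqrt(sum(c * c for c in p))


def coarse_distance(p):
    # Assumption: a sphere of radius 1.2 stands in for the parametric body model.
    # It encloses the fine surface, so this value never exceeds the true distance.
    return norm(p) - 1.2


def fine_distance(p):
    # Assumption: an analytic unit sphere stands in for the learned SDF.
    return norm(p) - 1.0


def hierarchical_distance(p, threshold=0.1):
    """Use the cheap coarse distance far from the body, the fine one nearby."""
    d = coarse_distance(p)
    if d > threshold:
        return d
    return fine_distance(p)


def sphere_trace(origin, direction, max_steps=64, eps=1e-4, t_max=10.0):
    """March along a unit-length ray; return the hit depth t, or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = hierarchical_distance(p)
        if dist < eps:
            return t  # close enough to the surface (or already inside it)
        t += dist  # safe step: the queried distance never overshoots the surface
        if t > t_max:
            break
    return None


def is_lit(surface_point, light_dir, bias=1e-2):
    """Light visibility: trace a shadow ray from a slightly offset surface point."""
    start = tuple(s + bias * l for s, l in zip(surface_point, light_dir))
    return sphere_trace(start, light_dir) is None
```

In this toy setup, `sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0))` returns a depth close to 2.0, and `is_lit((0.0, 0.0, -1.0), (0.0, 0.0, -1.0))` reports the point as lit, while a light direction pointing through the sphere reports it as shadowed. The coarse proxy is chosen to enclose the fine surface so that large steps taken far from the body stay conservative.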
