Relightify: Relightable 3D Faces from a Single Image via Diffusion Models

Abstract

Following the remarkable success of diffusion models in image generation, recent works have also demonstrated their ability to solve a number of inverse problems in an unsupervised way, by constraining the sampling process with a conditioning input. Motivated by this, we present the first approach that uses diffusion models as a prior for highly accurate 3D facial BRDF reconstruction from a single image. We start from a high-quality UV dataset of facial reflectance (diffuse albedo, specular albedo, and normals), which we render under varying illumination settings to simulate natural RGB textures. We then train an unconditional diffusion model on concatenated pairs of rendered textures and reflectance components. At test time, we fit a 3D morphable model to the given image and unwrap the face into a partial UV texture. By sampling from the diffusion model while keeping the observed part of the texture intact, the model inpaints not only the self-occluded areas but also the unknown reflectance components, in a single sequence of denoising steps. In contrast to existing methods, we acquire the observed texture directly from the input image, which results in more faithful and consistent reflectance estimation. Through a series of qualitative and quantitative comparisons, we demonstrate superior performance in both texture completion and reflectance reconstruction.
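The test-time sampling described above can be illustrated with a minimal sketch of mask-guided diffusion inpainting. The abstract does not specify the exact sampling rule, so the code below uses a generic scheme (known entries are re-noised from the observation at each step, unknown entries follow a standard DDPM reverse step) and should not be read as the authors' implementation. The 12-channel layout (texture, diffuse albedo, specular albedo, normals), the linear noise schedule, and the names `toy_noise_predictor` and `inpaint` are all assumptions made for illustration; `toy_noise_predictor` merely stands in for a trained network.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumption; the abstract names no schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, t):
    """Hypothetical stand-in for a trained denoiser: always predicts zero noise."""
    return np.zeros_like(x_t)


def inpaint(observed, mask, noise_predictor, num_steps=50, seed=0):
    """Sample all channels while keeping entries with mask == 1 fixed.

    observed: array (C, H, W); only entries where mask == 1 are used.
    mask:     array (C, H, W); 1 = known (visible texture), 0 = unknown.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(observed.shape)  # start from pure noise

    for t in range(num_steps - 1, -1, -1):
        last = t == 0
        abar_prev = 1.0 if last else alpha_bars[t - 1]

        # Known part: noise the observation to the level of step t - 1.
        noise = np.zeros_like(observed) if last else rng.standard_normal(observed.shape)
        x_known = np.sqrt(abar_prev) * observed + np.sqrt(1.0 - abar_prev) * noise

        # Unknown part: one DDPM reverse step using the predicted noise.
        eps = noise_predictor(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        z = np.zeros_like(observed) if last else rng.standard_normal(observed.shape)
        x_unknown = mean + np.sqrt(betas[t]) * z

        # Merge: observed texture stays fixed, everything else is generated.
        x = mask * x_known + (1.0 - mask) * x_unknown
    return x


# Toy example: 3 texture channels + 9 reflectance channels on an 8x8 UV map.
C, H, W = 12, 8, 8
observed = np.zeros((C, H, W))
observed[:3, :, : W // 2] = 0.5          # visible half of the RGB texture
mask = np.zeros((C, H, W))
mask[:3, :, : W // 2] = 1.0              # reflectance channels are fully unknown

result = inpaint(observed, mask, toy_noise_predictor)
```

In this sketch the occluded half of the texture and all reflectance channels are filled in by the same denoising loop, which mirrors the idea of completing texture and reflectance in a single sequence of steps. With a real trained model in place of `toy_noise_predictor`, the unknown channels would be drawn from the learned joint distribution rather than from scaled noise.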
