GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting

Abstract

Reconstructing and rendering 3D objects from highly sparse views is critical for promoting applications of 3D vision techniques and improving user experience. However, images from sparse views contain very limited 3D information, which leads to two significant challenges: 1) multi-view consistency is difficult to build because too few images are available for matching; 2) object information is partially omitted or highly compressed because view coverage is insufficient. To tackle these challenges, we propose GaussianObject, a framework that represents and renders a 3D object with Gaussian splatting and achieves high rendering quality with only 4 input images. We first introduce visual hull and floater elimination techniques, which explicitly inject structure priors into the initial optimization process to help build multi-view consistency, yielding a coarse 3D Gaussian representation. We then construct a Gaussian repair model based on diffusion models to supplement the omitted object information, with which the Gaussians are further refined. We design a self-generating strategy to obtain image pairs for training the repair model. GaussianObject is evaluated on several challenging datasets, including MipNeRF360, OmniObject3D, and OpenIllumination, achieving strong reconstruction results from only 4 views and significantly outperforming previous state-of-the-art methods.
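The abstract names the visual hull as a structure prior but does not describe how it is built. As a rough illustration only, the sketch below shows classic silhouette carving with NumPy: a voxel is kept only if it projects inside the object mask in every view. This is not the authors' implementation. The function name `visual_hull`, its parameters, and the assumption that each camera is given as a 3x4 projection matrix are all hypothetical choices made for this example.

```python
import numpy as np


def visual_hull(masks, projections, grid_min, grid_max, resolution=32):
    """Return voxel centers that project inside every silhouette mask.

    Hypothetical helper for illustration, not part of GaussianObject.
    masks: list of (H, W) boolean arrays, True where the object is visible.
    projections: list of (3, 4) matrices mapping homogeneous world points
        to homogeneous pixel coordinates (x, y, w).
    grid_min, grid_max: length-3 bounds of the voxel grid in world units.
    """
    axes = [np.linspace(grid_min[i], grid_max[i], resolution) for i in range(3)]
    xs, ys, zs = np.meshgrid(*axes, indexing="ij")
    points = np.stack([xs.ravel(), ys.ravel(), zs.ravel()], axis=1)
    homog = np.hstack([points, np.ones((len(points), 1))])

    inside = np.ones(len(points), dtype=bool)
    for mask, P in zip(masks, projections):
        h, w = mask.shape
        proj = homog @ P.T  # (N, 3) homogeneous pixel coordinates
        depth = proj[:, 2]
        in_front = depth > 1e-8
        # Avoid dividing by zero or negative depth for points behind the camera.
        safe_depth = np.where(in_front, depth, 1.0)
        u = np.round(proj[:, 0] / safe_depth).astype(int)
        v = np.round(proj[:, 1] / safe_depth).astype(int)
        in_image = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        # A voxel survives this view only if it lands on a foreground pixel.
        hit = np.zeros(len(points), dtype=bool)
        hit[in_image] = mask[v[in_image], u[in_image]]
        inside &= hit
    return points[inside]
```

With only 4 views the carved volume is a loose outer bound on the object, which is consistent with the abstract treating it as a prior for initialization rather than a final reconstruction. The surviving voxel centers could, for example, serve as initial point positions.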
