UniSHARP: Universal Sharp Monocular View Synthesis

Abstract

In this work, we extend SHARP, a popular photorealistic view synthesis method, to universal monocular rendering across a continuum of camera systems, from conventional perspective cameras to wide field-of-view, fisheye, and omnidirectional panoramic settings. To overcome the pinhole-specific assumptions of SHARP, our key idea is to align images from different camera systems in a unified omnidirectional latent space. To this end, we propose UniSHARP, which performs implicit alignment in both feature space and Gaussian space. Specifically, Gaussian primitives are arranged along rays and radial distances, forming a ray-based universal representation, while 2D semantic features and 3D spatial features extracted from UniK3D-inspired encoders are jointly decoded to generate the complete Gaussian cloud. To evaluate the method comprehensively, we construct a benchmark covering diverse imaging systems across various scenes. The benchmark is further stratified by field of view (FoV) to enable fine-grained assessment of the universal monocular rendering task. Extensive experiments on the proposed benchmark demonstrate the effectiveness of UniSHARP, which outperforms alternative methods by a large margin. The project page can be found at: https://insta360-research-team.github.io/Unisharp-website/
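As a rough illustration of what a ray-based representation can look like, the sketch below places one Gaussian center per pixel at a radial distance along that pixel's unit ray, for both a pinhole and an equirectangular camera. This is not the authors' implementation: the function names (`pinhole_rays`, `equirect_rays`, `gaussian_centers`), the axis convention (x right, y up, z forward), and the one-Gaussian-per-pixel layout are all assumptions made for the example. It is only meant to show why describing a scene by rays and radial distances, rather than by pinhole depth along the optical axis, applies equally to narrow and panoramic fields of view.

```python
import numpy as np


def pinhole_rays(height, width, fov_x_deg):
    """Unit ray directions for an ideal pinhole camera (hypothetical helper).

    Assumed convention: x right, y up, z forward, principal point at the
    image center, square pixels.
    """
    focal = 0.5 * width / np.tan(np.radians(fov_x_deg) / 2.0)
    xs = np.arange(width) + 0.5 - width / 2.0
    ys = height / 2.0 - (np.arange(height) + 0.5)
    xs, ys = np.meshgrid(xs, ys)  # both have shape (H, W)
    dirs = np.stack([xs, ys, np.full_like(xs, focal)], axis=-1)
    return dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)


def equirect_rays(height, width):
    """Unit ray directions for an equirectangular panorama (hypothetical helper)."""
    u = (np.arange(width) + 0.5) / width
    v = (np.arange(height) + 0.5) / height
    lon = (u - 0.5) * 2.0 * np.pi  # longitude in (-pi, pi)
    lat = (0.5 - v) * np.pi        # latitude in (-pi/2, pi/2)
    lon, lat = np.meshgrid(lon, lat)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)


def gaussian_centers(rays, radial_distance):
    """Place one Gaussian center per pixel at `radial_distance` along its ray.

    Radial distance is the Euclidean distance from the camera center, not the
    z-depth, so it stays well defined for rays at or beyond 90 degrees from
    the optical axis.
    """
    return rays * radial_distance[..., None]


# Usage: the same function handles both camera types once rays are known.
pano_rays = equirect_rays(4, 8)
pano_centers = gaussian_centers(pano_rays, np.full((4, 8), 2.5))
```

With z-depth, a ray perpendicular to the optical axis has no finite representation, whereas the radial parameterization above treats every direction on the sphere uniformly.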