Efficient Hybrid Zoom using Camera Fusion on Mobile Phones

January 2, 2024
Authors: Xiaotong Wu, Wei-Sheng Lai, YiChang Shih, Charles Herrmann, Michael Krainin, Deqing Sun, Chia-Kai Liang
cs.AI

Abstract

DSLR cameras can achieve multiple zoom levels via shifting lens distances or swapping lens types. However, these techniques are not possible on smartphone devices due to space constraints. Most smartphone manufacturers adopt a hybrid zoom system: commonly a Wide (W) camera at a low zoom level and a Telephoto (T) camera at a high zoom level. To simulate zoom levels between W and T, these systems crop and digitally upsample images from W, leading to significant detail loss. In this paper, we propose an efficient system for hybrid zoom super-resolution on mobile devices, which captures a synchronous pair of W and T shots and leverages machine learning models to align and transfer details from T to W. We further develop an adaptive blending method that accounts for depth-of-field mismatches, scene occlusion, flow uncertainty, and alignment errors. To minimize the domain gap, we design a dual-phone camera rig to capture real-world inputs and ground-truths for supervised training. Our method generates a 12-megapixel image in 500ms on a mobile platform and compares favorably against state-of-the-art methods under extensive evaluation on real-world scenarios.