Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography

June 28, 2025
Authors: Jianing Zhang, Jiayi Zhu, Feiyu Ji, Xiaokang Yang, Xiaoyun Yuan
cs.AI

Abstract

Metalenses offer significant potential for ultra-compact computational imaging but face challenges from complex optical degradation and computational restoration difficulties. Existing methods typically rely on precise optical calibration or massive paired datasets, both of which are difficult to obtain for real-world imaging systems. Furthermore, a lack of control over the inference process often results in undesirable hallucinated artifacts. We introduce Degradation-Modeled Multipath Diffusion for tunable metalens photography, leveraging powerful natural image priors from pretrained models instead of large datasets. Our framework uses positive, neutral, and negative-prompt paths to balance high-frequency detail generation, structural fidelity, and suppression of metalens-specific degradation, alongside pseudo data augmentation. A tunable decoder enables controlled trade-offs between fidelity and perceptual quality. Additionally, a spatially varying degradation-aware attention (SVDA) module adaptively models complex optical and sensor-induced degradation. Finally, we design and build a millimeter-scale MetaCamera for real-world validation. Extensive results show that our approach outperforms state-of-the-art methods, achieving high-fidelity and sharp image reconstruction. More materials: https://dmdiff.github.io/.