Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography

Abstract

Metalenses offer significant potential for ultra-compact computational imaging, but they suffer from complex optical degradation that is difficult to restore computationally. Existing methods typically rely on precise optical calibration or massive paired datasets, both of which are non-trivial to obtain for real-world imaging systems. Furthermore, a lack of control over the inference process often results in undesirable hallucinated artifacts. We introduce Degradation-Modeled Multipath Diffusion for tunable metalens photography, which leverages the powerful natural image priors of pretrained models instead of large datasets. Our framework uses positive, neutral, and negative-prompt paths, together with pseudo data augmentation, to balance high-frequency detail generation, structural fidelity, and suppression of metalens-specific degradation. A tunable decoder enables controlled trade-offs between fidelity and perceptual quality. Additionally, a spatially varying degradation-aware attention (SVDA) module adaptively models complex optical and sensor-induced degradation. Finally, we design and build a millimeter-scale MetaCamera for real-world validation. Extensive results show that our approach outperforms state-of-the-art methods, achieving high-fidelity and sharp image reconstruction. More materials: https://dmdiff.github.io/
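The abstract does not give formulas for the three prompt paths or the tunable decoder, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the paths are combined in the style of classifier-free guidance with a negative prompt, and that the fidelity/perception trade-off can be pictured as a linear blend of two decoder outputs. The names `combine_paths`, `tunable_blend`, `w_pos`, `w_neg`, and `alpha` are hypothetical.

```python
import numpy as np

def combine_paths(eps_pos, eps_neu, eps_neg, w_pos=1.0, w_neg=1.0):
    """Hypothetical guidance-style blend of three prompt-conditioned predictions.

    Assumption (not from the paper): start from the neutral path (structure),
    push toward the positive path (detail), and push away from the
    negative path (metalens-specific degradation).
    """
    return eps_neu + w_pos * (eps_pos - eps_neu) - w_neg * (eps_neg - eps_neu)

def tunable_blend(fidelity_img, perceptual_img, alpha):
    """Hypothetical trade-off knob: alpha=0 favors fidelity, alpha=1 perception."""
    alpha = float(np.clip(alpha, 0.0, 1.0))  # keep the knob in [0, 1]
    return (1.0 - alpha) * fidelity_img + alpha * perceptual_img

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in arrays for the three path predictions (shape is arbitrary here).
    eps_pos, eps_neu, eps_neg = (rng.normal(size=(4, 4)) for _ in range(3))
    eps = combine_paths(eps_pos, eps_neu, eps_neg, w_pos=1.5, w_neg=0.5)
    out = tunable_blend(eps_neu, eps, alpha=0.3)
    print(eps.shape, out.shape)
```

With both weights set to zero the blend reduces to the neutral path alone, which makes the role of each weight easy to inspect in isolation.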

