Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

May 28, 2026
Authors: Chenxi Tao, Seung-Kyum Choi
cs.AI

Abstract

Industrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually involves a broader mismatch between available evidence and required decisions. A system may be built from CAD renderings, simulated RGB-D observations, normal reference images, synthetic defects, pretrained feature spaces, or language prompts, yet deployed under different sensors, lighting, materials, fixtures, calibration, production variation, and rare defect modes. This review reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability. We distinguish CAD-available settings, where explicit object geometry can support rendering, calibration, pose estimation, segmentation, and test-time geometric verification; CAD-unavailable settings, where geometry is replaced by normal-reference appearance, feature distributions, teacher-student residuals, synthetic anomaly assumptions, foundation features, or vision-language priors; and boundary-prior settings, where approximate models, templates, reference views, or semantic correspondences preserve only part of the CAD role. This framing connects CAD-based detection and 6D pose-estimation literature with industrial anomaly and surface-inspection literature that is usually reviewed separately. To make the taxonomy concrete, we use empirical anchors on T-LESS/BOP, MVTec AD, and VisA. The anchors show that CAD render count alone does not close the transfer gap; source-distribution design, detector capacity, and a small amount of real calibration can matter more. They also show that CAD at test time creates a distinct verification channel through mask, pose, and depth consistency, whereas CAD-unavailable inspection relies on calibrated normality and feature deviation. The review therefore argues against a single cross-task leaderboard and instead asks what prior grounds the deployment decision.