MaGGIe: Masked Guided Gradual Human Instance Matting

Abstract

Human matting is a foundational task in image and video processing, in which human foreground pixels are extracted from the input. Prior works either improve accuracy through additional guidance or improve the temporal consistency of a single instance across frames. We propose a new framework, MaGGIe (Masked Guided Gradual Human Instance Matting), which predicts alpha mattes progressively for each human instance while maintaining computational cost, precision, and consistency. Our method leverages modern architectures, including transformer attention and sparse convolution, to output all instance mattes simultaneously without a blow-up in memory or latency. While keeping inference cost constant in the multi-instance scenario, our framework achieves robust and versatile performance on our proposed synthesized benchmarks. Alongside higher-quality image and video matting benchmarks, we introduce a novel multi-instance synthesis approach built from publicly available sources to improve the generalization of models in real-world scenarios.
