Dedelayed: Deleting remote inference delay via on-device correction

October 15, 2025
作者: Dan Jacobellis, Mateen Ulhaq, Fabien Racapé, Hyomin Choi, Neeraja J. Yadwadkar
cs.AI

Abstract

Remote inference allows lightweight devices to leverage powerful cloud models. However, communication network latency makes predictions stale and unsuitable for real-time tasks. To address this, we introduce Dedelayed, a delay-corrective method that mitigates arbitrary remote inference delays, allowing the local device to produce low-latency outputs in real time. Our method employs a lightweight local model that processes the current frame and fuses in features that a heavyweight remote model computes from past frames. On video from the BDD100K driving dataset, Dedelayed improves semantic segmentation accuracy over the stronger of the local-only and remote-only baselines across all realistic communication network delays beyond 33 ms. Without incurring additional delay, it improves accuracy by 6.4 mIoU compared to fully local inference and 9.8 mIoU compared to remote inference, for a round-trip delay of 100 ms. The advantage grows under longer delays and higher-motion scenes, as delay-mitigated split inference sustains accuracy more effectively, providing clear advantages for real-time tasks that must remain aligned with the current world state.
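The split-inference design described above, where a lightweight on-device model processes the current frame and fuses in features that a heavyweight remote model computed from a past frame, can be sketched as follows. This is an illustrative PyTorch sketch under assumed shapes and layer sizes, not the authors' implementation; the module name `DedelayedFusion`, the channel counts, and the concatenation-based fusion are all hypothetical choices for exposition.

```python
import torch
import torch.nn as nn

class DedelayedFusion(nn.Module):
    """Illustrative sketch (not the paper's code): a lightweight local model
    segments the current frame while fusing in stale features that a
    heavyweight remote model computed from a past frame."""

    def __init__(self, num_classes=19, local_ch=32, remote_ch=64):
        super().__init__()
        # Lightweight on-device encoder for the current frame (stride-4 grid).
        self.local_encoder = nn.Sequential(
            nn.Conv2d(3, local_ch, kernel_size=3, stride=4, padding=1),
            nn.ReLU(),
        )
        # Fuse local features with the delayed remote features by
        # channel concatenation followed by a 1x1 projection.
        self.fuse = nn.Conv2d(local_ch + remote_ch, local_ch, kernel_size=1)
        # Per-pixel segmentation head, upsampled back to input resolution.
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(local_ch, num_classes, kernel_size=1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )

    def forward(self, current_frame, delayed_remote_feats):
        local_feats = self.local_encoder(current_frame)
        fused = self.fuse(torch.cat([local_feats, delayed_remote_feats], dim=1))
        return self.head(fused)

# The current frame is from time t; the remote features arrive late, having
# been computed from frame t - d (e.g. d = 100 ms, roughly 3 frames at 30 fps).
model = DedelayedFusion()
frame_t = torch.randn(1, 3, 128, 256)
remote_feats_stale = torch.randn(1, 64, 32, 64)  # matches the stride-4 grid
logits = model(frame_t, remote_feats_stale)
print(tuple(logits.shape))
```

Because the remote features are always `d` milliseconds stale, the local encoder sees the freshest evidence, while the fusion layer lets the heavyweight model's richer (but delayed) features correct the lightweight prediction; this matches the abstract's claim that accuracy is sustained as delay and scene motion grow.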