Frequency-Guided Action Diffusion via Sub-Frequency Manifold Traversal

May 27, 2026
Author: Junlin Wang
cs.AI

Abstract

Learning visuomotor policies via behavior cloning typically involves imitating expert demonstrations collected by human operators. However, natural human demonstrations inherently contain high-frequency noise, such as intermittent jerks, pauses, and action jitter, and a policy trained to imitate these raw trajectories directly inherits the same suboptimal behaviors. This pathology is particularly pronounced in diffusion-based policies, where iterative denoising steps can inadvertently amplify high-frequency artifacts at the expense of meaningful fine-grained details. To address these limitations, we present a novel frequency-based algorithm that enables implicit spectral maneuvering and smooth action generation. Our method, Frequency Guidance Operator (FGO), steers the generation process of diffusion policies by progressively driving the noisy samples through intermediate sub-frequency manifolds with expanding spectral bands. Validated on 15 robotic manipulation tasks from 5 benchmarks, FGO achieves superior performance in enhancing action smoothness and temporal consistency while preserving the details necessary for successful task execution. Project website: https://henrywjl.github.io/frequency-guidance-operator/
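The abstract does not say how FGO is implemented, so the following is only an illustrative sketch of the general idea of "expanding spectral bands", not the authors' algorithm. It assumes that a sub-frequency projection can be approximated by a hard low-pass filter over the time axis of an action chunk, and that the retained band grows linearly across steps. The names `lowpass_project` and `band_schedule`, the horizon, and the step count are all hypothetical, and no diffusion model is involved.

```python
import numpy as np


def lowpass_project(actions, keep_bins):
    """Keep only the lowest `keep_bins` real-FFT bins along the time axis.

    ASSUMPTION: a hard low-pass filter stands in for the paper's
    sub-frequency manifold, whose actual definition is not given here.
    """
    spectrum = np.fft.rfft(actions, axis=0)
    spectrum[keep_bins:] = 0.0  # discard every bin above the cutoff
    return np.fft.irfft(spectrum, n=actions.shape[0], axis=0)


def band_schedule(num_steps, num_bins, min_bins=1):
    """Hypothetical linear schedule: retained bins grow from min_bins to num_bins."""
    return np.linspace(min_bins, num_bins, num_steps).round().astype(int)


rng = np.random.default_rng(0)
horizon, action_dim, num_steps = 16, 2, 5
num_bins = horizon // 2 + 1  # number of rFFT bins for a length-16 signal

# Toy action chunk: a smooth 2-D motion plus synthetic jitter.
t = np.arange(horizon)
smooth = np.stack(
    [np.sin(2 * np.pi * t / horizon), np.cos(2 * np.pi * t / horizon)], axis=1
)
noisy = smooth + 0.3 * rng.standard_normal((horizon, action_dim))

# Project the same chunk onto progressively wider frequency bands.
schedule = band_schedule(num_steps, num_bins)
projected = [lowpass_project(noisy, k) for k in schedule]
```

With the narrowest band only the per-dimension mean survives, and with the full band the input is recovered unchanged; the intermediate projections admit finer temporal detail as the band widens. How such a projection would be combined with the denoising steps of a diffusion policy is not specified in the abstract and is left out of this sketch.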