Video Analysis and Generation via a Semantic Progress Function

Abstract

Transformations produced by image and video generation models often evolve in a highly non-linear manner: long stretches where the content barely changes are followed by sudden, abrupt semantic jumps. To analyze and correct this behavior, we introduce a Semantic Progress Function, a one-dimensional representation that captures how the meaning of a given sequence evolves over time. For each frame, we compute distances between semantic embeddings and fit a smooth curve that reflects the cumulative semantic shift across the sequence. Departures of this curve from a straight line reveal uneven semantic pacing. Building on this insight, we propose a semantic linearization procedure that reparameterizes (or retimes) the sequence so that semantic change unfolds at a constant rate, yielding smoother and more coherent transitions. Beyond linearization, our framework provides a model-agnostic foundation for identifying temporal irregularities, comparing semantic pacing across different generators, and steering both generated and real-world video sequences toward arbitrary target pacing.
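The pipeline described above (per-frame embedding distances, a cumulative progress curve, then retiming to a constant rate) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the embedding model, the distance, or the curve-fitting step, so this sketch assumes cosine distance between consecutive frame embeddings, skips smoothing entirely, and linearizes by piecewise-linear inversion of the curve. The function names `cosine_distance`, `semantic_progress`, and `linearize` are hypothetical.

```python
import math
from bisect import bisect_left


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def semantic_progress(embeddings):
    """Cumulative semantic change per frame, normalized to [0, 1].

    Assumption: change is measured between consecutive frames only,
    and no smooth curve is fitted afterwards.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two frames")
    progress = [0.0]
    for prev, curr in zip(embeddings, embeddings[1:]):
        progress.append(progress[-1] + cosine_distance(prev, curr))
    total = progress[-1]
    if total == 0.0:
        # No semantic change at all: fall back to uniform progress.
        n = len(embeddings)
        return [i / (n - 1) for i in range(n)]
    return [p / total for p in progress]


def linearize(progress, num_output_frames):
    """Fractional source-frame positions at which progress is evenly spaced."""
    if num_output_frames < 2:
        raise ValueError("need at least two output frames")
    last = len(progress) - 1
    positions = []
    for k in range(num_output_frames):
        target = k / (num_output_frames - 1)
        # Index of the first frame whose progress reaches the target.
        hi = min(max(bisect_left(progress, target), 1), last)
        lo = hi - 1
        span = progress[hi] - progress[lo]
        frac = 0.0 if span == 0.0 else (target - progress[lo]) / span
        positions.append(lo + frac)
    return positions


# Toy sequence: three identical frames, then one abrupt semantic jump.
toy_embeddings = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
toy_progress = semantic_progress(toy_embeddings)
toy_positions = linearize(toy_progress, 3)
```

On the toy sequence the progress curve stays flat and then jumps, which is the uneven pacing the abstract describes. The returned positions are fractional source-frame indices; turning them into actual frames (by interpolation or by regenerating at those positions) is outside the scope of this sketch.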
