Video Analysis and Generation via a Semantic Progress Function

Abstract

Transformations produced by image and video generation models often evolve in a highly non-linear manner: long stretches where the content barely changes are followed by abrupt semantic jumps. To analyze and correct this behavior, we introduce a Semantic Progress Function, a one-dimensional representation that captures how the meaning of a given sequence evolves over time. For each frame, we compute distances between semantic embeddings and fit a smooth curve that reflects the cumulative semantic shift across the sequence. Departures of this curve from a straight line reveal uneven semantic pacing. Building on this insight, we propose a semantic linearization procedure that reparameterizes (or retimes) the sequence so that semantic change unfolds at a constant rate, yielding smoother and more coherent transitions. Beyond linearization, our framework provides a model-agnostic foundation for identifying temporal irregularities, comparing semantic pacing across different generators, and steering both generated and real-world video sequences toward arbitrary target pacing.
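The pipeline described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: per-frame embeddings are assumed to be precomputed by some image encoder (the abstract does not name one), cosine distance between consecutive frames is an assumed choice of distance, a moving average stands in for the smooth curve fit, and the function names `semantic_progress` and `linearize` are hypothetical.

```python
import numpy as np


def semantic_progress(embeddings, smooth_window=5, eps=1e-8):
    """Normalized cumulative semantic change per frame, rising from 0 to 1.

    `embeddings` is an (n_frames, dim) array of nonzero vectors (assumed input).
    """
    emb = np.asarray(embeddings, dtype=float)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)

    # Cosine distance between consecutive frames (an assumed distance choice).
    step = np.clip(1.0 - np.sum(emb[1:] * emb[:-1], axis=1), 0.0, None)
    n_steps = len(step)

    # Moving-average smoothing, used here in place of a fitted smooth curve.
    if smooth_window > 1:
        kernel = np.ones(smooth_window) / smooth_window
        padded = np.pad(step, smooth_window // 2, mode="edge")
        step = np.convolve(padded, kernel, mode="valid")[:n_steps]

    # A tiny offset keeps the curve strictly increasing so it can be inverted.
    step = step + eps

    progress = np.concatenate([[0.0], np.cumsum(step)])
    return progress / progress[-1]


def linearize(progress, num_out=None):
    """Fractional source-frame positions at which progress is evenly spaced."""
    n = len(progress)
    num_out = n if num_out is None else num_out
    targets = np.linspace(0.0, 1.0, num_out)
    # Invert the monotone progress curve: which time reaches each target level?
    return np.interp(targets, progress, np.arange(n))
```

A progress curve that hugs the diagonal indicates even pacing; a curve that stays flat and then jumps indicates the uneven pacing described above. The positions returned by `linearize` are fractional, so a real retiming step would still need to select or interpolate frames at those positions, which this sketch leaves out.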
