DreamDiffusion: Generating High-Quality Images from Brain EEG Signals

Abstract

This paper introduces DreamDiffusion, a novel method for generating high-quality images directly from brain electroencephalogram (EEG) signals, without first translating thoughts into text. DreamDiffusion builds on pre-trained text-to-image models and pre-trains its EEG encoder with temporal masked signal modeling to obtain effective and robust EEG representations. The method also uses the CLIP image encoder as extra supervision, so that EEG, text, and image embeddings are better aligned even when only a limited number of EEG-image pairs is available. Together, these components address the main challenges of using EEG signals for image generation (noise, limited information, and individual differences) and yield promising results. Quantitative and qualitative results demonstrate the effectiveness of the proposed method as a significant step towards portable and low-cost "thoughts-to-image" generation, with potential applications in neuroscience and computer vision.
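The two training signals named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the token length, the masking ratio, the choice to mask by zeroing, and the use of a cosine-distance loss for the CLIP alignment are all assumptions made here for illustration, and the encoder and decoder networks themselves are omitted.

```python
import math
import random


def mask_time_tokens(eeg, token_len=4, mask_ratio=0.75, seed=0):
    """Zero out randomly chosen temporal tokens of a multi-channel EEG recording.

    eeg: list of channels, each a list of float samples of equal length.
    token_len and mask_ratio are illustrative values (assumptions).
    Returns (masked_copy, sorted list of masked token indices).
    """
    n_samples = len(eeg[0])
    n_tokens = n_samples // token_len  # a trailing partial token is never masked
    n_masked = int(round(n_tokens * mask_ratio))
    rng = random.Random(seed)
    masked_tokens = sorted(rng.sample(range(n_tokens), n_masked))

    masked = [list(channel) for channel in eeg]  # copy, leave the input untouched
    for t in masked_tokens:
        start = t * token_len
        for channel in masked:
            # The same time span is hidden on every channel: the masking is temporal.
            channel[start:start + token_len] = [0.0] * token_len
    return masked, masked_tokens


def reconstruction_loss(original, reconstructed, masked_tokens, token_len=4):
    """Mean squared error computed over the masked time spans only."""
    total, count = 0.0, 0
    for orig_ch, rec_ch in zip(original, reconstructed):
        for t in masked_tokens:
            for i in range(t * token_len, (t + 1) * token_len):
                total += (orig_ch[i] - rec_ch[i]) ** 2
                count += 1
    return total / count if count else 0.0


def cosine_alignment_loss(eeg_embedding, image_embedding):
    """1 - cosine similarity: 0 when both embeddings point in the same direction."""
    dot = sum(a * b for a, b in zip(eeg_embedding, image_embedding))
    norm_a = math.sqrt(sum(a * a for a in eeg_embedding))
    norm_b = math.sqrt(sum(b * b for b in image_embedding))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy recording: 2 channels, 16 samples each (hypothetical data).
    eeg = [[float(c + t) for t in range(16)] for c in range(2)]
    masked, idx = mask_time_tokens(eeg)
    print("masked token indices:", idx)
    print("loss if the model returned the masked input:",
          reconstruction_loss(eeg, masked, idx))
```

In a real pipeline an encoder-decoder network would predict the hidden spans from the visible ones and be trained on `reconstruction_loss`, while `cosine_alignment_loss` would compare a projected EEG embedding with the CLIP embedding of the paired image.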
