DreamDiffusion: Generating High-Quality Images from Brain EEG Signals

Abstract

This paper introduces DreamDiffusion, a novel method for generating high-quality images directly from brain electroencephalogram (EEG) signals, without the need to translate thoughts into text. DreamDiffusion leverages pre-trained text-to-image models and employs temporal masked signal modeling to pre-train the EEG encoder for effective and robust EEG representations. The method also uses the CLIP image encoder to provide extra supervision, so that EEG, text, and image embeddings can be better aligned from a limited number of EEG-image pairs. Overall, the proposed method overcomes the challenges of using EEG signals for image generation, such as noise, limited information, and individual differences, and achieves promising results. Quantitative and qualitative results demonstrate the effectiveness of the proposed method as a significant step towards portable and low-cost "thoughts-to-image" conversion, with potential applications in neuroscience and computer vision.
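
The abstract names temporal masked signal modeling but gives no details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the EEG recording is a list of channels with equal length, that the time axis is cut into fixed-length segments ("tokens"), and that the same randomly chosen segments are hidden in every channel. The function name `mask_temporal_tokens` and the values of `token_len` and `mask_ratio` are hypothetical choices made for illustration.

```python
import random


def mask_temporal_tokens(eeg, token_len=4, mask_ratio=0.75, seed=0):
    """Hide a random subset of fixed-length time segments in every channel.

    eeg: list of channels, each a list of float samples of equal length.
    Returns (masked_eeg, mask), where mask[i] is True if segment i was hidden.
    All parameter defaults are illustrative assumptions.
    """
    num_samples = len(eeg[0])
    if any(len(channel) != num_samples for channel in eeg):
        raise ValueError("all channels must have the same length")
    if num_samples % token_len != 0:
        raise ValueError("signal length must be divisible by token_len")

    num_tokens = num_samples // token_len
    num_masked = round(num_tokens * mask_ratio)

    # Pick which time segments to hide; the choice is shared across channels.
    rng = random.Random(seed)
    masked_tokens = set(rng.sample(range(num_tokens), num_masked))

    # Build a masked copy, replacing samples in hidden segments with 0.0.
    masked_eeg = [
        [0.0 if (t // token_len) in masked_tokens else value
         for t, value in enumerate(channel)]
        for channel in eeg
    ]
    mask = [i in masked_tokens for i in range(num_tokens)]
    return masked_eeg, mask
```

In a masked-modeling setup of this kind, an encoder would typically be trained to reconstruct the hidden segments from the visible ones, and the returned `mask` would select the positions where the reconstruction error is measured. Whether DreamDiffusion follows exactly this recipe is not stated in the abstract.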
