

MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification

December 1, 2025
Authors: Xabier de Zuazo, Ibon Saratxaga, Eva Navas
cs.AI

Abstract

We present Conformer-based decoders for the LibriBrain 2025 PNPL competition, targeting two foundational MEG tasks: Speech Detection and Phoneme Classification. Our approach adapts a compact Conformer to raw 306-channel MEG signals, with a lightweight convolutional projection layer and task-specific heads. For Speech Detection, a MEG-oriented SpecAugment provided a first exploration of MEG-specific augmentation. For Phoneme Classification, we used inverse-square-root class weighting and a dynamic grouping loader to handle 100-sample averaged examples. In addition, a simple instance-level normalization proved critical to mitigating distribution shift on the holdout split. Using the official Standard track splits and macro-averaged F1 for model selection, our best systems achieved 88.9% (Speech) and 65.8% (Phoneme) on the leaderboard, surpassing the competition baselines and ranking in the top 10 on both tasks. Implementation details, technical documentation, source code, and checkpoints are available at https://github.com/neural2speech/libribrain-experiments.
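The two preprocessing devices the abstract highlights, inverse-square-root class weighting and instance-level normalization, can be sketched as follows. This is a minimal illustration based only on the abstract's description, not the authors' code; the exact weight normalization and epsilon choices are assumptions.

```python
import numpy as np

def inverse_sqrt_class_weights(labels):
    """Weight each class by 1 / sqrt(count), rescaled so the average
    weight is 1. A hypothetical sketch of the inverse-square-root
    class weighting mentioned in the abstract."""
    classes, counts = np.unique(labels, return_counts=True)
    w = 1.0 / np.sqrt(counts)
    w = w / w.mean()  # keep the mean weight at 1
    return dict(zip(classes.tolist(), w.tolist()))

def instance_normalize(meg):
    """Per-example, per-channel z-scoring of a (channels, time) MEG
    window, a simple instance-level normalization intended to reduce
    distribution shift between recording sessions."""
    mean = meg.mean(axis=-1, keepdims=True)
    std = meg.std(axis=-1, keepdims=True) + 1e-8  # avoid divide-by-zero
    return (meg - mean) / std
```

Rare phonemes thus receive larger loss weights (tempered by the square root so they are not over-boosted), and each MEG window is rescaled independently of any training-set statistics, which is what makes the normalization robust on a held-out session.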