Learning to Discover Regulatory Elements for Gene Expression Prediction

February 19, 2025
Authors: Xingyu Su, Haiyang Yu, Degui Zhi, Shuiwang Ji
cs.AI

Abstract

We consider the problem of predicting gene expression from DNA sequences. A key challenge in this task is identifying the regulatory elements that control gene expression. Here, we introduce Seq2Exp, a Sequence to Expression network explicitly designed to discover and extract the regulatory elements that drive target gene expression, improving the accuracy of gene expression prediction. Our approach captures the causal relationship between epigenomic signals, DNA sequences, and their associated regulatory elements. Specifically, we propose to decompose the epigenomic signals and the DNA sequence conditioned on the causal active regulatory elements, and to apply an information bottleneck with the Beta distribution to combine their effects while filtering out non-causal components. Our experiments demonstrate that Seq2Exp outperforms existing baselines in gene expression prediction tasks and discovers more influential regions than commonly used statistical peak-detection methods such as MACS3. The source code is released as part of the AIRS library (https://github.com/divelab/AIRS/).
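
To make the idea of a Beta-distributed mask more concrete, here is a minimal sketch. It is not taken from the paper or the AIRS code. It assumes that each sequence position gets one Beta distribution derived from the DNA sequence and one derived from the epigenomic signal, and that the two are combined by multiplying their densities, which gives Beta(a1 + a2 - 1, b1 + b2 - 1). The function names (`combine_beta`, `beta_mean`, `sample_mask`, `apply_mask`) and all parameter values are hypothetical, and the regularization term of an information bottleneck is omitted.

```python
import random


def combine_beta(p, q):
    """Combine two Beta distributions by multiplying their densities.

    Beta(a1, b1) * Beta(a2, b2) is proportional to
    Beta(a1 + a2 - 1, b1 + b2 - 1). This combination rule is an
    assumption made for illustration only.
    """
    a = p[0] + q[0] - 1.0
    b = p[1] + q[1] - 1.0
    if a <= 0 or b <= 0:
        raise ValueError("combined Beta parameters must be positive")
    return (a, b)


def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)


def sample_mask(seq_params, signal_params, rng):
    """Draw one soft mask value in [0, 1] per position."""
    mask = []
    for p, q in zip(seq_params, signal_params):
        a, b = combine_beta(p, q)
        mask.append(rng.betavariate(a, b))
    return mask


def apply_mask(features, mask):
    """Scale each position's feature by its mask value."""
    return [m * x for m, x in zip(mask, features)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical (alpha, beta) pairs for three positions.
    seq_params = [(2.0, 1.0), (1.0, 3.0), (1.5, 1.5)]
    signal_params = [(3.0, 1.0), (1.0, 4.0), (1.5, 1.5)]
    features = [1.0, 1.0, 1.0]
    mask = sample_mask(seq_params, signal_params, rng)
    print(apply_mask(features, mask))
```

In this sketch, positions where both sources favor large alpha receive mask values near 1 and are kept, while positions where both favor large beta are pushed toward 0 and are effectively filtered out.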
