Advancing Molecular Machine (Learned) Representations with Stereoelectronics-Infused Molecular Graphs

Abstract

Molecular representation is a foundational element in our understanding of the physical world. Its importance ranges from the fundamentals of chemical reactions to the design of new therapies and materials. Previous molecular machine learning models have employed strings, fingerprints, global features, and simple molecular graphs, all of which are inherently information-sparse representations. However, as the complexity of prediction tasks increases, the molecular representation needs to encode higher-fidelity information. This work introduces a novel approach to infusing quantum-chemically rich information into molecular graphs via stereoelectronic effects. We show that the explicit addition of stereoelectronic interactions significantly improves the performance of molecular machine learning models. Furthermore, stereoelectronics-infused representations can be learned and deployed with a tailored double graph neural network workflow, enabling their application to any downstream molecular machine learning task. Finally, we show that the learned representations allow for facile stereoelectronic evaluation of previously intractable systems, such as entire proteins, opening new avenues of molecular design.
