MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics

Abstract

Molecular dynamics (MD) simulations are essential for understanding atomic-scale behavior in materials science, yet writing LAMMPS scripts remains a highly specialized and time-consuming task. Although large language models (LLMs) show promise in code generation and domain-specific question answering, their performance in MD scenarios is limited by scarce domain data, the high deployment cost of state-of-the-art LLMs, and low code executability. Building on our prior MDAgent, we present MDAgent2, the first end-to-end framework capable of performing both knowledge Q&A and code generation within the MD domain. We build a domain-specific data-construction pipeline that yields three high-quality datasets spanning MD knowledge, question answering, and code generation. Based on these datasets, we adopt a three-stage post-training strategy, consisting of continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL), to train two domain-adapted models, MD-Instruct and MD-Code. Furthermore, we introduce MD-GRPO, a closed-loop RL method that uses simulation outcomes as reward signals and recycles low-reward trajectories for continual refinement. We further build MDAgent2-RUNTIME, a deployable multi-agent system that integrates code generation, execution, evaluation, and self-correction. Together with MD-EvalBench, the first benchmark for LAMMPS code generation and question answering, proposed in this work, our models and system achieve performance surpassing several strong baselines. This work systematically demonstrates the adaptability and generalization capability of large language models in industrial simulation tasks, laying a methodological foundation for automatic code generation in AI for Science and industrial-scale simulations. URL: https://github.com/FredericVAN/PKU_MDAgent2
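
The cycle of code generation, execution, evaluation, and self-correction attributed to MDAgent2-RUNTIME can be pictured as a simple feedback loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `self_correct`, `RunResult`, `toy_generate`, and `toy_execute` are hypothetical names introduced here for illustration, and the toy executor only checks a script for a `run` command instead of invoking LAMMPS.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    ok: bool   # True if the (hypothetical) executor accepted the script
    log: str   # executor output, fed back to the generator on failure


def self_correct(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    execute: Callable[[str], RunResult],
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Return (script, rounds_used); script is None if every round failed."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        script = generate(task, feedback)  # generate, or revise using feedback
        result = execute(script)           # execute the candidate script
        if result.ok:                      # evaluate the outcome
            return script, round_no
        feedback = result.log              # keep the log for self-correction
    return None, max_rounds


# Stand-ins for illustration only: a real system would call a model and LAMMPS.
def toy_generate(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "units metal\n"             # first draft omits the run step
    return "units metal\nrun 100\n"        # revised draft after feedback


def toy_execute(script: str) -> RunResult:
    # Toy check (an assumption of this sketch): accept only scripts with a run line.
    has_run = any(line.startswith("run ") for line in script.splitlines())
    return RunResult(ok=has_run, log="" if has_run else "no run command found")


script, rounds = self_correct("equilibrate a copper cell", toy_generate, toy_execute)
print(rounds)
```

In this toy setting the first draft is rejected and the second is accepted, so the loop stops after the revision. A reward-driven variant, as in the closed-loop RL described above, would replace the boolean check with a score derived from simulation outcomes; how that score is defined is not specified in the abstract.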

