AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use
May 19, 2025
Authors: Yaotian Yang, Yiwen Tang, Yizhe Chen, Xiao Chen, Jiangjie Qiu, Hao Xiong, Haoyu Yin, Zhiyao Luo, Yifei Zhang, Sijia Tao, Wentao Li, Qinghua Zhang, Yuqiang Li, Wanli Ouyang, Bin Zhao, Xiaonan Wang, Fei Wei
cs.AI
Abstract
Machine learning-based interatomic potentials and force fields depend
critically on accurate atomic structures, yet such data are scarce due to the
limited availability of experimentally resolved crystals. Although
atomic-resolution electron microscopy offers a potential source of structural
data, converting these images into simulation-ready formats remains
labor-intensive and error-prone, creating a bottleneck for model training and
validation. We introduce AutoMat, an end-to-end, agent-assisted pipeline that
automatically transforms scanning transmission electron microscopy (STEM)
images into atomic crystal structures and predicts their physical properties.
AutoMat combines pattern-adaptive denoising, physics-guided template retrieval,
symmetry-aware atomic reconstruction, fast relaxation and property prediction
via MatterSim, and coordinated orchestration across all stages. We propose the
first dedicated STEM2Mat-Bench for this task and evaluate performance using
lattice RMSD, formation energy MAE, and structure-matching success rate. By
orchestrating external tool calls, AutoMat enables a text-only LLM to
outperform vision-language models in this domain, achieving closed-loop
reasoning throughout the pipeline. In large-scale experiments over 450
structure samples, AutoMat substantially outperforms existing multimodal large
language models and tools. These results validate both AutoMat and
STEM2Mat-Bench, marking a key step toward bridging microscopy and atomistic
simulation in materials science. The code and dataset are publicly available at
https://github.com/yyt-2378/AutoMat and
https://huggingface.co/datasets/yaotianvector/STEM2Mat.
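The two scalar metrics named above are straightforward to compute. The sketch below is a minimal, self-contained illustration of lattice RMSD (over 3×3 cell matrices) and formation-energy MAE; the function names are hypothetical, and the benchmark's exact alignment and normalization conventions (e.g. how cells are matched before comparison) may differ from this simplification.

```python
import numpy as np

def lattice_rmsd(pred_lattice, ref_lattice):
    """Root-mean-square deviation between two 3x3 lattice matrices (in Å).

    Simplified stand-in for the benchmark's lattice-RMSD metric:
    element-wise deviation with no cell re-alignment.
    """
    pred = np.asarray(pred_lattice, dtype=float)
    ref = np.asarray(ref_lattice, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def formation_energy_mae(pred_energies, ref_energies):
    """Mean absolute error of predicted formation energies (eV/atom)."""
    pred = np.asarray(pred_energies, dtype=float)
    ref = np.asarray(ref_energies, dtype=float)
    return float(np.mean(np.abs(pred - ref)))

# Toy example: a uniformly perturbed cubic cell vs. its reference.
ref_cell = np.eye(3) * 3.90
pred_cell = ref_cell + 0.02
print(lattice_rmsd(pred_cell, ref_cell))  # ≈ 0.02 for a uniform +0.02 shift
print(formation_energy_mae([-1.10, -0.52], [-1.05, -0.50]))  # ≈ 0.035
```

A structure counts as "matched" in success-rate terms only when such errors fall below the benchmark's thresholds, so these metrics drive both the per-sample scores and the aggregate comparison across models.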