AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use
May 19, 2025
Authors: Yaotian Yang, Yiwen Tang, Yizhe Chen, Xiao Chen, Jiangjie Qiu, Hao Xiong, Haoyu Yin, Zhiyao Luo, Yifei Zhang, Sijia Tao, Wentao Li, Qinghua Zhang, Yuqiang Li, Wanli Ouyang, Bin Zhao, Xiaonan Wang, Fei Wei
cs.AI
Abstract
Machine learning-based interatomic potentials and force fields depend
critically on accurate atomic structures, yet such data are scarce due to the
limited availability of experimentally resolved crystals. Although
atomic-resolution electron microscopy offers a potential source of structural
data, converting these images into simulation-ready formats remains
labor-intensive and error-prone, creating a bottleneck for model training and
validation. We introduce AutoMat, an end-to-end, agent-assisted pipeline that
automatically transforms scanning transmission electron microscopy (STEM)
images into atomic crystal structures and predicts their physical properties.
AutoMat combines pattern-adaptive denoising, physics-guided template retrieval,
symmetry-aware atomic reconstruction, fast relaxation and property prediction
via MatterSim, and coordinated orchestration across all stages. We propose the
first dedicated STEM2Mat-Bench for this task and evaluate performance using
lattice RMSD, formation energy MAE, and structure-matching success rate. By
orchestrating external tool calls, AutoMat enables a text-only LLM to
outperform vision-language models in this domain, achieving closed-loop
reasoning throughout the pipeline. In large-scale experiments over 450
structure samples, AutoMat substantially outperforms existing multimodal large
language models and tools. These results validate both AutoMat and
STEM2Mat-Bench, marking a key step toward bridging microscopy and atomistic
simulation in materials science. The code and dataset are publicly available at
https://github.com/yyt-2378/AutoMat and
https://huggingface.co/datasets/yaotianvector/STEM2Mat.
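The two scalar metrics named above are straightforward to compute. The sketch below is a minimal, self-contained illustration of lattice RMSD (over 3×3 cell matrices) and formation-energy MAE; the function names are hypothetical, and the benchmark's exact alignment and normalization conventions (e.g. how cells are matched before comparison) may differ from this simplification.

```python
import numpy as np

def lattice_rmsd(pred_lattice, ref_lattice):
    """Root-mean-square deviation between two 3x3 lattice matrices (in Å).

    Simplified stand-in for the benchmark's lattice-RMSD metric:
    element-wise deviation with no cell re-alignment.
    """
    pred = np.asarray(pred_lattice, dtype=float)
    ref = np.asarray(ref_lattice, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def formation_energy_mae(pred_energies, ref_energies):
    """Mean absolute error of predicted formation energies (eV/atom)."""
    pred = np.asarray(pred_energies, dtype=float)
    ref = np.asarray(ref_energies, dtype=float)
    return float(np.mean(np.abs(pred - ref)))

# Toy example: a uniformly perturbed cubic cell vs. its reference.
ref_cell = np.eye(3) * 3.90
pred_cell = ref_cell + 0.02
print(lattice_rmsd(pred_cell, ref_cell))  # ≈ 0.02 for a uniform +0.02 shift
print(formation_energy_mae([-1.10, -0.52], [-1.05, -0.50]))  # ≈ 0.035
```

A structure counts as "matched" in success-rate terms only when such errors fall below the benchmark's thresholds, so these metrics drive both the per-sample scores and the aggregate comparison across models.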