LLMs for Engineering: Teaching Models to Design High Powered Rockets

April 27, 2025
Author: Toby Simonds
cs.AI

Abstract

Large Language Models (LLMs) have transformed software engineering, but their application to physical engineering domains remains underexplored. This paper evaluates LLMs' capabilities in high-powered rocketry design through RocketBench, a benchmark connecting LLMs to high-fidelity rocket simulations. We test models on two increasingly complex design tasks: target altitude optimization and precision landing challenges. Our findings reveal that while state-of-the-art LLMs demonstrate strong baseline engineering knowledge, they struggle to iterate on their designs when given simulation results and ultimately plateau below human performance levels. However, when enhanced with reinforcement learning (RL), we show that a 7B parameter model outperforms both SoTA foundation models and human experts. This research demonstrates that RL-trained LLMs can serve as effective tools for complex engineering optimization, potentially transforming engineering domains beyond software development.
