반자율적 블라소프-맥스웰-란다우 평형 형식화

초록

우리는 하전 플라즈마의 운동을 설명하는 Vlasov-Maxwell-Landau(VML) 시스템의 평형 상태 특성화에 대한 완전한 Lean 4 형식화를 제시한다. 이 프로젝트는 AI 기반 수학 연구의 전체 순환 과정을 보여준다: AI 추론 모델(Gemini DeepThink)이 추측에서 증명을 생성하고, 에이전트 코딩 도구(Claude Code)가 자연어 프롬프트에서 Lean 코드로 변환하였으며, 특화된 증명기(Aristotle)가 111개의 보조정리를 해결하고, Lean 커널이 결과를 검증하였다. 단 한 명의 수학자가 10일 동안 과정을 감독했으며 비용은 200달러였고 코드는 한 줄도 작성하지 않았다. 전체 개발 과정은 공개되어 있다: 모든 229개의 인간 프롬프트와 213개의 git 커밋이 저장소에 보관된다. 우리는 AI 실패 모드(가설 범람, 정의 정렬 버그, 에이전트 회피 행동)와 효과적인 방법(추상/구체적 증명 분할, 적대적 자가 검토, 핵심 정의 및 정리 문장에 대한 인간 검토의 중요성)에 대한 상세한 교훈을 보고한다. 특히, 해당 수학 논문의 최종 초안이 완성되기 전에 형식화가 완료되었다는 점이 주목할 만하다.

English

We present a complete Lean 4 formalization of the equilibrium characterization in the Vlasov-Maxwell-Landau (VML) system, which describes the motion of charged plasma. The project demonstrates the full AI-assisted mathematical research loop: an AI reasoning model (Gemini DeepThink) generated the proof from a conjecture, an agentic coding tool (Claude Code) translated it into Lean from natural-language prompts, a specialized prover (Aristotle) closed 111 lemmas, and the Lean kernel verified the result. A single mathematician supervised the process over 10 days at a cost of \$200, writing zero lines of code. The entire development process is public: all 229 human prompts, and 213 git commits are archived in the repository. We report detailed lessons on AI failure modes -- hypothesis creep, definition-alignment bugs, agent avoidance behaviors -- and on what worked: the abstract/concrete proof split, adversarial self-review, and the critical role of human review of key definitions and theorem statements. Notably, the formalization was completed before the final draft of the corresponding math paper was finished.

반자율적 블라소프-맥스웰-란다우 평형 형식화

Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium

초록

Support