智能体AI漫游指南：从基础到系统

摘要

《施动者AI漫游指南》是一本面向从业者的构建自主AI系统综合参考手册。本书覆盖从基本原理到生产部署的全技术栈，其核心论点在于：构建卓越的施动者系统需要理解流程的每一层级，而非仅关注单一环节。开篇首先阐述大语言模型基座——涵盖Transformer架构、GPU系统、训练与微调（SFT、LoRA、MoE）、模型压缩及推理优化——将其作为必要基础而非重点对象。随后深入对齐与推理层级：包括基于人类反馈的强化学习（RLHF）、PPO、DPO及其变体、GRPO、奖励建模，以及面向大型推理模型的强化学习技术（涵盖思维链与测试时扩展策略）。后半部分聚焦施动者AI本体，涉及施动者训练与轨迹强化学习、检索增强生成（RAG与施动者RAG）、记忆系统（上下文记忆、外部记忆、情节记忆与语义记忆）、施动者框架设计与上下文管理，以及施动者设计模式分类体系。智能体间协调机制得到详尽探讨：模型上下文协议（MCP）、施动者技能与工具调用、智能体间通信协议（A2A），以及涵盖集中式、分布式与层级拓扑的多智能体架构。全书以施动者开发框架、施动者界面设计、施动者任务评估方法论及生产部署收尾。每章均将严谨理论基础与实现指南、代码示例及原始文献引用相结合。

English

The Hitchhiker's Guide to Agentic AI is a comprehensive practitioner's reference for building autonomous AI systems. The book covers the full stack from first principles to production deployment, organized around a central thesis: building great agentic systems requires understanding every layer of the pipeline, not just one. The book opens with the LLM substrate -- transformer architecture, GPU systems, training and fine-tuning (SFT,LoRA, MoE), model compression, and inference optimization -- treated as essential foundations rather than the primary focus. It then develops the alignment and reasoning layer: reinforcement learning from human feedback (RLHF), PPO, DPO and its variants, GRPO, reward modeling, and RL for large reasoning models including chain-of-thought and test-time scaling. The second half is devoted to agentic AI proper. Topics include agentic training and trajectory-based RL, retrieval-augmented generation (RAG and Agentic RAG), memory systems (in-context, external, episodic, and semantic), agent harness design and context management, and a taxonomy of agent design patterns. Inter-agent coordination is covered in depth: the Model Context Protocol (MCP), agent skills and tool use, the Agent-to-Agent (A2A) communication protocol, and multi-agent architectures spanning centralized, decentralized, and hierarchical topologies. The book concludes with agent development frameworks, agentic UI design, evaluation methodology for agentic tasks, and production deployment. Each chapter pairs rigorous theoretical foundations with implementation guidance, code examples, and references to the primary literature.