MiroThinker-1.7 and H1: Towards Heavy-Duty Research Agents via Verification

Abstract

We present MiroThinker-1.7, a new research agent designed for complex long-horizon reasoning tasks. Building on this foundation, we further introduce MiroThinker-H1, which extends the agent with heavy-duty reasoning capabilities for more reliable multi-step problem solving. In particular, MiroThinker-1.7 improves the reliability of each interaction step through an agentic mid-training stage that emphasizes structured planning, contextual reasoning, and tool interaction. This enables more effective multi-step interaction and sustained reasoning across complex tasks. MiroThinker-H1 further incorporates verification directly into the reasoning process at both local and global levels. Intermediate reasoning decisions can be evaluated and refined during inference, while the overall reasoning trajectory is audited to ensure that final answers are supported by coherent chains of evidence. Across benchmarks covering open-web research, scientific reasoning, and financial analysis, MiroThinker-H1 achieves state-of-the-art performance on deep research tasks while maintaining strong results on specialized domains. We also release MiroThinker-1.7 and MiroThinker-1.7-mini as open-source models, providing competitive research-agent capabilities with significantly improved efficiency.
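
The local and global verification described above can be illustrated with a toy sketch. This is not the MiroThinker-H1 implementation: the names `Step`, `local_verify`, `global_audit`, `run_with_verification`, `propose`, and `refine` are all hypothetical, and the checks (a step needs at least one piece of evidence; the final answer must appear in the last step) are placeholder assumptions standing in for whatever verifiers the actual system uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One reasoning step: a claim plus the evidence gathered for it (hypothetical)."""
    claim: str
    evidence: List[str] = field(default_factory=list)


def local_verify(step: Step) -> bool:
    # Toy local check (assumption): a step passes only if it cites some evidence.
    return len(step.evidence) > 0


def global_audit(trajectory: List[Step], answer: str) -> bool:
    # Toy global check (assumption): every step is supported, and the final
    # answer is stated in the last step of the trajectory.
    if not trajectory:
        return False
    return all(local_verify(s) for s in trajectory) and answer in trajectory[-1].claim


def run_with_verification(
    propose: Callable[[int, List[Step]], Step],
    refine: Callable[[Step], Step],
    n_steps: int,
    max_refinements: int = 2,
) -> List[Step]:
    """Build a trajectory, refining any step that fails the local check."""
    trajectory: List[Step] = []
    for i in range(n_steps):
        step = propose(i, trajectory)
        attempts = 0
        while not local_verify(step) and attempts < max_refinements:
            step = refine(step)
            attempts += 1
        trajectory.append(step)
    return trajectory


def propose(i: int, trajectory: List[Step]) -> Step:
    # Hypothetical proposer: the second step initially lacks evidence.
    evidence = [] if i == 1 else [f"source-{i}"]
    return Step(claim=f"claim-{i}", evidence=evidence)


def refine(step: Step) -> Step:
    # Hypothetical refinement: attach a retrieved source to the unsupported step.
    return Step(claim=step.claim, evidence=step.evidence + ["retrieved-source"])


trajectory = run_with_verification(propose, refine, n_steps=3)
audit_passed = global_audit(trajectory, answer="claim-2")
print(audit_passed)
```

In this sketch the local check runs inside the loop, so an unsupported step is repaired before later steps build on it, while the global audit runs once over the finished trajectory. Real verifiers would presumably be model-based rather than rule-based.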