MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

Abstract

We present MiroThinker-1.7, a new research agent designed for complex long-horizon reasoning tasks. Building on this foundation, we further introduce MiroThinker-H1, which extends the agent with heavy-duty reasoning capabilities for more reliable multi-step problem solving. In particular, MiroThinker-1.7 improves the reliability of each interaction step through an agentic mid-training stage that emphasizes structured planning, contextual reasoning, and tool interaction. This enables more effective multi-step interaction and sustained reasoning across complex tasks. MiroThinker-H1 further incorporates verification directly into the reasoning process at both local and global levels. Intermediate reasoning decisions can be evaluated and refined during inference, while the overall reasoning trajectory is audited to ensure that final answers are supported by coherent chains of evidence. Across benchmarks covering open-web research, scientific reasoning, and financial analysis, MiroThinker-H1 achieves state-of-the-art performance on deep research tasks while maintaining strong results on specialized domains. We also release MiroThinker-1.7 and MiroThinker-1.7-mini as open-source models, providing competitive research-agent capabilities with significantly improved efficiency.
