Open Deep Search: Democratizing Search with Open-source Reasoning Agents

March 26, 2025
Authors: Salaheddin Alzubi, Creston Brooks, Purva Chiniya, Edoardo Contente, Chiara von Gerlach, Lucas Irwin, Yihan Jiang, Arda Kaz, Windsor Nguyen, Sewoong Oh, Himanshu Tyagi, Pramod Viswanath
cs.AI

Abstract

We introduce Open Deep Search (ODS) to close the widening gap between proprietary search AI solutions, such as Perplexity's Sonar Reasoning Pro and OpenAI's GPT-4o Search Preview, and their open-source counterparts. The main innovation in ODS is to augment the reasoning capabilities of the latest open-source LLMs with reasoning agents that can judiciously use web search tools to answer queries. Concretely, ODS consists of two components that work with a base LLM chosen by the user: Open Search Tool and Open Reasoning Agent. Open Reasoning Agent interprets the given task and completes it by orchestrating a sequence of actions that includes calling tools, one of which is the Open Search Tool. Open Search Tool is a novel web search tool that outperforms proprietary counterparts. Together with powerful open-source reasoning LLMs, such as DeepSeek-R1, ODS nearly matches and sometimes surpasses the existing state-of-the-art baselines on two benchmarks: SimpleQA and FRAMES. For example, on the FRAMES evaluation benchmark, ODS improves on the best existing baseline, the recently released GPT-4o Search Preview, by 9.7% in accuracy. ODS is a general framework for seamlessly augmenting any LLM -- for example, DeepSeek-R1, which achieves 82.4% on SimpleQA and 30.1% on FRAMES -- with search and reasoning capabilities to achieve state-of-the-art performance: 88.3% on SimpleQA and 75.3% on FRAMES.
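
The division of labor described in the abstract (a user-chosen base LLM, a reasoning agent that orchestrates actions, and a search tool that the agent may call) can be illustrated with a minimal sketch. This is not the ODS implementation: the class name `ReasoningAgent`, the `CALL <tool>: <argument>` / `FINAL: <answer>` text protocol, and the stub LLM and stub search function are all hypothetical, chosen only to make the control flow concrete and runnable without network access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type aliases for illustration; not taken from the ODS codebase.
LLM = Callable[[str], str]   # prompt text -> model reply
Tool = Callable[[str], str]  # tool argument -> observation text


@dataclass
class ReasoningAgent:
    """Toy agent loop: ask the LLM what to do, run a tool, feed back the result."""
    llm: LLM
    tools: Dict[str, Tool]
    max_steps: int = 5

    def run(self, task: str) -> str:
        transcript: List[str] = [f"Task: {task}"]
        for _ in range(self.max_steps):
            reply = self.llm("\n".join(transcript)).strip()
            transcript.append(reply)
            # Assumed protocol: "FINAL: <answer>" ends the episode.
            if reply.startswith("FINAL:"):
                return reply[len("FINAL:"):].strip()
            # Assumed protocol: "CALL <tool>: <argument>" requests a tool call.
            if reply.startswith("CALL "):
                name, _, arg = reply[len("CALL "):].partition(":")
                tool = self.tools.get(name.strip())
                if tool is None:
                    observation = f"unknown tool {name.strip()!r}"
                else:
                    observation = tool(arg.strip())
                transcript.append(f"Observation: {observation}")
        return "no answer within step budget"


def stub_search(query: str) -> str:
    # Stand-in for a web search tool; returns a canned snippet.
    return "Paris is the capital of France."


def stub_llm(prompt: str) -> str:
    # Stand-in for the base LLM: search first, then answer from the observation.
    if "Observation:" in prompt:
        return "FINAL: Paris"
    return "CALL search: capital of France"


agent = ReasoningAgent(llm=stub_llm, tools={"search": stub_search})
answer = agent.run("What is the capital of France?")
```

Because the LLM and the tools are passed in as plain callables, the base model can be swapped without touching the loop, which mirrors the abstract's claim that ODS works with a base LLM chosen by the user.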
