TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering
April 28, 2025
Authors: Zhonghao Li, Kunpeng Zhang, Jinghuai Ou, Shuliang Liu, Xuming Hu
cs.AI
Abstract
Retrieval-augmented generation (RAG) systems face significant challenges in
multi-hop question answering (MHQA), where complex queries require synthesizing
information across multiple document chunks. Existing approaches typically rely
on iterative LLM-based query rewriting and routing, resulting in high
computational costs due to repeated LLM invocations and multi-stage processes.
To address these limitations, we propose TreeHop, an embedding-level framework
that requires no LLM for query refinement. TreeHop dynamically updates
query embeddings by fusing semantic information from prior queries and
retrieved documents, enabling iterative retrieval through embedding-space
operations alone. This method replaces the traditional
"Retrieve-Rewrite-Vectorize-Retrieve" cycle with a streamlined
"Retrieve-Embed-Retrieve" loop, significantly reducing computational overhead.
Moreover, a rule-based stop criterion is introduced to further prune redundant
retrievals, balancing efficiency and recall rate. Experimental results show
that TreeHop rivals advanced RAG methods across three open-domain MHQA
datasets, achieving comparable performance with only 5%-0.4% of the model
parameter size and reducing the query latency by approximately 99% compared to
concurrent approaches. This makes TreeHop a faster and more cost-effective
solution for deployment in a range of knowledge-intensive applications. For
reproducibility, code and data are available at
https://github.com/allen-li1231/TreeHop.
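
To make the "Retrieve-Embed-Retrieve" loop concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. All function names are hypothetical. The `update_query` step is a simple averaging placeholder standing in for TreeHop's learned embedding update, which the abstract does not specify. The stop rule shown (drop a branch whose document was already retrieved) is likewise only an assumed example of a rule-based criterion.

```python
import numpy as np


def normalize(x):
    """Scale vectors to unit length so dot products act as cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def retrieve(query, doc_embs, top_k):
    """Return indices of the top_k documents by dot-product similarity."""
    scores = doc_embs @ query
    return np.argsort(-scores)[:top_k].tolist()


def update_query(query, doc_emb):
    """Placeholder fusion of the prior query and a retrieved document.

    Assumption: a plain average. The actual TreeHop update is a trained
    model and is NOT reproduced here.
    """
    return normalize(0.5 * query + 0.5 * doc_emb)


def multi_hop_retrieve(query, doc_embs, hops=2, top_k=2):
    """Iterate retrieval purely in embedding space, with no LLM call per hop."""
    frontier = [normalize(query)]  # queries to expand at the current hop
    seen = []                      # retrieved document indices, in order
    for _ in range(hops):
        next_frontier = []
        for q in frontier:
            for idx in retrieve(q, doc_embs, top_k):
                if idx in seen:
                    continue  # assumed stop rule: prune redundant retrievals
                seen.append(idx)
                # Each newly retrieved document spawns one next-hop query.
                next_frontier.append(update_query(q, doc_embs[idx]))
        if not next_frontier:
            break  # every branch was pruned, so stop early
        frontier = next_frontier
    return seen


rng = np.random.default_rng(0)
docs = normalize(rng.normal(size=(20, 8)))  # toy corpus of 20 random embeddings
question = rng.normal(size=8)
print(multi_hop_retrieve(question, docs, hops=2, top_k=2))
```

Because each retrieved document produces its own next-hop query, the search branches like a tree, and the pruning rule keeps the number of branches from growing with every hop.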