TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering

Abstract

Retrieval-augmented generation (RAG) systems face significant challenges in multi-hop question answering (MHQA), where complex queries require synthesizing information across multiple document chunks. Existing approaches typically rely on iterative LLM-based query rewriting and routing, which incurs high computational costs from repeated LLM invocations and multi-stage processing. To address these limitations, we propose TreeHop, an embedding-level framework that requires no LLM for query refinement. TreeHop dynamically updates query embeddings by fusing semantic information from prior queries and retrieved documents, enabling iterative retrieval through embedding-space operations alone. This method replaces the traditional "Retrieve-Rewrite-Vectorize-Retrieve" cycle with a streamlined "Retrieve-Embed-Retrieve" loop, significantly reducing computational overhead. Moreover, a rule-based stop criterion is introduced to further prune redundant retrievals, balancing efficiency and recall. Experimental results show that TreeHop rivals advanced RAG methods across three open-domain MHQA datasets, achieving comparable performance with only 5%-0.4% of the model parameter size and reducing query latency by approximately 99% compared to concurrent approaches. This makes TreeHop a faster and more cost-effective solution for deployment in a range of knowledge-intensive applications. For reproducibility, code and data are available at: https://github.com/allen-li1231/TreeHop.
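The "Retrieve-Embed-Retrieve" loop described above can be illustrated with a minimal sketch. This is not the TreeHop implementation: the `fuse` function below is a hypothetical stand-in for the learned embedding update (it simply removes from the query the component the retrieved document already covers), and the stop rule (a branch ends when it retrieves an already-seen document or its updated query is empty) is an assumption made for illustration. All function names and parameters are invented for this example.

```python
import math

EPS = 1e-9  # threshold below which a vector is treated as empty


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Return v scaled to unit length, or None if v is (near) zero."""
    norm = math.sqrt(dot(v, v))
    if norm < EPS:
        return None
    return [x / norm for x in v]


def retrieve(query, corpus, top_k):
    """Indices of the top_k corpus embeddings by dot-product similarity."""
    ranked = sorted(range(len(corpus)),
                    key=lambda i: dot(query, corpus[i]),
                    reverse=True)
    return ranked[:top_k]


def fuse(query, doc):
    """Hypothetical update: drop the part of the query that doc covers.

    Stand-in for a learned model; assumes doc is a unit vector.
    """
    overlap = dot(query, doc)
    return normalize([q - overlap * d for q, d in zip(query, doc)])


def multi_hop_retrieve(query, corpus, hops=2, top_k=2):
    """Iterative retrieval using embedding-space operations only."""
    seen, order = set(), []
    start = normalize(query)
    frontier = [start] if start is not None else []
    for _ in range(hops):
        next_frontier = []
        for q in frontier:
            for idx in retrieve(q, corpus, top_k):
                if idx in seen:
                    continue  # assumed rule: a repeated document prunes the branch
                seen.add(idx)
                order.append(idx)
                q_next = fuse(q, corpus[idx])
                if q_next is not None:  # assumed rule: empty query prunes the branch
                    next_frontier.append(q_next)
        if not next_frontier:
            break  # nothing left to expand, stop early
        frontier = next_frontier
    return order


corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(multi_hop_retrieve([0.8, 0.6, 0.0], corpus, hops=2, top_k=1))  # [0, 1]
```

In this toy example the first hop retrieves document 0, the updated query then points toward document 1, and no query text is rewritten or re-encoded between hops. A real system would replace `fuse` with a trained model and `retrieve` with a vector index.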
