Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency
May 13, 2025
Authors: Adel Ammar, Anis Koubaa, Omer Nacar, Wadii Boulila
cs.AI
Abstract
Large language models achieve high task performance yet often hallucinate or
rely on outdated knowledge. Retrieval-augmented generation (RAG) addresses
these gaps by coupling generation with external search. We analyse how
hyperparameters influence speed and quality in RAG systems, covering Chroma and
Faiss vector stores, chunking policies, cross-encoder re-ranking, and
temperature, and we evaluate six metrics: faithfulness, answer correctness,
answer relevancy, context precision, context recall, and answer similarity.
Chroma processes queries 13% faster, whereas Faiss yields higher retrieval
precision, revealing a clear speed-accuracy trade-off. Naive fixed-length
chunking with small windows and minimal overlap outperforms semantic
segmentation while remaining the quickest option. Re-ranking provides modest
gains in retrieval quality yet increases runtime by roughly a factor of 5, so
its usefulness depends on latency constraints. These results help practitioners
balance computational cost and accuracy when tuning RAG systems for
transparent, up-to-date responses. Finally, we re-evaluate the top
configurations with a corrective RAG workflow and show that their advantages
persist when the model can iteratively request additional evidence. We obtain a
near-perfect context precision (99%), which demonstrates that RAG systems can
achieve extremely high retrieval accuracy with the right combination of
hyperparameters, with significant implications for applications where retrieval
quality directly impacts downstream task performance, such as clinical decision
support in healthcare.
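The fixed-length chunking that the abstract describes (small windows, minimal overlap) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fixed_length_chunks`, the use of whitespace-separated words as units, and the window and overlap values are all hypothetical choices made for the example.

```python
def fixed_length_chunks(tokens, window, overlap=0):
    """Split a token list into windows of `window` tokens.

    Consecutive windows share `overlap` tokens. Units (words here) and
    parameter values are assumptions for illustration only.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    step = window - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        # Stop once a window has reached the end of the input,
        # so no trailing chunk is made up purely of overlap.
        if start + window >= len(tokens):
            break
    return chunks


# Hypothetical input: a short sentence split on whitespace.
words = "retrieval augmented generation couples a generator with external search".split()
chunks = fixed_length_chunks(words, window=4, overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

A smaller `overlap` means fewer chunks to embed and index, which fits the abstract's observation that this strategy remains the quickest option. The final chunk may be shorter than `window`.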