Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency

Abstract

Large language models achieve high task performance yet often hallucinate or rely on outdated knowledge. Retrieval-augmented generation (RAG) addresses these gaps by coupling generation with external search. We analyse how hyperparameters influence speed and quality in RAG systems, covering Chroma and Faiss vector stores, chunking policies, cross-encoder re-ranking, and temperature, and we evaluate six metrics: faithfulness, answer correctness, answer relevancy, context precision, context recall, and answer similarity. Chroma processes queries 13% faster, whereas Faiss yields higher retrieval precision, revealing a clear speed-accuracy trade-off. Naive fixed-length chunking with small windows and minimal overlap outperforms semantic segmentation while remaining the quickest option. Re-ranking provides modest gains in retrieval quality yet increases runtime by roughly a factor of 5, so its usefulness depends on latency constraints. These results help practitioners balance computational cost and accuracy when tuning RAG systems for transparent, up-to-date responses. Finally, we re-evaluate the top configurations with a corrective RAG workflow and show that their advantages persist when the model can iteratively request additional evidence. We obtain near-perfect context precision (99%), which demonstrates that RAG systems can achieve extremely high retrieval accuracy with the right combination of hyperparameters. This has significant implications for applications where retrieval quality directly impacts downstream task performance, such as clinical decision support in healthcare.
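
To make the chunking finding concrete, the following is a minimal sketch of fixed-length chunking with a small overlap. It is an illustration under stated assumptions, not the configuration evaluated in the paper: the function name `chunk_fixed`, the character-based windows, and the default `window` and `overlap` values are all hypothetical, and the study may well measure chunk size in tokens rather than characters.

```python
from typing import List


def chunk_fixed(text: str, window: int = 200, overlap: int = 20) -> List[str]:
    """Split text into fixed-length character windows that share a small overlap.

    The defaults are illustrative assumptions, not values from the paper.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    step = window - overlap  # distance between consecutive chunk starts
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + window])
        if start + window >= len(text):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_fixed("abcdefghij", window=4, overlap=1):
        print(piece)
```

A smaller `window` yields more, shorter chunks to embed and retrieve, while `overlap` controls how much text neighbouring chunks share; both would be tuned alongside the vector store and re-ranking choices discussed above.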

