Enhanced Arabic Text Retrieval with Attentive Relevance Scoring

July 31, 2025
Authors: Salah Eddine Bekhouche, Azeddine Benlamoudi, Yazid Bounab, Fadi Dornaika, Abdenour Hadid
cs.AI

Abstract

Arabic poses a particular challenge for natural language processing (NLP) and information retrieval (IR) because of its complex morphology, optional diacritics, and the coexistence of Modern Standard Arabic (MSA) with various dialects. Despite its growing global significance, Arabic remains underrepresented in NLP research and benchmark resources. In this paper, we present an enhanced Dense Passage Retrieval (DPR) framework developed specifically for Arabic. At the core of our approach is a novel Attentive Relevance Scoring (ARS) mechanism, which replaces standard interaction mechanisms with an adaptive scoring function that more effectively models the semantic relevance between questions and passages. Our method integrates pre-trained Arabic language models and architectural refinements to improve retrieval performance and significantly increase ranking accuracy when answering Arabic questions. The code is publicly available on GitHub at https://github.com/Bekhouche/APR.
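The abstract contrasts a standard interaction mechanism with an adaptive scoring function but does not give the ARS formula. The sketch below is only an illustration of that contrast: `dot_score` is the inner-product relevance commonly used in DPR-style retrievers, while `attentive_score` is a hypothetical learned scorer of the form v · tanh(Wq q + Wp p), chosen here for illustration and not taken from the paper. All names, dimensions, and the random weights are assumptions.

```python
import math
import random


def dot_score(q, p):
    """Baseline DPR-style relevance: inner product of question and passage vectors."""
    return sum(qi * pi for qi, pi in zip(q, p))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]


def attentive_score(q, p, Wq, Wp, v):
    """Hypothetical learned scorer: v . tanh(Wq q + Wp p).

    This is an illustrative stand-in for an adaptive scoring function,
    NOT the ARS formula from the paper.
    """
    hq = matvec(Wq, q)
    hp = matvec(Wp, p)
    hidden = [math.tanh(a + b) for a, b in zip(hq, hp)]
    return sum(vi * hi for vi, hi in zip(v, hidden))


def rank(q, passages, score_fn):
    """Return passage indices sorted from most to least relevant."""
    scores = [score_fn(q, p) for p in passages]
    return sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional embeddings (assumed values, standing in for encoder outputs).
question = [1.0, 0.0]
passages = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

# Baseline ranking with the inner product.
baseline_ranking = rank(question, passages, dot_score)  # [1, 2, 0]

# Random weights stand in for parameters that would normally be trained.
rng = random.Random(0)
hidden_dim = 4
Wq = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden_dim)]
Wp = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden_dim)]
v = [rng.uniform(-1, 1) for _ in range(hidden_dim)]

learned_ranking = rank(
    question, passages, lambda q, p: attentive_score(q, p, Wq, Wp, v)
)
```

With untrained weights the second ranking is arbitrary; the point is only that the scorer has parameters that training can adapt, whereas the inner product has none.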
August 1, 2025