Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings
July 30, 2024
Authors: Gili Goldin, Shuly Wintner
cs.AI
Abstract
We present Knesset-DictaBERT, a large Hebrew language model fine-tuned on the
Knesset Corpus, which comprises Israeli parliamentary proceedings. The model is
based on the DictaBERT architecture and demonstrates significant improvements
in understanding parliamentary language, as measured on the masked language
modeling (MLM) task. We provide a detailed evaluation of the model's
performance, showing improvements in perplexity and accuracy over the baseline
DictaBERT model.
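As a rough illustration of the two metrics named above, the sketch below computes top-1 accuracy and perplexity over a set of masked positions. It is not the authors' evaluation code: the function name `mlm_metrics`, the input layout (one probability dictionary and one gold token per masked position), and the toy tokens are all assumptions made for this example, and the abstract does not specify the exact evaluation protocol.

```python
import math


def mlm_metrics(predictions):
    """Compute top-1 accuracy and perplexity for masked-token predictions.

    `predictions` is a list of (probs, gold) pairs, one per masked position,
    where `probs` maps candidate tokens to model probabilities and `gold`
    is the original (unmasked) token. This layout is a hypothetical one
    chosen for illustration.
    """
    if not predictions:
        raise ValueError("need at least one masked position")

    total_nll = 0.0  # summed negative log-likelihood of the gold tokens
    correct = 0      # masked positions where the top prediction is the gold token

    for probs, gold in predictions:
        # Floor the probability so a missing gold token does not yield log(0).
        p_gold = max(probs.get(gold, 0.0), 1e-12)
        total_nll += -math.log(p_gold)
        top_token = max(probs, key=probs.get)
        correct += int(top_token == gold)

    n = len(predictions)
    return {
        "perplexity": math.exp(total_nll / n),  # exp of the mean NLL
        "accuracy": correct / n,
    }


# Toy input with made-up tokens and probabilities.
toy = [
    ({"a": 0.5, "b": 0.3, "c": 0.2}, "a"),  # top prediction matches gold
    ({"a": 0.6, "b": 0.4}, "b"),            # top prediction misses gold
]
metrics = mlm_metrics(toy)
```

Under this definition, lower perplexity and higher accuracy both indicate that the model assigns more probability mass to the original tokens at masked positions.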