Smart Word Suggestions for Writing Assistance
May 17, 2023
Authors: Chenshuo Wang, Shaoguang Mao, Tao Ge, Wenshan Wu, Xun Wang, Yan Xia, Jonathan Tien, Dongyan Zhao
cs.AI
Abstract
Enhancing word usage is a desired feature for writing assistance. To further
advance research in this area, this paper introduces the "Smart Word Suggestions"
(SWS) task and benchmark. Unlike prior work, SWS emphasizes end-to-end
evaluation and presents a more realistic writing assistance scenario. This task
involves identifying words or phrases that require improvement and providing
substitution suggestions. The benchmark includes human-labeled data for
testing, a large distantly supervised dataset for training, and an evaluation
framework.
for evaluation. The test data includes 1,000 sentences written by English
learners, accompanied by over 16,000 substitution suggestions annotated by 10
native speakers. The training dataset comprises over 3.7 million sentences and
12.7 million suggestions generated through rules. Our experiments with seven
baselines demonstrate that SWS is a challenging task. Based on experimental
analysis, we suggest potential directions for future research on SWS. The
dataset and related code are available at
https://github.com/microsoft/SmartWordSuggestions.
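
To make the two-step shape of the task concrete (find the words or phrases to improve, then propose a substitute for each), here is a minimal Python sketch. It is an illustration only, not the method or data format used in the paper or its repository: the `Suggestion` record, the `TOY_LEXICON` table, and the `suggest` and `apply_suggestions` helpers are all hypothetical names, and a dictionary lookup stands in for whatever model a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """Hypothetical record: replace tokens[start:end] with `substitute`."""
    start: int
    end: int
    substitute: str


# Hypothetical lookup table (an assumption for illustration);
# a real system would use a trained model instead.
TOY_LEXICON = {
    "very good": "excellent",
    "big": "substantial",
}


def suggest(tokens, lexicon=TOY_LEXICON, max_len=2):
    """Scan left to right, preferring the longest phrase found in the lexicon."""
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in lexicon:
                out.append(Suggestion(i, i + n, lexicon[phrase]))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # no phrase starting here needs improvement
    return out


def apply_suggestions(tokens, suggestions):
    """Apply spans right to left so earlier indices stay valid."""
    result = list(tokens)
    for s in sorted(suggestions, key=lambda s: s.start, reverse=True):
        result[s.start:s.end] = [s.substitute]
    return " ".join(result)


tokens = "The movie was very good and had a big impact".split()
found = suggest(tokens)
print(found)
print(apply_suggestions(tokens, found))
```

The sketch covers both halves of the task in one pass, which is the end-to-end setting the abstract emphasizes: a system is judged on which spans it flags as well as on what it proposes for them.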