Proofread: Fixes All Errors with One Tap
June 6, 2024
Authors: Renjie Liu, Yanxiang Zhang, Yun Zhu, Haicheng Sun, Yuanbo Zhang, Michael Xuelin Huang, Shanqing Cai, Lei Meng, Shumin Zhai
cs.AI
Abstract
The impressive capabilities of Large Language Models (LLMs) provide a powerful approach to reimagining users' typing experience. This paper demonstrates Proofread, a novel Gboard feature powered by a server-side LLM, enabling seamless sentence-level and paragraph-level corrections with a single tap. We describe the complete system, from data generation and metrics design to model tuning and deployment. To obtain models of sufficient quality, we implement a careful data synthesis pipeline tailored to online use cases, design multifaceted metrics, and employ a two-stage tuning approach to acquire the LLM dedicated to the feature: Supervised Fine-Tuning (SFT) for foundational quality, followed by Reinforcement Learning (RL) tuning for targeted refinement. Specifically, we find that sequential tuning on the Rewrite and Proofread tasks yields the best quality in the SFT stage, and we propose global and direct rewards in the RL tuning stage to seek further improvement. Extensive experiments on a human-labeled golden set showed that our tuned PaLM2-XS model achieved an 85.56% good ratio. We launched the feature to Pixel 8 devices by serving the model on TPU v5 in Google Cloud, and it has thousands of daily active users. Serving latency was significantly reduced by quantization, bucket inference, text segmentation, and speculative decoding. Our demo can be viewed on YouTube: https://youtu.be/4ZdcuiwFU7I.
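The serving optimizations are only named in the abstract, not detailed. As an illustration of the last of them, the sketch below shows a minimal greedy speculative-decoding loop: a cheap draft model proposes a few tokens, and the large target model verifies them in a single batched pass, accepting the longest agreeing prefix. The `draft_step` and `target_steps` interfaces are hypothetical stand-ins for this sketch, not the paper's actual serving API.

```python
from typing import Callable, List


def speculative_decode(
    prefix: List[int],
    draft_step: Callable[[List[int]], int],            # small draft model: greedy next token
    target_steps: Callable[[List[int], int], List[int]],  # large target model (see docstring)
    k: int = 4,
    max_new_tokens: int = 32,
    eos_id: int = 0,
) -> List[int]:
    """Greedy speculative decoding loop (illustrative sketch only).

    `target_steps(seq, k)` must return the target model's greedy next-token
    prediction at each of the last k + 1 prefixes of `seq`, i.e. what the
    target would emit after seq[:-k], seq[:-k+1], ..., seq. A real serving
    stack computes these in one batched forward pass, which is where the
    latency win comes from.
    """
    out = list(prefix)
    while len(out) - len(prefix) < max_new_tokens:
        # 1) Draft model proposes k tokens autoregressively (cheap).
        drafts: List[int] = []
        for _ in range(k):
            drafts.append(draft_step(out + drafts))

        # 2) Target model scores all k draft positions (plus one extra) at once.
        targets = target_steps(out + drafts, k)  # length k + 1

        # 3) Accept draft tokens while they match the target's own greedy choice.
        n_accept = 0
        while n_accept < k and drafts[n_accept] == targets[n_accept]:
            n_accept += 1
        out += drafts[:n_accept]

        # 4) Append the target's correction (or its bonus token if all drafts matched).
        out.append(targets[n_accept])
        if out[-1] == eos_id:
            break
    return out
```

The key property is that the expensive target model runs once per k drafted tokens rather than once per token, while the accept-and-correct step keeps the output identical to what greedy decoding with the target model alone would produce.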