The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community
August 15, 2024
Authors: Shachar Don-Yehiya, Leshem Choshen, Omri Abend
cs.AI
Abstract
Human-model conversations provide a window into users' real-world scenarios,
behavior, and needs, and thus are a valuable resource for model development and
research. While for-profit companies collect user data through the APIs of
their models, using it internally to improve their own models, the open source
and research community lags behind.
We introduce the ShareLM collection, a unified set of human conversations
with large language models, and its accompanying plugin, a Web extension for
voluntarily contributing user-model conversations. Because few platforms let
users share their chats, the ShareLM plugin adds this functionality, allowing
users to share conversations from most platforms. The plugin lets users rate
their conversations at both the conversation and the response level, and
delete conversations they prefer to keep private before they ever leave the
user's local storage. We release the plugin conversations as part of the
ShareLM collection, and call for more community effort in the field of open
human-model data.
The code, plugin, and data are available.
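
The workflow described in the abstract (rate at two levels, delete locally, then share) can be illustrated with a minimal Python sketch. This is not the plugin's actual code: the names `Turn`, `Conversation`, `LocalStore`, and `flush`, as well as the integer rating scale, are hypothetical and chosen only for illustration. The one property taken from the abstract is that a deleted conversation is dropped before anything leaves local storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Turn:
    role: str                      # "user" or "model" (assumed labels)
    text: str
    rating: Optional[int] = None   # response-level rating (hypothetical scale)


@dataclass
class Conversation:
    conv_id: str
    turns: List[Turn] = field(default_factory=list)
    rating: Optional[int] = None   # conversation-level rating


class LocalStore:
    """Hypothetical stand-in for the plugin's local storage."""

    def __init__(self) -> None:
        self._pending: Dict[str, Conversation] = {}

    def add(self, conv: Conversation) -> None:
        self._pending[conv.conv_id] = conv

    def rate_conversation(self, conv_id: str, rating: int) -> None:
        self._pending[conv_id].rating = rating

    def rate_response(self, conv_id: str, turn_index: int, rating: int) -> None:
        turn = self._pending[conv_id].turns[turn_index]
        if turn.role != "model":
            raise ValueError("only model responses can be rated")
        turn.rating = rating

    def delete(self, conv_id: str) -> None:
        # A deleted conversation never reaches flush(), so it is never shared.
        self._pending.pop(conv_id, None)

    def flush(self) -> List[Conversation]:
        """Return everything still pending and clear the store."""
        shared = list(self._pending.values())
        self._pending.clear()
        return shared
```

In this sketch, calling `delete("b")` before `flush()` means conversation `b` is absent from the returned list, which mirrors the privacy guarantee the abstract describes.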