LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs
July 19, 2023
Authors: Tongshuang Wu, Haiyi Zhu, Maya Albayrak, Alexis Axon, Amanda Bertsch, Wenxing Deng, Ziqi Ding, Bill Guo, Sireesh Gururaja, Tzu-Sheng Kuo, Jenny T. Liang, Ryan Liu, Ihita Mandal, Jeremiah Milbauer, Xiaolin Ni, Namrata Padmanabhan, Subhashini Ramkumar, Alexis Sudjianto, Jordan Taylor, Ying-Jui Tseng, Patricia Vaidos, Zhijin Wu, Wei Wu, Chenyang Yang
cs.AI
Abstract
LLMs have shown promise in replicating human-like behavior in crowdsourcing
tasks that were previously thought to be exclusive to human abilities. However,
current efforts focus mainly on simple atomic tasks. We explore whether LLMs
can replicate more complex crowdsourcing pipelines. We find that modern LLMs
can simulate some of crowdworkers' abilities in these "human computation
algorithms," but the level of success is variable and influenced by requesters'
understanding of LLM capabilities, the specific skills required for sub-tasks,
and the optimal interaction modality for performing these sub-tasks. We reflect
on human and LLMs' different sensitivities to instructions, stress the
importance of enabling human-facing safeguards for LLMs, and discuss the
potential of training humans and LLMs with complementary skill sets. Crucially,
we show that replicating crowdsourcing pipelines offers a valuable platform to
investigate (1) the relative strengths of LLMs on different tasks (by
cross-comparing their performances on sub-tasks) and (2) LLMs' potential in
complex tasks, where they can complete part of the tasks while leaving others
to humans.
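To make the pipeline framing concrete, here is a minimal Python sketch (not taken from the paper) of the idea of slotting an LLM into a crowdsourcing-style workflow: each sub-task is a prompt that could just as well be shown to a crowdworker or sent to a model. The function names (`call_llm`, `shorten`, `verify`, `pipeline`) are hypothetical illustrations, and `call_llm` is a stub to be replaced with a real LLM client.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client of your choice."""
    raise NotImplementedError("plug in your LLM provider here")


def shorten(text: str) -> str:
    # Sub-task 1: a worker (human or LLM) proposes a shorter version of the text.
    return call_llm(f"Shorten this text while keeping its meaning:\n{text}")


def verify(original: str, shortened: str) -> bool:
    # Sub-task 2: a second worker checks the first worker's output.
    answer = call_llm(
        "Does the shortened text preserve the meaning of the original? "
        f"Answer yes or no.\nOriginal: {original}\nShortened: {shortened}"
    )
    return answer.strip().lower().startswith("yes")


def pipeline(text: str) -> str:
    # Chain the sub-tasks, mirroring a classic multi-stage crowdsourcing flow
    # (propose, then verify); each stage could be assigned to a human or an LLM.
    candidate = shorten(text)
    return candidate if verify(text, candidate) else text
```

In this framing, studying which stage (proposing vs. verifying) the LLM handles well, and which is better left to humans, is exactly the kind of sub-task-level comparison the abstract describes.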