LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs
July 19, 2023
Authors: Tongshuang Wu, Haiyi Zhu, Maya Albayrak, Alexis Axon, Amanda Bertsch, Wenxing Deng, Ziqi Ding, Bill Guo, Sireesh Gururaja, Tzu-Sheng Kuo, Jenny T. Liang, Ryan Liu, Ihita Mandal, Jeremiah Milbauer, Xiaolin Ni, Namrata Padmanabhan, Subhashini Ramkumar, Alexis Sudjianto, Jordan Taylor, Ying-Jui Tseng, Patricia Vaidos, Zhijin Wu, Wei Wu, Chenyang Yang
cs.AI
Abstract
LLMs have shown promise in replicating human-like behavior in crowdsourcing
tasks that were previously thought to be exclusive to human abilities. However,
current efforts focus mainly on simple atomic tasks. We explore whether LLMs
can replicate more complex crowdsourcing pipelines. We find that modern LLMs
can simulate some of crowdworkers' abilities in these "human computation
algorithms," but the level of success is variable and influenced by requesters'
understanding of LLM capabilities, the specific skills required for sub-tasks,
and the optimal interaction modality for performing these sub-tasks. We reflect
on human and LLMs' different sensitivities to instructions, stress the
importance of enabling human-facing safeguards for LLMs, and discuss the
potential of training humans and LLMs with complementary skill sets. Crucially,
we show that replicating crowdsourcing pipelines offers a valuable platform to
investigate (1) the relative strengths of LLMs on different tasks (by
cross-comparing their performances on sub-tasks) and (2) LLMs' potential in
complex tasks, where they can complete part of the tasks while leaving others
to humans.
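To make the pipeline framing concrete, here is a minimal Python sketch (not taken from the paper) of the idea of slotting an LLM into a crowdsourcing-style workflow: each sub-task is a prompt that could just as well be shown to a crowdworker or sent to a model. The function names (`call_llm`, `shorten`, `verify`, `pipeline`) are hypothetical illustrations, and `call_llm` is a stub to be replaced with a real LLM client.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client of your choice."""
    raise NotImplementedError("plug in your LLM provider here")


def shorten(text: str) -> str:
    # Sub-task 1: a worker (human or LLM) proposes a shorter version of the text.
    return call_llm(f"Shorten this text while keeping its meaning:\n{text}")


def verify(original: str, shortened: str) -> bool:
    # Sub-task 2: a second worker checks the first worker's output.
    answer = call_llm(
        "Does the shortened text preserve the meaning of the original? "
        f"Answer yes or no.\nOriginal: {original}\nShortened: {shortened}"
    )
    return answer.strip().lower().startswith("yes")


def pipeline(text: str) -> str:
    # Chain the sub-tasks, mirroring a classic multi-stage crowdsourcing flow
    # (propose, then verify); each stage could be assigned to a human or an LLM.
    candidate = shorten(text)
    return candidate if verify(text, candidate) else text
```

In this framing, studying which stage (proposing vs. verifying) the LLM handles well, and which is better left to humans, is exactly the kind of sub-task-level comparison the abstract describes.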