Survey of Active Learning Hyperparameters: Insights from a Large-Scale Experimental Grid

Abstract

Annotating data is time-consuming and costly, yet it is an inherent requirement of supervised machine learning. Active Learning (AL) is an established method that reduces human labeling effort by iteratively selecting the most informative unlabeled samples for expert annotation, thereby improving overall classification performance. Although AL has been known for decades, it is still rarely used in real-world applications. Two community web surveys among the NLP community indicate that two main reasons continue to hold practitioners back from using AL: first, the complexity of setting it up, and second, a lack of trust in its effectiveness. We hypothesize that both reasons share the same culprit: the large hyperparameter space of AL. This mostly unexplored hyperparameter space often leads to misleading and irreproducible AL experiment results. In this study, we first compiled a large hyperparameter grid of over 4.6 million hyperparameter combinations, second, recorded the performance of all combinations in the largest AL study conducted so far, and third, analyzed the impact of each hyperparameter on the experiment results. Finally, we give recommendations regarding the influence of each hyperparameter, demonstrate the surprising influence of the concrete AL strategy implementation, and outline an experimental study design for reproducible AL experiments with minimal computational effort, thus contributing to more reproducible and trustworthy AL research in the future.
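As a rough illustration of why an AL hyperparameter grid grows so quickly, the following minimal sketch enumerates a small Cartesian grid. The axis names and values (`dataset`, `query_strategy`, `learner`, `batch_size`, `start_seed`) are hypothetical placeholders chosen for illustration; they are not the grid used in the study.

```python
from itertools import product
from math import prod

# Hypothetical hyperparameter axes; names and values are illustrative
# assumptions, not the grid used in the study.
grid = {
    "dataset": ["iris", "wine", "digits"],
    "query_strategy": ["random", "least_confidence", "margin", "entropy"],
    "learner": ["random_forest", "svm", "mlp"],
    "batch_size": [1, 5, 10, 20],
    "start_seed": list(range(5)),
}


def grid_size(axes):
    """Return the number of combinations: the product of the axis lengths."""
    return prod(len(values) for values in axes.values())


def iter_combinations(axes):
    """Lazily yield one dict per hyperparameter combination."""
    names = list(axes)
    for values in product(*(axes[name] for name in names)):
        yield dict(zip(names, values))


total = grid_size(grid)  # 3 * 4 * 3 * 4 * 5 = 720
first = next(iter_combinations(grid))
```

Even this toy grid with five short axes yields 720 experiment configurations; each additional axis or value multiplies the total, which is how a realistic grid can reach millions of combinations.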
