Towards Robust and Efficient Continual Language Learning

July 11, 2023
Authors: Adam Fisch, Amal Rannen-Triki, Razvan Pascanu, Jörg Bornschein, Angeliki Lazaridou, Elena Gribovskaya, Marc'Aurelio Ranzato
cs.AI

Abstract

As the application space of language models continues to evolve, a natural question to ask is how we can quickly adapt models to new tasks. We approach this classic question from a continual learning perspective, in which we aim to continue fine-tuning models trained on past tasks on new tasks, with the goal of "transferring" relevant knowledge. However, this strategy also runs the risk of doing more harm than good, i.e., negative transfer. In this paper, we construct a new benchmark of task sequences that target different possible transfer scenarios one might face, such as a sequence of tasks with high potential of positive transfer, high potential for negative transfer, no expected effect, or a mixture of each. An ideal learner should be able to maximally exploit information from all tasks that have any potential for positive transfer, while also avoiding the negative effects of any distracting tasks that may confuse it. We then propose a simple, yet effective, learner that satisfies many of our desiderata simply by leveraging a selective strategy for initializing new models from past task checkpoints. Still, limitations remain, and we hope this benchmark can help the community to further build and analyze such learners.
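The abstract describes the proposed learner only at a high level: selectively initialize the model for a new task from past task checkpoints. The sketch below illustrates one way such a selective-initialization strategy could look; it is not the paper's exact procedure. The probe-then-commit selection rule is an illustrative assumption, and `fine_tune` and `evaluate` are hypothetical stand-ins for a real training and evaluation loop.

```python
# Minimal sketch of selective checkpoint initialization for continual
# fine-tuning. The selection rule (short probe run on each candidate,
# then commit to the best one) is an illustrative assumption;
# `fine_tune` and `evaluate` are hypothetical helpers standing in for
# an actual training/eval loop.
from typing import Any, Callable, Dict, Tuple


def select_and_finetune(
    new_task_train: Any,
    new_task_val: Any,
    checkpoints: Dict[str, Any],      # past-task checkpoints, plus the pretrained base
    fine_tune: Callable[..., Any],    # fine_tune(model, data, steps=...) -> model
    evaluate: Callable[..., float],   # evaluate(model, data) -> score (higher is better)
    probe_steps: int = 100,
    full_steps: int = 5000,
) -> Tuple[str, Any]:
    """Pick the starting checkpoint that transfers best to the new task,
    then fine-tune fully from it."""
    scores = {}
    for name, ckpt in checkpoints.items():
        # Cheap probe run: briefly fine-tune from this checkpoint and score
        # the result on held-out data from the new task.
        probe = fine_tune(ckpt, new_task_train, steps=probe_steps)
        scores[name] = evaluate(probe, new_task_val)

    # Commit to the checkpoint whose probe transferred best, then train fully.
    best = max(scores, key=scores.get)
    model = fine_tune(checkpoints[best], new_task_train, steps=full_steps)
    return best, model
```

Including the pretrained base model among the candidate checkpoints gives the learner a natural guard against negative transfer: if no past task helps on the new task, it can simply fall back to fine-tuning from the base model.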