Pearl: A Production-ready Reinforcement Learning Agent

December 6, 2023
Authors: Zheqing Zhu, Rodrigo de Salvo Braz, Jalaj Bhandari, Daniel Jiang, Yi Wan, Yonathan Efroni, Liyuan Wang, Ruiyang Xu, Hongbo Guo, Alex Nikulkov, Dmytro Korenkevych, Urun Dogan, Frank Cheng, Zheng Wu, Wanqiao Xu
cs.AI

Abstract

Reinforcement Learning (RL) offers a versatile framework for achieving long-term goals. Its generality allows us to formalize a wide range of problems that real-world intelligent systems encounter, such as dealing with delayed rewards, handling partial observability, addressing the exploration and exploitation dilemma, utilizing offline data to improve online performance, and ensuring safety constraints are met. Despite considerable progress made by the RL research community in addressing these issues, existing open-source RL libraries tend to focus on a narrow portion of the RL solution pipeline, leaving other aspects largely unattended. This paper introduces Pearl, a Production-ready RL agent software package explicitly designed to embrace these challenges in a modular fashion. In addition to presenting preliminary benchmark results, this paper highlights Pearl's industry adoptions to demonstrate its readiness for production usage. Pearl is open-sourced on GitHub at github.com/facebookresearch/pearl, and its official website is located at pearlagent.github.io.
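The modular design the abstract describes — an agent assembled from swappable components such as an exploration strategy — can be illustrated with a minimal, self-contained sketch. This is a hypothetical toy example, not Pearl's actual API: the class names (`Agent`, `EpsilonGreedyExploration`) and the two-armed bandit environment are invented here purely to show how a pluggable exploration module fits into an agent's act/learn loop.

```python
import random


class EpsilonGreedyExploration:
    """Pluggable exploration module (illustrative; not from Pearl)."""

    def __init__(self, epsilon: float):
        self.epsilon = epsilon

    def select(self, q_values, rng):
        # With probability epsilon, explore a random action;
        # otherwise exploit the current best estimate.
        if rng.random() < self.epsilon:
            return rng.randrange(len(q_values))
        return max(range(len(q_values)), key=q_values.__getitem__)


class Agent:
    """A minimal agent: value estimates plus a swappable exploration module."""

    def __init__(self, n_actions, exploration, lr=0.1, seed=0):
        self.q = [0.0] * n_actions   # running value estimate per action
        self.exploration = exploration
        self.lr = lr
        self.rng = random.Random(seed)

    def act(self):
        return self.exploration.select(self.q, self.rng)

    def learn(self, action, reward):
        # Incremental update toward the observed reward.
        self.q[action] += self.lr * (reward - self.q[action])


def bandit_reward(action, rng):
    """Two-armed bandit: arm 1 pays ~1.0 on average, arm 0 pays ~0.2."""
    return rng.gauss(1.0 if action == 1 else 0.2, 0.1)


agent = Agent(n_actions=2, exploration=EpsilonGreedyExploration(0.1))
env_rng = random.Random(1)
for _ in range(500):
    a = agent.act()
    agent.learn(a, bandit_reward(a, env_rng))

print("value estimates:", agent.q)
```

Because the exploration strategy is an object passed into the agent rather than hard-coded, it can be replaced (e.g. by a softmax or UCB module) without touching the learning loop — the kind of separation of concerns the abstract attributes to Pearl's architecture.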