PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models

February 2, 2024
Authors: Sihao Hu, Tiansheng Huang, Ling Liu
cs.AI

Abstract

We introduce PokéLLMon, the first LLM-embodied agent that achieves human-parity performance in tactical battle games, as demonstrated in Pokémon battles. The design of PokéLLMon incorporates three key strategies: (i) in-context reinforcement learning, which instantly consumes text-based feedback derived from battles to iteratively refine the policy; (ii) knowledge-augmented generation, which retrieves external knowledge to counteract hallucination and enables the agent to act in a timely and appropriate manner; and (iii) consistent action generation, which mitigates the panic-switching phenomenon that arises when the agent faces a powerful opponent and wants to elude the battle. Online battles against human players demonstrate PokéLLMon's human-like battle strategies and just-in-time decision making: it achieves a 49% win rate in Ladder competitions and a 56% win rate in invited battles. Our implementation and playable battle logs are available at: https://github.com/git-disl/PokeLLMon.