Octo-planner: On-device Language Model for Planner-Action Agents
June 26, 2024
Authors: Wei Chen, Zhiyuan Li, Zhen Guo, Yikang Shen
cs.AI
Abstract
AI agents have become increasingly significant in various domains, enabling
autonomous decision-making and problem-solving. To function effectively, these
agents require a planning process that determines the best course of action and
then executes the planned actions. In this paper, we present an efficient
on-device Planner-Action framework that separates planning and action execution
into two distinct components: a planner agent based on Phi-3 Mini, a 3.8
billion parameter LLM optimized for edge devices, and an action agent using the
Octopus model for function execution. The planner agent first responds to user
queries by decomposing tasks into a sequence of sub-steps, which are then
executed by the action agent. To optimize performance on resource-constrained
devices, we employ model fine-tuning instead of in-context learning, reducing
computational costs and energy consumption while improving response times. Our
approach involves using GPT-4 to generate diverse planning queries and
responses based on available functions, with subsequent validations to ensure
data quality. We fine-tune the Phi-3 Mini model on this curated dataset,
achieving a 97% success rate in our in-domain test environment. To address
multi-domain planning challenges, we developed a multi-LoRA training method
that merges weights from LoRAs trained on distinct function subsets. This
approach enables flexible handling of complex, multi-domain queries while
maintaining computational efficiency on resource-constrained devices. To
support further research, we have open-sourced our model weights at
https://huggingface.co/NexaAIDev/octopus-planning. For the demo, please
refer to https://www.nexa4ai.com/octo-planner.
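To make the planner-action split concrete, here is a minimal sketch of the two-stage pipeline, assuming the open-sourced planner checkpoint and a standard Hugging Face `transformers` setup. The chat formatting, the one-sub-step-per-line output convention, and the `octopus_execute` stub are illustrative assumptions, not the paper's actual interface.

```python
# Minimal sketch of the Planner-Action pipeline: the planner agent decomposes
# a user query into sub-steps, and an action agent executes each one.
# NOTE: the output format and the action-agent stub below are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

PLANNER_ID = "NexaAIDev/octopus-planning"  # fine-tuned Phi-3 Mini planner from the paper

tok = AutoTokenizer.from_pretrained(PLANNER_ID)
planner = AutoModelForCausalLM.from_pretrained(PLANNER_ID)

def plan(query: str) -> list[str]:
    """Planner agent: decompose a user query into a sequence of sub-steps."""
    inputs = tok.apply_chat_template(
        [{"role": "user", "content": query}],
        add_generation_prompt=True,
        return_tensors="pt",
    )
    out = planner.generate(inputs, max_new_tokens=256)
    text = tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
    # Assume the planner emits one sub-step per line (hypothetical format).
    return [line.strip() for line in text.splitlines() if line.strip()]

def octopus_execute(step: str) -> str:
    """Action agent stub: in the paper this is the Octopus model, which maps
    each sub-step to a concrete device function call."""
    return f"<executed: {step}>"

for step in plan("Find a nearby coffee shop and text the address to Alice"):
    print(step, "->", octopus_execute(step))
```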
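The multi-LoRA merging step can likewise be sketched with the `peft` library's `add_weighted_adapter` utility. The adapter paths and equal weights below are hypothetical placeholders for LoRAs trained on distinct function subsets; the paper's exact merging procedure may differ.

```python
# Sketch of multi-LoRA weight merging: combine domain-specific adapters into
# one adapter that can serve multi-domain queries on-device.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Load two domain-specific LoRAs (placeholder paths).
model = PeftModel.from_pretrained(base, "path/to/lora-email", adapter_name="email")
model.load_adapter("path/to/lora-calendar", adapter_name="calendar")

# Merge the adapter weights into a single multi-domain adapter and activate it.
model.add_weighted_adapter(
    adapters=["email", "calendar"],
    weights=[0.5, 0.5],
    adapter_name="multi_domain",
    combination_type="linear",
)
model.set_adapter("multi_domain")
```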