Cluster Workload Allocation: Semantic Soft Affinity Using Natural Language Processing
January 14, 2026
Authors: Leszek Sliwko, Jolanta Mizeria-Pietraszko
cs.AI
Abstract
Cluster workload allocation often requires complex configuration, creating a usability gap. This paper introduces a semantic, intent-driven scheduling paradigm for cluster systems using Natural Language Processing. The system employs a Large Language Model (LLM), integrated via a Kubernetes scheduler extender, to interpret natural-language allocation hint annotations as soft affinity preferences. A prototype featuring a cluster state cache and an intent analyzer (using AWS Bedrock) was developed. Empirical evaluation demonstrated high LLM parsing accuracy (>95% subset accuracy on a ground-truth evaluation dataset) for top-tier models such as Amazon Nova Pro/Premier and Mistral Pixtral Large, significantly outperforming a baseline engine. Scheduling quality tests across six scenarios showed that the prototype achieved superior or equivalent placement compared to standard Kubernetes configurations, excelling particularly in complex and quantitative scenarios and in handling conflicting soft preferences. The results validate the use of LLMs for accessible scheduling but highlight limitations such as synchronous LLM latency, suggesting that asynchronous processing is needed for production readiness. This work confirms the viability of semantic soft affinity for simplifying workload orchestration.
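To make the "soft affinity" idea concrete, the following is a minimal, hypothetical sketch of what a scheduler-extender Prioritize step could do once the LLM has parsed an allocation hint into structured preferences. The annotation key, the `Intent` structure, the label names, and the scoring weights are all illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: scoring nodes from an LLM-parsed allocation hint.
# A pod might carry an annotation such as (illustrative key/value):
#   scheduling.example.com/allocation-hint: "prefer SSD nodes, avoid the GPU pool"
# which an intent analyzer could parse into the structure below.
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Soft preferences extracted from a natural-language hint."""
    prefer_labels: dict = field(default_factory=dict)  # label -> desired value
    avoid_labels: dict = field(default_factory=dict)   # label -> value to avoid


def score_nodes(intent: Intent, nodes: dict, max_priority: int = 10) -> dict:
    """Soft affinity scoring: each matched preference adds weight, each
    matched avoidance subtracts weight; scores stay in [0, max_priority].
    Because preferences are soft, no node is filtered out entirely."""
    scores = {}
    for name, labels in nodes.items():
        s = max_priority // 2  # neutral baseline for nodes matching nothing
        for key, value in intent.prefer_labels.items():
            if labels.get(key) == value:
                s += 3
        for key, value in intent.avoid_labels.items():
            if labels.get(key) == value:
                s -= 3
        scores[name] = max(0, min(max_priority, s))
    return scores
```

For example, with a parsed intent of `prefer_labels={"disktype": "ssd"}` and `avoid_labels={"pool": "gpu"}`, an SSD node scores above the baseline, a GPU-pool node below it, and an unlabeled node stays neutral; the default Kubernetes scheduler would then combine these priorities with its own scoring plugins.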