
Cluster Workload Allocation: Semantic Soft Affinity Using Natural Language Processing

January 14, 2026
Authors: Leszek Sliwko, Jolanta Mizera-Pietraszko
cs.AI

Abstract

Cluster workload allocation often requires complex configurations, creating a usability gap. This paper introduces a semantic, intent-driven scheduling paradigm for cluster systems using Natural Language Processing. The system employs a Large Language Model (LLM) integrated via a Kubernetes scheduler extender to interpret natural language allocation hint annotations for soft affinity preferences. A prototype featuring a cluster state cache and an intent analyzer (using AWS Bedrock) was developed. Empirical evaluation demonstrated high LLM parsing accuracy (>95% Subset Accuracy on an evaluation ground-truth dataset) for top-tier models like Amazon Nova Pro/Premier and Mistral Pixtral Large, significantly outperforming a baseline engine. Scheduling quality tests across six scenarios showed the prototype achieved superior or equivalent placement compared to standard Kubernetes configurations, particularly excelling in complex and quantitative scenarios and handling conflicting soft preferences. The results validate using LLMs for accessible scheduling but highlight limitations like synchronous LLM latency, suggesting asynchronous processing for production readiness. This work confirms the viability of semantic soft affinity for simplifying workload orchestration.
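The abstract describes a pipeline in which an LLM-based intent analyzer parses a natural-language allocation hint annotation into structured soft affinity preferences, which a Kubernetes scheduler extender then uses to score candidate nodes. The following is a minimal sketch of that scoring step only; the `SoftPreference` structure, the label keys, and the example hint output are illustrative assumptions, and the LLM parsing stage (AWS Bedrock in the paper) is replaced by a fixed, hand-written list of parsed preferences.

```python
# Illustrative sketch of the soft-affinity scoring a scheduler extender's
# "prioritize" step might perform. The LLM parsing stage is mocked: `prefs`
# stands in for what the intent analyzer might emit for a hint such as
# "prefer SSD nodes, avoid GPU nodes" (hypothetical label keys).
from dataclasses import dataclass

@dataclass
class SoftPreference:
    label_key: str    # node label to match, e.g. "storage"
    label_value: str  # required value, e.g. "ssd"
    weight: int       # positive = prefer, negative = avoid

def prioritize(preferences, nodes):
    """Sum matching preference weights per node, then normalize to the
    0-10 priority range a Kubernetes extender returns per host."""
    raw = {}
    for name, labels in nodes.items():
        raw[name] = sum(
            p.weight for p in preferences
            if labels.get(p.label_key) == p.label_value
        )
    lo, hi = min(raw.values()), max(raw.values())
    span = (hi - lo) or 1  # avoid division by zero when all scores tie
    return {n: round(10 * (s - lo) / span) for n, s in raw.items()}

# Mocked analyzer output for "prefer SSD nodes, avoid GPU nodes":
prefs = [
    SoftPreference("storage", "ssd", 5),
    SoftPreference("accelerator", "gpu", -5),
]
nodes = {
    "node-a": {"storage": "ssd"},
    "node-b": {"accelerator": "gpu"},
    "node-c": {},
}
scores = prioritize(prefs, nodes)  # node-a ranks highest, node-b lowest
```

Because the preferences are soft, a node violating one (here `node-b`) is merely down-weighted rather than filtered out, which is how conflicting hints can still yield a feasible placement.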