Large Language Models over Networks: Collaborative Intelligence under Resource Constraints

Abstract

Large language models (LLMs) are transforming society, powering applications from smartphone assistants to autonomous driving. Yet cloud-based LLM services alone cannot serve a growing class of applications, including those operating under intermittent connectivity, sub-second latency budgets, data-residency constraints, or sustained high-volume inference. On-device deployment is in turn constrained by limited computation and memory. No single endpoint can deliver high-quality service across this spectrum. This article focuses on collaborative intelligence, a paradigm in which multiple independent LLMs distributed across device and cloud endpoints collaborate at the task level through natural language or structured messages. Such collaboration strives for superior response quality under heterogeneous resource constraints spanning computation, memory, communication, and cost across network tiers. We present collaborative inference along two complementary and composable dimensions: vertical device-cloud collaboration and horizontal multi-agent collaboration, which can be combined into hybrid topologies in practice. We then examine learning to collaborate, addressing the training of routing policies and the development of cooperative capabilities among LLMs. Finally, we identify open research challenges including scaling under resource heterogeneity and trustworthy collaborative intelligence.

