Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study

June 24, 2025
Authors: Yuqi Zhu, Yi Zhong, Jintian Zhang, Ziheng Zhang, Shuofei Qiao, Yujie Luo, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang
cs.AI

Abstract

Large Language Models (LLMs) hold promise in automating data analysis tasks, yet open-source models face significant limitations in such reasoning-intensive scenarios. In this work, we investigate strategies to enhance the data analysis capabilities of open-source LLMs. By curating a seed dataset of diverse, realistic scenarios, we evaluate models across three dimensions: data understanding, code generation, and strategic planning. Our analysis reveals three key findings: (1) strategic planning quality is the primary determinant of model performance; (2) interaction design and task complexity significantly influence reasoning capabilities; (3) data quality has a greater impact than diversity in achieving optimal performance. We leverage these insights to develop a data synthesis methodology, demonstrating significant improvements in open-source LLMs' analytical reasoning capabilities.