StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
February 26, 2024
Authors: Alex Zhuang, Ge Zhang, Tianyu Zheng, Xinrun Du, Junjie Wang, Weiming Ren, Stephen W. Huang, Jie Fu, Xiang Yue, Wenhu Chen
cs.AI
Abstract
Structured data sources, such as tables, graphs, and databases, are ubiquitous knowledge sources. Despite the demonstrated capabilities of large language models (LLMs) on plain text, their proficiency in interpreting and utilizing structured data remains limited. Our investigation reveals a notable deficiency in LLMs' ability to process structured data: ChatGPT, for example, lags behind state-of-the-art (SoTA) models by an average of 35%. To augment the Structured Knowledge Grounding (SKG) capabilities of LLMs, we have developed a comprehensive instruction-tuning dataset comprising 1.1 million examples. Utilizing this dataset, we train a series of models, referred to as StructLM, based on the Code-LLaMA architecture and ranging from 7B to 34B parameters. Our StructLM series surpasses task-specific models on 14 of 18 evaluated datasets and establishes new SoTA results on 7 SKG tasks. Furthermore, StructLM demonstrates exceptional generalization across 6 novel SKG tasks. Contrary to expectations, we observe that scaling model size offers only marginal benefits, with StructLM-34B showing only slight improvements over StructLM-7B. This suggests that structured knowledge grounding remains a challenging task that requires more innovative designs to push it to a new level.
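The abstract names the ingredients (structured inputs, a 1.1M-example instruction-tuning corpus, Code-LLaMA backbones) but not the data format, so a small illustration may help ground what an SKG training example could look like. The sketch below linearizes a table and wraps it in an instruction prompt; the helper names (`linearize_table`, `build_skg_example`) and the pipe-separated serialization are illustrative assumptions, not the format StructLM actually uses.

```python
# Minimal sketch of building one SKG instruction-tuning example.
# The pipe-separated "col : ... / row i : ..." serialization below is an
# illustrative assumption; the paper's actual format is not shown in the
# abstract.

def linearize_table(headers: list[str], rows: list[list[str]]) -> str:
    """Flatten a table into a single text string an LLM can consume."""
    header_line = " | ".join(headers)
    row_lines = [" | ".join(str(cell) for cell in row) for row in rows]
    lines = ["col : " + header_line]
    lines += [f"row {i + 1} : {r}" for i, r in enumerate(row_lines)]
    return "\n".join(lines)

def build_skg_example(instruction: str, headers: list[str],
                      rows: list[list[str]], question: str) -> str:
    """Compose one prompt: task instruction, linearized structured
    input, and the natural-language question."""
    return (
        f"{instruction}\n\n"
        f"[TABLE]\n{linearize_table(headers, rows)}\n\n"
        f"Question: {question}"
    )

if __name__ == "__main__":
    prompt = build_skg_example(
        instruction="Answer the question using the table below.",
        headers=["Player", "Team", "Goals"],
        rows=[["Alice", "Red", "12"], ["Bob", "Blue", "9"]],
        question="Which player scored the most goals?",
    )
    print(prompt)
```

Under this reading, instruction tuning for SKG reduces to supervised fine-tuning on (prompt, answer) pairs of this shape, with the table flattened to text so that an off-the-shelf decoder-only model such as Code-LLaMA can process it without architectural changes.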