Can Large Language Models Unlock Novel Scientific Research Ideas?
September 10, 2024
Authors: Sandeep Kumar, Tirthankar Ghosal, Vinayak Goyal, Asif Ekbal
cs.AI
Abstract
"An idea is nothing more nor less than a new combination of old elements"
(Young, J.W.). The widespread adoption of Large Language Models (LLMs) and
publicly available ChatGPT have marked a significant turning point in the
integration of Artificial Intelligence (AI) into people's everyday lives. This
study explores the capability of LLMs in generating novel research ideas based
on information from research papers. We conduct a thorough examination of four
LLMs in five domains (Chemistry, Computer Science, Economics, Medicine, and
Physics). We found that the future research ideas generated by Claude-2 and
GPT-4 are more aligned with the authors' perspective than those generated by
GPT-3.5 and Gemini.
We also found that Claude-2 generates more diverse future research ideas than
GPT-4, GPT-3.5, and Gemini 1.0. We further performed a human evaluation of the
novelty, relevance, and feasibility of the generated future research ideas.
This investigation offers insights into the evolving role of LLMs in idea
generation, highlighting both their capabilities and limitations. Our work
contributes to the ongoing efforts in evaluating and utilizing language models
for generating future research ideas. We make our datasets and code publicly
available.