Can Large Language Models Unlock Novel Scientific Research Ideas?

September 10, 2024
Authors: Sandeep Kumar, Tirthankar Ghosal, Vinayak Goyal, Asif Ekbal
cs.AI

Abstract

"An idea is nothing more nor less than a new combination of old elements" (Young, J.W.). The widespread adoption of Large Language Models (LLMs) and the public availability of ChatGPT have marked a significant turning point in the integration of Artificial Intelligence (AI) into people's everyday lives. This study explores the capability of LLMs to generate novel research ideas based on information from research papers. We conduct a thorough examination of four LLMs in five domains (Chemistry, Computer, Economics, Medical, and Physics). We found that the future research ideas generated by Claude-2 and GPT-4 are more aligned with the authors' perspective than those generated by GPT-3.5 and Gemini. We also found that Claude-2 generates more diverse future research ideas than GPT-4, GPT-3.5, and Gemini 1.0. We further performed a human evaluation of the novelty, relevancy, and feasibility of the generated future research ideas. This investigation offers insights into the evolving role of LLMs in idea generation, highlighting both their capabilities and their limitations. Our work contributes to the ongoing efforts in evaluating and utilizing language models for generating future research ideas. We make our datasets and code publicly available.
