Enhancing Sentiment Classification and Irony Detection in Large Language Models through Advanced Prompt Engineering Techniques

Abstract

This study investigates how prompt engineering can improve the performance of large language models (LLMs), specifically GPT-4o-mini and gemini-1.5-flash, on sentiment analysis tasks. It evaluates advanced prompting techniques such as few-shot learning, chain-of-thought prompting, and self-consistency against a baseline. The tasks include sentiment classification, aspect-based sentiment analysis, and the detection of subtle nuances such as irony. The study describes the theoretical background, datasets, and methods used, and measures model performance by accuracy, recall, precision, and F1 score. The findings show that advanced prompting significantly improves sentiment analysis: few-shot prompting performs best with GPT-4o-mini, while chain-of-thought prompting improves irony detection in gemini-1.5-flash by up to 46%. Because the best-performing technique differs between the two models, the results suggest that prompting strategies must be tailored to both the model and the task. This highlights the importance of aligning prompt design with both the LLM's architecture and the semantic complexity of the task.
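As an illustration of the four metrics named above, the following is a minimal sketch of how accuracy, precision, recall, and F1 score could be computed from gold and predicted labels. The function name `classification_metrics`, the label strings, and the choice to score a single label as the positive class are assumptions made for this example; the study's own evaluation code and averaging scheme are not specified here.

```python
from collections import Counter


def classification_metrics(gold, predicted, positive_label):
    """Return accuracy plus precision, recall, and F1 for one positive label.

    Hypothetical helper for illustration only; not the study's evaluation code.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    counts = Counter()
    for g, p in zip(gold, predicted):
        if p == positive_label and g == positive_label:
            counts["tp"] += 1  # true positive
        elif p == positive_label:
            counts["fp"] += 1  # false positive
        elif g == positive_label:
            counts["fn"] += 1  # false negative
        if g == p:
            counts["correct"] += 1

    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": counts["correct"] / len(gold),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up labels, used only to exercise the function.
    gold = ["ironic", "ironic", "literal", "literal"]
    predicted = ["ironic", "literal", "ironic", "literal"]
    print(classification_metrics(gold, predicted, positive_label="ironic"))
```

Scoring one label as the positive class fits a binary task such as irony detection; a multi-class sentiment task would typically average these per-label scores across classes.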
