Teach LLMs to Personalize -- An Approach inspired by Writing Education

Abstract

Personalized text generation is an emerging research area that has attracted much attention in recent years. Most studies in this direction focus on a particular domain by designing bespoke features or models. In this work, we propose a general approach for personalized text generation using large language models (LLMs). Inspired by the practice of writing education, we develop a multistage and multitask framework to teach LLMs for personalized generation. In writing instruction, the task of writing from sources is often decomposed into multiple steps that involve finding, evaluating, summarizing, synthesizing, and integrating information. Analogously, our approach to personalized text generation consists of multiple stages: retrieval, ranking, summarization, synthesis, and generation. In addition, we introduce a multitask setting that helps the model improve its generation ability further, which is inspired by the observation in education that a student's reading proficiency and writing ability are often correlated. We evaluate our approach on three public datasets, each of which covers a different and representative domain. Our results show significant improvements over a variety of baselines.
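
The five stages named in the abstract (retrieval, ranking, summarization, synthesis, generation) can be pictured as a simple pipeline. The block below is a toy sketch under stated assumptions, not the authors' implementation: term overlap stands in for the retriever and ranker, first-sentence extraction stands in for the summarizer, frequent-word extraction stands in for synthesis, and the final stage only assembles the text that a generator LLM would receive. All function names (`retrieve`, `rank`, `summarize`, `synthesize`, `build_generation_input`) are hypothetical, and the multitask training described in the abstract is not covered.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a deliberately simple stand-in tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, past_docs):
    """Stage 1 (retrieval): keep past documents sharing at least one term with the query."""
    query_terms = set(tokenize(query))
    return [doc for doc in past_docs if query_terms & set(tokenize(doc))]


def rank(query, docs):
    """Stage 2 (ranking): order documents by term overlap with the query (assumed scoring)."""
    query_terms = set(tokenize(query))
    return sorted(
        docs,
        key=lambda doc: len(query_terms & set(tokenize(doc))),
        reverse=True,
    )


def summarize(docs):
    """Stage 3 (summarization): first sentence of each document, a placeholder for an LLM summarizer."""
    first_sentences = [re.split(r"(?<=[.!?])\s+", doc.strip())[0] for doc in docs]
    return " ".join(first_sentences)


def synthesize(docs, top_n=5):
    """Stage 4 (synthesis): most frequent longer words as rough 'key elements' (assumption)."""
    counts = Counter(tok for doc in docs for tok in tokenize(doc) if len(tok) > 3)
    return [word for word, _ in counts.most_common(top_n)]


def build_generation_input(query, past_docs, top_k=2):
    """Stage 5 (generation): assemble the input for a generator; the LLM call itself is omitted."""
    ranked = rank(query, retrieve(query, past_docs))[:top_k]
    summary = summarize(ranked)
    keywords = synthesize(ranked)
    prompt = (
        f"Task: {query}\n"
        f"Summary of past writing: {summary}\n"
        f"Key elements: {', '.join(keywords)}\n"
        "Top past documents:\n" + "\n".join(ranked)
    )
    return {"ranked": ranked, "summary": summary, "keywords": keywords, "prompt": prompt}


if __name__ == "__main__":
    history = [
        "Great hiking boots. They kept my feet dry on a rainy trail.",
        "The blender is loud. It crushes ice well.",
        "Comfortable trail running shoes. Light and fast on rocky trail sections.",
    ]
    print(build_generation_input("review of trail hiking boots", history)["prompt"])
```

In this sketch each stage consumes the previous stage's output, which mirrors the decomposition of writing from sources into finding, evaluating, summarizing, synthesizing, and integrating information. Any stage could be swapped for a learned component without changing the others.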
