Aligning Teacher with Student Preferences for Tailored Training Data Generation

June 27, 2024
Authors: Yantao Liu, Zhao Zhang, Zijun Yao, Shulin Cao, Lei Hou, Juanzi Li
cs.AI

Abstract

Large Language Models (LLMs) have shown significant promise as copilots in various tasks. Local deployment of LLMs on edge devices is necessary when handling privacy-sensitive data or latency-sensitive tasks. The computational constraints of such devices make direct deployment of powerful large-scale LLMs impractical, necessitating Knowledge Distillation from large-scale models to lightweight models. Much work has been done to elicit diverse, high-quality training examples from LLMs, but little attention has been paid to aligning teacher instructional content with student preferences, akin to "responsive teaching" in pedagogy. Thus, we propose ARTE, dubbed Aligning TeacheR with StudenT PreferencEs, a framework that aligns the teacher model with student preferences to generate tailored training examples for Knowledge Distillation. Specifically, we elicit draft questions and rationales from the teacher model, then collect student preferences on these questions and rationales using the student's performance with in-context learning as a proxy, and finally align the teacher model with the student preferences. In the end, we repeat the first step with the aligned teacher model to elicit tailored training examples for the student model on the target task. Extensive experiments on academic benchmarks demonstrate the superiority of ARTE over existing instruction-tuning datasets distilled from powerful LLMs. Moreover, we thoroughly investigate the generalization of ARTE, including the generalization of fine-tuned student models in reasoning ability and the generalization of aligned teacher models to generate tailored training data across tasks and students. In summary, our contributions lie in proposing a novel framework for tailored training example generation, demonstrating its efficacy in experiments, and investigating the generalization of both student and aligned teacher models in ARTE.
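The abstract describes a three-step pipeline: elicit draft questions and rationales from the teacher, collect student preferences using the student's in-context-learning performance as a proxy, align the teacher with those preferences, and then regenerate tailored training examples. The Python sketch below is a minimal illustration of that flow under stated assumptions, not the authors' implementation: the callables `generate`, `icl_score`, and `preference_align`, and the use of best-versus-worst drafts as preference pairs, are placeholders introduced here for exposition.

```python
from typing import Callable, List, Tuple

# Illustrative sketch of the ARTE pipeline from the abstract.
# The callables passed in (generate, icl_score, preference_align) are
# hypothetical interfaces supplied by the caller, not the paper's code.

def arte_pipeline(
    generate: Callable[[object, str], List[str]],                        # teacher -> draft questions/rationales
    icl_score: Callable[[object, str], float],                           # student ICL performance as preference proxy
    preference_align: Callable[[object, List[Tuple[str, str, str]]], object],  # preference-optimization step
    teacher: object,
    student: object,
    task_prompts: List[str],
) -> List[str]:
    # Step 1: elicit draft questions and rationales from the teacher.
    drafts = [generate(teacher, p) for p in task_prompts]

    # Step 2: score each draft by the student's in-context-learning
    # performance and keep the best vs. worst draft as a preference pair.
    pairs: List[Tuple[str, str, str]] = []
    for prompt, candidates in zip(task_prompts, drafts):
        ranked = sorted(candidates, key=lambda c: icl_score(student, c), reverse=True)
        pairs.append((prompt, ranked[0], ranked[-1]))  # (prompt, chosen, rejected)

    # Step 3: align the teacher model with the collected student preferences
    # (e.g., via a pairwise preference-optimization objective).
    aligned_teacher = preference_align(teacher, pairs)

    # Step 4: repeat step 1 with the aligned teacher to obtain tailored
    # training examples for distilling the student on the target task.
    return [generate(aligned_teacher, p)[0] for p in task_prompts]
```

A caller would supply the actual teacher and student models and a concrete preference-optimization routine; the abstract states only that the teacher is aligned with the collected preferences, without specifying the exact objective.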
