Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

Abstract

Recent studies have identified one aggravating factor of LLM hallucinations as the knowledge inconsistency between pre-training and fine-tuning, where unfamiliar fine-tuning data mislead the LLM into fabricating plausible but wrong outputs. In this paper, we propose a novel fine-tuning strategy called Prereq-Tune to address this knowledge inconsistency and reduce hallucinations. Fundamentally, Prereq-Tune disentangles the learning of skills and knowledge, so the model learns only the task skills without being affected by the knowledge inconsistency. To achieve this, Prereq-Tune introduces an additional prerequisite learning stage in which the model learns the knowledge that supervised fine-tuning (SFT) requires, allowing the subsequent SFT to focus only on task skills. Prereq-Tune can also be combined with fictitious synthetic data to enhance the grounding of LLM outputs in their internal knowledge. Experiments show that Prereq-Tune outperforms existing baselines in improving LLM factuality across short QA and long-form generation tasks. It also opens new possibilities for knowledge-controlled generation in LLMs. Our code is available at https://github.com/UCSB-NLP-Chang/Prereq_tune.git.
