

Synthesizing Behaviorally-Grounded Reasoning Chains: A Data-Generation Framework for Personal Finance LLMs

September 17, 2025
作者: Akhil Theerthala
cs.AI

Abstract

Personalized financial advice requires consideration of user goals, constraints, risk tolerance, and jurisdiction. Prior LLM work has focused on support systems for investors and financial planners. At the same time, numerous recent studies examine broader personal finance tasks, including budgeting, debt management, retirement planning, and estate planning, through agentic pipelines that incur high maintenance costs and yield less than 25% of their expected financial returns. In this study, we introduce a novel and reproducible framework that integrates relevant financial context with behavioral finance research to construct supervision data for end-to-end advisors. Using this framework, we create a 19k-sample reasoning dataset and comprehensively fine-tune the Qwen-3-8B model on it. Through a held-out test split and a blind LLM-jury study, we demonstrate that, with careful data curation and behavioral integration, our 8B model achieves performance comparable to significantly larger baselines (14-32B parameters) on factual accuracy, fluency, and personalization metrics, while incurring 80% lower costs than those larger counterparts.
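The blind LLM-jury evaluation mentioned above can be sketched as follows. This is a minimal illustration of the general protocol (anonymized, order-randomized pairwise comparison), not the paper's actual implementation; the `judge` callables and the length-based example judge are hypothetical stand-ins for real LLM judges.

```python
import random
from collections import Counter

def blind_jury_vote(response_a, response_b, judges, rng=random):
    """Present two model responses to each judge in a random order and
    tally wins per underlying model. Judges see only anonymized positions,
    which reduces position and identity bias."""
    tally = Counter()
    for judge in judges:
        # Randomize presentation order so judges cannot infer which
        # model produced which response.
        if rng.random() < 0.5:
            first, second, order = response_a, response_b, ("A", "B")
        else:
            first, second, order = response_b, response_a, ("B", "A")
        pick = judge(first, second)  # judge returns 0 for first, 1 for second
        tally[order[pick]] += 1
    return tally

# Hypothetical judge: prefers the longer (more detailed) answer.
length_judge = lambda x, y: 0 if len(x) >= len(y) else 1

votes = blind_jury_vote("short answer",
                        "a much more detailed answer",
                        [length_judge] * 5,
                        rng=random.Random(0))
```

In a real study, each `judge` would wrap an LLM call that scores the two anonymized responses against rubric criteria such as factual accuracy, fluency, and personalization, with votes aggregated across the held-out test split.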
PDF · September 18, 2025