Synthesizing Behaviorally-Grounded Reasoning Chains: A Data-Generation Framework for Personal Finance LLMs
September 17, 2025
Author: Akhil Theerthala
cs.AI
Abstract
Personalized financial advice requires consideration of user goals,
constraints, risk tolerance, and jurisdiction. Prior LLM work has focused on
support systems for investors and financial planners. Simultaneously, numerous
recent studies examine broader personal finance tasks, including budgeting,
debt management, retirement, and estate planning, through agentic pipelines
that incur high maintenance costs and yield less than 25% of their expected
financial returns. In this study, we introduce a novel and reproducible
framework that integrates relevant financial context with behavioral finance
studies to construct supervision data for end-to-end advisors. Using this
framework, we create a 19k-sample reasoning dataset and comprehensively
fine-tune the Qwen-3-8B model on it. Through a held-out test split and a
blind LLM-jury study, we demonstrate that, with careful data curation and
behavioral integration, our 8B model achieves performance comparable to
significantly larger baselines (14-32B parameters) across factual accuracy,
fluency, and personalization metrics, while incurring 80% lower costs than
its larger counterparts.
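
The abstract states that Qwen-3-8B is fully fine-tuned on the 19k-sample reasoning dataset but does not describe the training stack. The snippet below is a minimal sketch of such a supervised fine-tuning run, assuming a Hugging Face TRL setup; the dataset file name, record schema (query/reasoning/advice fields), and hyperparameters are illustrative assumptions, not details from the paper.

```python
# Minimal SFT sketch (not the paper's released code): fine-tune Qwen-3-8B on a
# local JSONL reasoning dataset with Hugging Face TRL. Paths, field names, and
# hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer
from trl import SFTConfig, SFTTrainer

MODEL_ID = "Qwen/Qwen3-8B"               # base model named in the abstract
DATA_PATH = "finance_reasoning.jsonl"    # hypothetical 19k-sample dataset file

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
dataset = load_dataset("json", data_files=DATA_PATH, split="train")

def to_text(example):
    # Assumed record schema: a user query plus a behaviorally grounded reasoning
    # chain and final advice; serialize them with the model's chat template.
    messages = [
        {"role": "user", "content": example["query"]},
        {"role": "assistant",
         "content": example["reasoning"] + "\n\n" + example["advice"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text, remove_columns=dataset.column_names)

config = SFTConfig(
    output_dir="qwen3-8b-personal-finance",
    dataset_text_field="text",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=2,
    learning_rate=1e-5,
    bf16=True,
    logging_steps=10,
)

trainer = SFTTrainer(model=MODEL_ID, args=config, train_dataset=dataset)
trainer.train()
```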