Synthesizing Behaviorally-Grounded Reasoning Chains: A Data-Generation Framework for Personal Finance LLMs

Abstract

Personalized financial advice requires consideration of user goals, constraints, risk tolerance, and jurisdiction. Prior LLM work has focused on support systems for investors and financial planners. Meanwhile, numerous recent studies examine broader personal finance tasks, including budgeting, debt management, retirement planning, and estate planning, through agentic pipelines that incur high maintenance costs and yield less than 25% of their expected financial returns. In this study, we introduce a novel and reproducible framework that integrates relevant financial context with behavioral finance studies to construct supervision data for end-to-end advisors. Using this framework, we create a 19k-sample reasoning dataset and comprehensively fine-tune the Qwen-3-8B model on it. Using a held-out test split and a blind LLM-jury study, we show that, with careful data curation and behavioral integration, our 8B model achieves performance comparable to significantly larger baselines (14-32B parameters) on factual accuracy, fluency, and personalization metrics while incurring 80% lower costs than those larger models.