Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence
June 12, 2023
Authors: John J. Nay, David Karamardian, Sarah B. Lawsky, Wenting Tao, Meghana Bhat, Raghav Jain, Aaron Travis Lee, Jonathan H. Choi, Jungo Kasai
cs.AI
Abstract
Better understanding of Large Language Models' (LLMs) legal analysis
abilities can contribute to improving the efficiency of legal services,
governing artificial intelligence, and leveraging LLMs to identify
inconsistencies in law. This paper explores LLM capabilities in applying tax
law. We choose this area of law because it has a structure that allows us to
set up automated validation pipelines across thousands of examples, requires
logical reasoning and maths skills, and enables us to test LLM capabilities in
a manner relevant to real-world economic lives of citizens and companies. Our
experiments demonstrate emerging legal understanding capabilities, with
improved performance in each subsequent OpenAI model release. We experiment
with retrieving and utilising the relevant legal authority to assess the impact
of providing additional legal context to LLMs. Few-shot prompting, presenting
examples of question-answer pairs, is also found to significantly enhance the
performance of the most advanced model, GPT-4. The findings indicate that LLMs,
particularly when combined with prompting enhancements and the correct legal
texts, can perform at high levels of accuracy but not yet at expert tax lawyer
levels. As LLMs continue to advance, their ability to reason about law
autonomously could have significant implications for the legal profession and
AI governance.