Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence

Abstract

A better understanding of the legal analysis abilities of Large Language Models (LLMs) can contribute to improving the efficiency of legal services, governing artificial intelligence, and leveraging LLMs to identify inconsistencies in law. This paper explores LLM capabilities in applying tax law. We choose this area of law because it has a structure that allows us to set up automated validation pipelines across thousands of examples, it requires logical reasoning and maths skills, and it enables us to test LLM capabilities in a manner relevant to the real-world economic lives of citizens and companies. Our experiments demonstrate emerging legal understanding capabilities, with performance improving in each subsequent OpenAI model release. We experiment with retrieving and utilising the relevant legal authority to assess the impact of providing additional legal context to LLMs. Few-shot prompting, which presents examples of question-answer pairs, is also found to significantly enhance the performance of the most advanced model, GPT-4. The findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels. As LLMs continue to advance, their ability to reason about law autonomously could have significant implications for the legal profession and AI governance.

