Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence

Abstract

A better understanding of the legal analysis abilities of Large Language Models (LLMs) can contribute to improving the efficiency of legal services, governing artificial intelligence, and leveraging LLMs to identify inconsistencies in law. This paper explores LLM capabilities in applying tax law. We choose this area of law because its structure allows us to set up automated validation pipelines across thousands of examples, because it requires logical reasoning and maths skills, and because it lets us test LLM capabilities in a manner relevant to the real-world economic lives of citizens and companies. Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release. We experiment with retrieving and utilising the relevant legal authority to assess the impact of providing additional legal context to LLMs. Few-shot prompting, which presents examples of question-answer pairs, is also found to significantly enhance the performance of the most advanced model, GPT-4. The findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels. As LLMs continue to advance, their ability to reason about law autonomously could have significant implications for the legal profession and AI governance.
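The abstract mentions two mechanisms that a short sketch may make more concrete: automated validation against a computed ground truth, and few-shot prompting with question-answer pairs. The example below is only an illustration under stated assumptions. The flat-rate tax rule, the `TaxCase` type, and every function name are hypothetical and are not taken from the paper, and no model is actually called: the scoring step simply takes strings that stand in for model outputs.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaxCase:
    question: str
    answer: float  # ground-truth tax owed, computed from the rule


def make_case(income: float, threshold: float = 10_000.0, rate: float = 0.2) -> TaxCase:
    """Build a synthetic question from a made-up flat-rate rule (hypothetical)."""
    owed = max(income - threshold, 0.0) * rate
    question = (
        f"A taxpayer has income of ${income:,.2f}. "
        f"Income above ${threshold:,.2f} is taxed at {rate:.0%}. "
        "How much tax is owed?"
    )
    return TaxCase(question, round(owed, 2))


def build_few_shot_prompt(examples: List[TaxCase], target: TaxCase) -> str:
    """Prepend solved question-answer pairs to the unsolved target question."""
    parts = [f"Q: {ex.question}\nA: ${ex.answer:,.2f}" for ex in examples]
    parts.append(f"Q: {target.question}\nA:")
    return "\n\n".join(parts)


def parse_amount(text: str) -> Optional[float]:
    """Return the last number mentioned in the text, or None if there is none."""
    matches = re.findall(r"\d[\d,]*(?:\.\d+)?", text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def is_correct(model_output: str, case: TaxCase, tol: float = 0.01) -> bool:
    """Compare the parsed amount with the ground truth within a tolerance."""
    amount = parse_amount(model_output)
    return amount is not None and abs(amount - case.answer) <= tol


def accuracy(outputs: List[str], cases: List[TaxCase]) -> float:
    """Fraction of outputs whose parsed amount matches the ground truth."""
    hits = sum(is_correct(out, case) for out, case in zip(outputs, cases))
    return hits / len(cases)
```

Because each answer is computed from the same rule that generates the question, a set of such cases can be scored without manual grading, which is the property the abstract attributes to tax law. A real pipeline would replace the toy rule with actual statutory provisions and would need a more careful answer parser than taking the last number in the output.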

