Broadening Access to Transportation Safety Data with Generative AI: A Schema-Grounded Framework for Spatial Natural Language Queries
May 20, 2026
Authors: Mahdi Azhdari, Eric J. Gonzales
cs.AI
Abstract
Transportation safety analysis requires integrating crash records, roadway attributes, and geospatial data through GIS-based workflows, but access remains uneven across agencies and community stakeholders. Technical prerequisites create a gap between the analytical tools central to safety planning and the practitioners able to use them. Local agencies, school committees, and residents may have safety concerns but limited capacity to retrieve, filter, map, and analyze relevant data. Generative AI offers a way to narrow this divide, but its public-sector use raises questions about reliability, reproducibility, and governance. This paper presents a schema-grounded natural language interface for transportation safety analysis, using a large language model (LLM) to interpret user intent while preserving deterministic, reviewable execution against an authoritative database. User queries are translated into structured semantic frames, validated by a rule-based layer, compiled into a typed directed acyclic graph of spatial operations, and executed against a PostGIS database. This bounded design separates language interpretation from deterministic execution, keeping results reproducible and schema-grounded while removing access barriers. The framework is evaluated using a statewide Massachusetts transportation safety database integrating crash records, roadway attributes, and geospatial layers including schools, bus stops, crosswalks, and municipal boundaries. All queries executed successfully; the validation layer corrected errors in 29% of evaluation queries, reflecting the gap between flexible natural language and strict schema-grounded requirements. The results suggest that combining natural language accessibility with deterministic execution is a practical direction for broadening access to transportation safety data, with implications for trustworthy AI in public-sector planning.
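The four stages named in the abstract (semantic frame, rule-based validation, typed DAG, PostGIS execution) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the frame format, the validation rules, or the database schema, so the `Frame` and `Node` classes, the `SCHEMA` and `SYNONYMS` tables, the `geom` column, and the layer names are all hypothetical. The only real API referenced is the PostGIS function `ST_DWithin`, whose geography variant takes a distance in metres; the cast to `geography` assumes geometries stored in longitude/latitude (SRID 4326).

```python
from dataclasses import dataclass, replace

# Hypothetical schema and synonym table; the paper's actual layers may differ.
SCHEMA = {"crashes", "schools", "bus_stops", "crosswalks"}
SYNONYMS = {"crash": "crashes", "school": "schools", "bus stop": "bus_stops"}


@dataclass(frozen=True)
class Frame:
    """Structured semantic frame an LLM might emit for a proximity query."""
    target: str        # layer whose rows are returned
    near: str          # reference layer
    distance_m: float  # search radius in metres


@dataclass(frozen=True)
class Node:
    """One typed operation in the DAG; inputs refer to earlier node ids."""
    id: str
    op: str
    out_type: str
    inputs: tuple = ()
    params: tuple = ()


def validate(frame):
    """Rule-based layer: normalise layer names, reject anything off-schema."""
    fixes, fields = [], {}
    for name in ("target", "near"):
        raw = getattr(frame, name)
        key = raw.strip().lower()
        canonical = SYNONYMS.get(key, key)
        if canonical not in SCHEMA:
            raise ValueError(f"unknown layer: {raw!r}")
        if canonical != raw:
            fixes.append(f"{name}: {raw!r} -> {canonical!r}")
        fields[name] = canonical
    if frame.distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return replace(frame, **fields), fixes


def compile_dag(frame):
    """Compile a validated frame into a topologically ordered node list."""
    return [
        Node("a", "load_layer", "geometry_set", params=(frame.target,)),
        Node("b", "load_layer", "geometry_set", params=(frame.near,)),
        Node("c", "within_distance", "geometry_set",
             inputs=("a", "b"), params=(frame.distance_m,)),
    ]


def to_sql(dag):
    """Render the DAG as parameterised SQL; table names come only from SCHEMA."""
    by_id = {node.id: node for node in dag}
    final = dag[-1]
    left, right = (by_id[i].params[0] for i in final.inputs)
    sql = (
        f"SELECT t.* FROM {left} AS t WHERE EXISTS ("
        f"SELECT 1 FROM {right} AS r "
        "WHERE ST_DWithin(t.geom::geography, r.geom::geography, %s))"
    )
    return sql, final.params


# Example: "crashes within 250 m of a school", with loosely named layers.
frame, fixes = validate(Frame(target="crash", near="School", distance_m=250))
sql, params = to_sql(compile_dag(frame))
print(fixes)
print(sql, params)
```

In this sketch the language model would only ever produce a `Frame`. Everything after `validate` is deterministic, so the same frame always yields the same SQL, and the list of fixes gives a reviewable record of what the validation layer changed.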