Causal Discovery in the Age of Agents

Abstract

Recent attempts to combine large language models (LLMs) with causal discovery ask models to infer pairwise directions, propose graph structures, or inject language-model outputs as priors and constraints. These approaches promise faster analysis, but they also obscure whether a causal claim is supported by data and assumptions or merely by textual associations, prompt artifacts, and hallucinated mechanisms. We argue for a different role for agents in causal discovery. Agents should inspect data, retrieve context, explain method assumptions, and clarify graph outputs, but they should not supply edges, orientations, priors, constraints, or causal conclusions. We propose the principle that agents assist the workflow, while causal claims remain grounded in data, explicit assumptions, formal algorithms, diagnostics, and the decisions of users or domain experts. We instantiate this principle in causal-learn+, an online platform that coordinates data analysis, preprocessing, method recommendation, expert-knowledge incorporation, formal discovery, and interpretation around the algorithmic ecosystem of causal-learn. A case study on Big Five personality data illustrates an agent-assisted causal discovery pipeline that does not turn language-model unreliability into causal evidence. The platform is available at causallearn.com.