AutoMIA: Improved Baselines for Membership Inference Attack via Agentic Self-Exploration

Abstract

Membership Inference Attacks (MIAs) serve as a fundamental auditing tool for evaluating training data leakage in machine learning models. However, existing methodologies predominantly rely on static, handcrafted heuristics that lack adaptability, often leading to suboptimal performance when transferred across different large models. In this work, we propose AutoMIA, an agentic framework that reformulates membership inference as an automated process of self-exploration and strategy evolution. Given high-level scenario specifications, AutoMIA self-explores the attack space by generating executable logit-level strategies and progressively refining them through closed-loop evaluation feedback. By decoupling abstract strategy reasoning from low-level execution, our framework enables a systematic, model-agnostic traversal of the attack search space. Extensive experiments demonstrate that AutoMIA consistently matches or outperforms state-of-the-art baselines while eliminating the need for manual feature engineering.
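
The closed-loop idea described above (propose a logit-level strategy, evaluate it, refine it using the feedback) can be pictured with a toy sketch. This is not AutoMIA's implementation: a simple hill-climbing proposer stands in for the agent, a Min-K%-style score over per-token log-probabilities serves as one example of a logit-level strategy, and every name here (`make_min_k`, `auc`, `self_explore`, `synthetic_data`) as well as the synthetic data is an assumption made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A strategy maps one sample's per-token log-probabilities to a membership score
# (higher score = more likely a training member).
Strategy = Callable[[Sequence[float]], float]


def make_min_k(k: float) -> Strategy:
    """Min-K%-style strategy: mean of the lowest k fraction of token log-probs."""
    def strategy(logprobs: Sequence[float]) -> float:
        n = max(1, int(len(logprobs) * k))
        lowest = sorted(logprobs)[:n]
        return sum(lowest) / n
    return strategy


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random member outscores a random non-member (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(strategy: Strategy, samples, labels) -> float:
    """Evaluation feedback: AUC of the strategy on a labeled calibration set."""
    return auc([strategy(s) for s in samples], labels)


def self_explore(samples, labels, rounds: int = 4, k: float = 0.5,
                 step: float = 0.2) -> Tuple[float, float, List[Tuple[float, float]]]:
    """Hypothetical stand-in for the agent: hill-climb over the parameter k."""
    best_k = k
    best_auc = evaluate(make_min_k(k), samples, labels)
    history = [(best_k, best_auc)]
    for _ in range(rounds):
        # Propose two neighbouring strategies around the current best.
        for cand in (best_k - step, best_k + step):
            cand = min(1.0, max(0.05, cand))
            score = evaluate(make_min_k(cand), samples, labels)
            history.append((cand, score))
            if score > best_auc:  # keep a candidate only if feedback improves
                best_k, best_auc = cand, score
        step /= 2  # narrow the search as the loop progresses
    return best_k, best_auc, history


def synthetic_data(n_per_class: int = 50, n_tokens: int = 20, seed: int = 0):
    """Toy data (an assumption): members get higher log-probs than non-members."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label, mean in ((1, -1.0), (0, -2.0)):
        for _ in range(n_per_class):
            samples.append([min(0.0, rng.gauss(mean, 1.0)) for _ in range(n_tokens)])
            labels.append(label)
    return samples, labels


samples, labels = synthetic_data()
best_k, best_auc, history = self_explore(samples, labels)
print(f"best k = {best_k:.3f}, AUC = {best_auc:.3f}, candidates tried = {len(history)}")
```

In a real agentic setting the proposer would presumably generate new scoring code rather than tune a single parameter, but the loop structure (propose, execute, score, keep what improves) would stay the same.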
