Diverse Model Discovery via Structured Table Discovery

Abstract

Model cards describe model behavior through a mixture of textual descriptions and structured artifacts, including performance, configuration, and dataset tables. Existing model search systems rely predominantly on semantic similarity over text, which can produce homogeneous result sets and limit exploration of alternatives. We argue that model search is inherently comparative: users want models that are task-aligned yet differentiated in measurable ways. We hypothesize that striking this balance requires retrieval over condensed, high-quality evidence rather than verbose descriptions, and that much of this evidence is concentrated in structured tables. We present StructuredSemanticSearch, a table-driven model search framework built on the ModelTables benchmark. Given a query, StructuredSemanticSearch combines a semantic baseline for task alignment with a structure-aware pipeline that discovers query-related model-card tables using table discovery operators such as unionability, joinability, and keyword search. Retrieved tables are mapped back to model cards under a controlled top-k budget, enabling a fair comparison between text-based and table-based retrieval. Beyond retrieval, StructuredSemanticSearch adapts table integration to the model-table domain through orientation-aware integration, producing compact integrated views from partially overlapping and sometimes transposed evidence tables. For evaluation, we introduce a nugget-based, auditable protocol that extracts compact evidence items from model cards, matches queries to condition- or intent-specific nuggets, and measures evidence coverage and diversity over the retrieved model-card candidate sets. This protocol also provides a scalable path toward approximate, evidence-based labeling in dynamic model lakes. Experiments on 597 model-recommendation queries show that the structure-aware pipeline achieves higher nugget coverage than the semantic baseline.
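
As an illustration of the table-to-card mapping and the coverage measurement described above, the following is a minimal sketch and not the framework's actual implementation. The function names (`tables_to_cards`, `nugget_coverage`), the input shapes, and the coverage definition (the fraction of query-matched nuggets found in at least one retrieved card) are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set, Tuple


def tables_to_cards(
    ranked_tables: Iterable[Tuple[str, float]],
    table_owner: Dict[str, str],
    k: int,
) -> List[str]:
    """Map scored tables back to at most k distinct model cards.

    Hypothetical helper: a card's rank is the rank of its best-scoring table.
    """
    if k <= 0:
        return []
    cards: List[str] = []
    seen: Set[str] = set()
    # Visit tables from highest to lowest score.
    for table_id, _score in sorted(ranked_tables, key=lambda t: t[1], reverse=True):
        card = table_owner.get(table_id)
        if card is None or card in seen:
            continue  # unknown table, or card already selected via a better table
        seen.add(card)
        cards.append(card)
        if len(cards) == k:
            break  # top-k budget exhausted
    return cards


def nugget_coverage(
    cards: Iterable[str],
    card_nuggets: Dict[str, Set[str]],
    query_nuggets: Set[str],
) -> float:
    """Fraction of query-matched nuggets present in at least one retrieved card.

    Assumed definition for illustration only.
    """
    if not query_nuggets:
        return 0.0
    covered: Set[str] = set()
    for card in cards:
        covered |= card_nuggets.get(card, set()) & query_nuggets
    return len(covered) / len(query_nuggets)


# Toy data (invented for the example): four tables owned by three cards.
ranked = [("t1", 0.9), ("t2", 0.8), ("t3", 0.7), ("t4", 0.6)]
owner = {"t1": "cardA", "t2": "cardA", "t3": "cardB", "t4": "cardC"}
nuggets = {"cardA": {"n1"}, "cardB": {"n2", "n9"}, "cardC": {"n3"}}

top_cards = tables_to_cards(ranked, owner, k=2)
coverage = nugget_coverage(top_cards, nuggets, {"n1", "n2", "n3"})
```

Because both the text-based and the table-based pipeline would emit a card list of the same length k, the same `nugget_coverage` call can score either one, which is the sense in which a fixed budget makes the comparison fair in this sketch.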