Text-Driven Neural Collaborative Filtering Model for Paper Source Tracing

Abstract

Identifying significant references within a citation knowledge graph is challenging because the papers in the graph are interrelated through citations, authorship, keywords, and other relational attributes. The Paper Source Tracing (PST) task seeks to automate the identification of pivotal references for a given scholarly article using advanced data mining techniques. For KDD CUP 2024, we design a recommendation-based framework tailored to the PST task. The framework employs the Neural Collaborative Filtering (NCF) model to generate final predictions. To process the textual attributes of the papers and extract input features for the model, we use SciBERT, a pre-trained language model. In our experiments, the method achieved a score of 0.37814 on the Mean Average Precision (MAP) metric, outperforming baseline models and ranking 11th among all participating teams. The source code is publicly available at https://github.com/MyLove-XAB/KDDCupFinal.
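For readers unfamiliar with the MAP metric reported above, the following is a minimal sketch of how it can be computed, not the competition's official evaluation script. It assumes each paper comes with a binary label per candidate reference (1 if the reference is a source paper, 0 otherwise) and a model score per reference. The function names `average_precision` and `mean_average_precision`, and the choice to score a paper with no positive labels as 0.0, are illustrative assumptions.

```python
from typing import Iterable, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one paper (hypothetical helper).

    labels[i] is 1 if candidate reference i is a source paper, else 0.
    scores[i] is the model's predicted importance for reference i.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Rank references from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision at this rank, counted only where a true source appears.
            precision_sum += hits / rank

    # Assumption: a paper with no labeled source contributes 0.0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    papers: Iterable[Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Mean of per-paper average precision over (labels, scores) pairs."""
    aps = [average_precision(labels, scores) for labels, scores in papers]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    toy_papers = [
        ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),  # sources ranked 1st and 3rd
        ([0, 1], [0.6, 0.4]),                  # single source ranked 2nd
    ]
    print(round(mean_average_precision(toy_papers), 4))
```

In the toy data, the first paper's sources are ranked first and third, so its average precision is (1/1 + 2/3) / 2, and the second paper's single source is ranked second, giving 1/2. MAP is the mean of these two per-paper values.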
