Statistics and Its Interface

Volume 15 (2022)

Number 3

Link prediction via latent space logistic regression model

Pages: 267 – 282

DOI: https://dx.doi.org/10.4310/21-SII684

Authors

Rui Pan (Central University of Finance and Economics, Beijing, China)

Xiangyu Chang (Xi’an Jiaotong University, Xi’an, Shaanxi, China)

Xuening Zhu (Fudan University, Shanghai, China)

Hansheng Wang (Peking University, Beijing, China)

Abstract

Link prediction is of vital importance in the operation of social network platforms. One typical application is making accurate recommendations to increase user activity. In this article, we propose a latent space logistic regression model for link prediction. The model takes both the users’ attributes and the latent social space into consideration. Two pseudo maximum likelihood estimators are proposed for parameter estimation. They correspond to the concepts of reciprocity and transitivity, respectively, and are computationally efficient for large-scale social networks. Extensive simulation studies are provided to evaluate the finite sample performance of the newly proposed methodology. Finally, a real data set from Sina Weibo is presented for illustration purposes.

Keywords

latent social space, link prediction, logistic regression, network topology, social networks

The research of Rui Pan is supported in part by the National Natural Science Foundation of China (Nos. 11601539, 11631003), the disciplinary funding and the Emerging Interdisciplinary Project of Central University of Finance and Economics. The research of Xiangyu Chang is supported in part by the National Natural Science Foundation of China (No. 11771012) and the Natural Science Foundation of Shaanxi Province of China (No. 2021JC-01). The research of Xuening Zhu is supported by the National Natural Science Foundation of China (Nos. 11901105, 71991472, U1811461) and the Shanghai Sailing Program for Youth Science and Technology Excellence (19YF1402700). The research of Hansheng Wang is partially supported by the National Natural Science Foundation of China (No. 11831008) and by the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science (KLATASDSMOE-ECNU-KLATASDS2101).

Received 10 April 2020

Accepted 6 June 2021

Published 14 February 2022