Communications in Mathematical Sciences
Volume 16 (2018)
A semi-supervised heat kernel pagerank MBO algorithm for data classification
Pages: 1241 – 1265
We present an accurate and efficient graph-based algorithm for semi-supervised classification that is motivated by recent successful threshold dynamics approaches and derived using heat kernel pagerank. Two different techniques are proposed to compute the pagerank, one of which proceeds by simulating random walks of bounded length. The algorithm produces accurate results even when the number of labeled nodes is very small, and avoids computationally expensive linear algebra routines. Moreover, the accuracy of the procedure is comparable with or better than that of state-of-the-art methods and is demonstrated on benchmark data sets. In addition to the main algorithm, a simple novel procedure that uses heat kernel pagerank directly as a classifier is outlined. We also provide detailed analysis, including information about the time complexity, of all proposed methods.
heat kernel pagerank, graphs, graph Laplacian, threshold dynamics, random walk
2010 Mathematics Subject Classification
35K05, 35K08, 35R02, 60J75, 62-07, 62-09, 65S05, 94C15, 97K30
This work was supported by NSF grants DMS-1417674 and DMS-1118971, ONR grant N00014-16-1-2119 and AFSOR FA9550-09-1-0090. The first author was supported by the UC President’s Postdoctoral Fellowship.
Received 6 July 2016
Received revised 21 April 2018
Accepted 21 April 2018
Published 19 December 2018