Statistics and Its Interface

Volume 11 (2018)

Number 1

Data stream clustering by fast density-peak-search

Pages: 183 – 189

DOI: https://dx.doi.org/10.4310/SII.2018.v11.n1.a15

Authors

Jinxia Su (School of Mathematics and Statistics, Lanzhou University, Lanzhou, China)

Yanwen Li (College of Information Science and Engineering, Shanxi Agricultural University, Jinzhong, Shanxi, China)

Xuejing Zhao (School of Mathematics and Statistics, Lanzhou University, Lanzhou, China)

Abstract

Data stream mining has recently been studied extensively in the literature, and many clustering algorithms have been proposed to handle massive streams of data. However, many of these algorithms may not be as efficient as desired for data streams, because they typically require a number of iterations in their implementations. In this paper, we propose a new data stream clustering algorithm based on the fast density-peak-search method. It does not require any iterations in its implementation and is therefore particularly suitable for large streams of data. We compare it with alternative data stream clustering algorithms on numerical illustrations and a real example.

Keywords

clustering, data stream, Gaussian kernel density, centrifugal distance, density peaks

2010 Mathematics Subject Classification

Primary 62-07. Secondary 68U20.

The project was sponsored by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, Ministry of Education of China, and supported by the National Natural Science Foundation of China (No. 11571156) and by the Fundamental Research Funds for the Central Universities, Lanzhou University, P.R. China (Nos. lzujbky-2012-15, lzujbky-2013-178).

Received 31 October 2016

Published 23 August 2017