Statistics and Its Interface

Volume 9 (2016)

Number 3

An EM algorithm for click fraud detection

Pages: 389 – 394

DOI: https://dx.doi.org/10.4310/SII.2016.v9.n3.a12

Authors

Xuening Zhu (Guanghua School of Management, Peking University, Beijing, China)

Da Huang (School of Management, Fudan University, Shanghai, China)

Rui Pan (School of Statistics and Mathematics, Central University of Finance and Economics, Beijing, China)

Hansheng Wang (Guanghua School of Management, Peking University, Beijing, China)

Abstract

This paper addresses the problem of click fraud detection. We assume that each visitor to a website carries a latent indicator that labels him or her as a regular or a malicious user. Information such as the number of clicks, the number of page views (PVs), and the time difference between consecutive clicks is incorporated into our newly proposed statistical model. We allow these random variables to follow the same family of distributions, but with different parameters according to the visitor’s type. An EM algorithm is then proposed to obtain the maximum likelihood estimator. As a result, click fraud detection can be implemented by estimating the posterior probability that each visitor is malicious. Simulation studies are conducted to assess the finite-sample performance. We also demonstrate the usefulness of the proposed method via an empirical analysis of a real-life example from search engine marketing.
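
As a rough illustration of the idea described in the abstract, the following Python sketch runs EM on a two-component mixture using click counts only. It is not the authors' model: the Poisson assumption, the function names `posterior` and `em_poisson_mixture`, the starting values, the simulated data, and the convention that component 1 is the higher-rate (malicious) group are all assumptions made for this example. Page views and inter-click times, which the paper also uses, are omitted.

```python
import numpy as np


def posterior(x, pi, lam0, lam1):
    """Posterior probability of component 1 under a two-Poisson mixture.

    The log(x!) term is shared by both components and cancels in the ratio,
    so it is left out. Everything here is an illustrative assumption.
    """
    log_w1 = np.log(pi) + x * np.log(lam1) - lam1
    log_w0 = np.log1p(-pi) + x * np.log(lam0) - lam0
    # Clip the log-odds so that exp() cannot overflow.
    return 1.0 / (1.0 + np.exp(np.clip(log_w0 - log_w1, -700.0, 700.0)))


def em_poisson_mixture(x, n_iter=500, tol=1e-10):
    """EM for a two-component Poisson mixture (hypothetical helper).

    Returns (pi, lam0, lam1, post), where post[i] is the estimated posterior
    probability that observation i comes from component 1.
    """
    x = np.asarray(x, dtype=float)
    m = x.mean()
    # Crude starting values: equal weights, rates below and above the mean.
    pi, lam0, lam1 = 0.5, 0.5 * m + 1e-3, 1.5 * m + 1e-3
    for _ in range(n_iter):
        # E-step: posterior membership probabilities given current parameters.
        post = posterior(x, pi, lam0, lam1)
        # M-step: weighted proportion and weighted means, kept away from 0.
        new_pi = float(np.clip(post.mean(), 1e-12, 1.0 - 1e-12))
        new_lam1 = max(float((post * x).sum() / max(post.sum(), 1e-12)), 1e-12)
        new_lam0 = max(
            float(((1.0 - post) * x).sum() / max((1.0 - post).sum(), 1e-12)),
            1e-12,
        )
        change = max(abs(new_pi - pi), abs(new_lam0 - lam0), abs(new_lam1 - lam1))
        pi, lam0, lam1 = new_pi, new_lam0, new_lam1
        if change < tol:
            break
    # Final posteriors evaluated at the fitted parameters.
    return pi, lam0, lam1, posterior(x, pi, lam0, lam1)


# Simulated data (made up for this sketch): 900 regular visitors with a low
# click rate and 100 malicious visitors with a high click rate.
rng = np.random.default_rng(0)
clicks = np.concatenate([rng.poisson(2.0, 900), rng.poisson(20.0, 100)])
pi_hat, lam0_hat, lam1_hat, post = em_poisson_mixture(clicks)

# Flag visitors whose estimated posterior malicious probability exceeds 0.5.
flagged = post > 0.5
```

In this sketch the E-step computes each visitor's posterior membership probability and the M-step updates the mixing proportion and the two rates by weighted averages. The 0.5 threshold used for flagging is arbitrary and would need to be chosen for the application at hand.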

Keywords

EM algorithm, maximum likelihood estimator, search engine marketing

2010 Mathematics Subject Classification

62H30

Published 27 January 2016