Statistics and Its Interface

Volume 5 (2012)

Number 1

Protein structural model selection based on protein-dependent scoring function

Pages: 109 – 115

DOI: https://dx.doi.org/10.4310/SII.2012.v5.n1.a10

Authors

Zhiquan He (Department of Computer Science, University of Missouri, Columbia, Mo., U.S.A.)

Yi Shang (Department of Computer Science, University of Missouri, Columbia, Mo., U.S.A.)

Dong Xu (Department of Computer Science, University of Missouri, Columbia, Mo., U.S.A.)

Yang Xu (Department of Computer Science, University of Missouri, Columbia, Mo., U.S.A.)

Jingfen Zhang (Department of Computer Science, University of Missouri, Columbia, Mo., U.S.A.)

Abstract

Selection of good models from a structural model pool is an important and challenging step in protein structure prediction. While various scoring functions have been developed, their performance in protein structure prediction remains unsatisfactory. In this study, we developed a novel two-stage optimization method that combines a set of basic scoring functions to improve selection performance. In the first stage, protein-dependent optimization, the method combines seven scoring functions and optimizes their weights on the model pool of each protein. In the second stage, the method integrates the scores computed with the optimized protein-dependent weights, and then uses a Support Vector Machine (SVM) to learn correlations among these scores and structural features in order to predict the quality of protein structures. Test results on two benchmarks built with different model generation methods showed that the sum of basic scoring functions with optimized weights achieved better model selection performance than any individual scoring function or an equal-weight combination of these scoring functions. A leave-one-out test demonstrated further improvement in the second stage over the weighted-sum score.
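As an illustration of the first-stage idea only, the sketch below ranks the models in one protein's pool by a weighted sum of standardized scores. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the weights are optimized or how scores are normalized, so the z-score normalization, the "lower is better" convention, the toy data, and the names `zscore_columns`, `weighted_scores`, and `select_model` are all hypothetical. The weights are simply passed in.

```python
from statistics import mean, pstdev


def zscore_columns(scores):
    """Standardize each scoring function (column) across the model pool.

    Assumption: z-score normalization is used here only so that scoring
    functions on different scales can be summed; the abstract does not
    specify a normalization.
    """
    normalized_columns = []
    for col in zip(*scores):
        mu, sigma = mean(col), pstdev(col)
        # A constant column carries no ranking information; map it to zeros.
        normalized_columns.append(
            [(v - mu) / sigma if sigma > 0 else 0.0 for v in col]
        )
    # Transpose back to one row per model.
    return [list(row) for row in zip(*normalized_columns)]


def weighted_scores(scores, weights):
    """Return the weighted sum of standardized scores for every model."""
    if any(len(row) != len(weights) for row in scores):
        raise ValueError("each model needs exactly one score per weight")
    normalized = zscore_columns(scores)
    return [sum(w * s for w, s in zip(weights, row)) for row in normalized]


def select_model(scores, weights):
    """Return the index of the model with the lowest combined score.

    Assumption: scores are energy-like, so lower means better.
    """
    combined = weighted_scores(scores, weights)
    return min(range(len(combined)), key=combined.__getitem__)


# Toy pool: three models, two scoring functions (the paper combines seven).
pool = [
    [1.0, 10.0],
    [2.0, 0.0],
    [3.0, 5.0],
]

equal_weight_choice = select_model(pool, [0.5, 0.5])
protein_specific_choice = select_model(pool, [1.0, 0.0])
print(equal_weight_choice, protein_specific_choice)
```

With this toy pool, changing the weights changes which model is selected, which is the effect that protein-dependent weights rely on. The second stage described in the abstract (an SVM trained on the weighted scores and structural features) is not shown here.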

Keywords

protein model selection, score combination, scoring functions

Published 17 February 2012