Mining Tandem Mass Spectral Data to Develop a More Accurate Mass Error Model for Peptide Identification

Fu Y, Gao W, He S, Sun R, Zhou H, Zeng R

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China

Pac Symp Biocomput. 2007:421-432.


peptide identification by tandem mass spectra. Previous mass error models are simplistic uniform or normal distributions with empirically set parameter values. In this paper, we propose a more accurate mass error model, namely the conditional normal model, and an iterative parameter learning algorithm. The new model is based on two important observations about the mass error distribution: the mean of the mass error is linear in the ion mass, and the standard deviation of the mass error is log-log linear in the peak intensity. To our knowledge, the latter quantitative relationship has not been reported before. Experimental results demonstrate that our approach accurately quantifies the mass error distribution and that the new model improves the accuracy of peptide identification.
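
The two observations above can be illustrated with a minimal sketch. The abstract does not give the model equations or the learning algorithm, so everything here is an assumption made for illustration: the parametrization (mean = a * mass + b, log(sigma) = c * log(intensity) + d), the function names, and the alternating least-squares fit, which is a generic stand-in and not the authors' iterative algorithm.

```python
import math
import random

# E[log|Z|] for Z ~ N(0, 1); used to convert a fit of log|residual|
# into a fit of log(sigma).
MEAN_LOG_ABS_STD_NORMAL = -(0.5772156649015329 + math.log(2.0)) / 2.0


def line_fit(x, y, w=None):
    """Weighted least-squares fit of y = slope * x + intercept."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def log_pdf(error, mass, intensity, a, b, c, d):
    """Log density of a mass error under the assumed conditional normal model."""
    mu = a * mass + b                              # mean linear in ion mass
    log_sigma = c * math.log(intensity) + d        # log-log linear std
    z = (error - mu) / math.exp(log_sigma)
    return -0.5 * z * z - log_sigma - 0.5 * math.log(2.0 * math.pi)


def fit_conditional_normal(masses, intensities, errors, n_iter=5):
    """Alternate between fitting the mean line and the log-sigma line."""
    log_i = [math.log(v) for v in intensities]
    weights = [1.0] * len(errors)                  # start with equal variances
    for _ in range(n_iter):
        # Step 1: mean parameters by least squares weighted with 1 / sigma^2.
        a, b = line_fit(masses, errors, weights)
        # Step 2: regress log|residual| on log(intensity), then correct the
        # intercept by the known bias E[log|Z|].
        log_abs_res = [math.log(max(abs(e - (a * m + b)), 1e-12))
                       for m, e in zip(masses, errors)]
        c, d_raw = line_fit(log_i, log_abs_res)
        d = d_raw - MEAN_LOG_ABS_STD_NORMAL
        weights = [math.exp(-2.0 * (c * li + d)) for li in log_i]
    return a, b, c, d


def simulate(n, a, b, c, d, seed=0):
    """Draw synthetic (mass, intensity, error) triples from the assumed model."""
    rng = random.Random(seed)
    masses, intensities, errors = [], [], []
    for _ in range(n):
        m = rng.uniform(100.0, 2000.0)
        i = math.exp(rng.uniform(math.log(10.0), math.log(10000.0)))
        sigma = math.exp(c * math.log(i) + d)
        masses.append(m)
        intensities.append(i)
        errors.append(rng.gauss(a * m + b, sigma))
    return masses, intensities, errors
```

On synthetic data drawn with `simulate`, `fit_conditional_normal` should approximately recover the parameters used to generate it. The fitted `log_pdf` could then replace a fixed-tolerance uniform or normal error term when scoring matches between observed peaks and theoretical fragment ions.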