Methods for efficient protein identification based on mass spectrometry used for analysis of the mitochondrial outer membrane proteome of baker's yeast Saccharomyces cerevisiae
University of Würzburg, Germany, 2006

Data processing in mass spectrometry-based proteomics, from the recording of the spectra onward, has repeatedly proven to be a bottleneck in proteomics projects. The analysis of complex samples needs efficient software support. The present work focuses on this problem and provides solutions.
With this problem solved, important proteins of the mitochondrial proteome of Saccharomyces cerevisiae, of the human platelet and of Dictyostelium discoideum could be identified efficiently, including some of their modifications. The developed procedures were used in completing the mitochondrial proteome of Saccharomyces cerevisiae and were applied to the analysis of the centrosomal proteome of Dictyostelium discoideum.
The results obtained with the developed computerized methods were published by the users in peer-reviewed scientific journals (Sickmann, A. et al., 2003, Proceedings of the National Academy of Sciences (PNAS), 100 (23), 13207-13212; Moebius, J. et al., 2005, Molecular & Cellular Proteomics, 4 (11), 1754-1761; Zahedi, R. P. et al., 2005, Proteomics, 5 (14), 3581-3588; Zahedi, R. P. et al., 2006, Molecular Biology of the Cell, 17 (3), 1436-1450; Lewandrowski, U. et al., 2006, Molecular & Cellular Proteomics, 5 (2), 226-233; Reinders, Y. et al., 2006, Journal of Proteome Research, 5 (3), 589-598).
The aim was to implement and provide an infrastructure suitable for large-scale data analysis and complex data mining in the field of protein mass spectrometry. For every step of data analysis in mass spectrometry-based proteomics projects, efficient software support was designed and implemented. For instance, data conversion was optimized, and an integrated platform for data interpretation algorithms was created and established. To realize this, data conversion solutions had to be implemented for two purposes: to obtain mass spectrum data in a portable format, and to provide the interpretation results for further processing.

The distinguishing feature of the system, named paOla, is its use of a relational database system, which allows complex queries on the data. As a result, the amounts of data typically acquired in proteome studies can be handled efficiently. In addition, a major module was designed that integrates commonly used algorithms for the interpretation of mass spectrometric data and runs them on distributed systems, including shared computers. This reduces the wall-clock time of the analysis. paOla is designed as an open system into which further software tools can easily be integrated. A scoring scheme was developed for evaluating consensus results obtained from peptide identifications. This score is visualized using polymetric views.
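The idea of querying peptide identifications from several interpretation algorithms in a relational database can be illustrated with a minimal sketch. It does not reproduce paOla's actual schema or scoring scheme: the table name peptide_hit, the engine labels, the example peptides and the simple "fraction of agreeing engines" score are all assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical peptide hits."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE peptide_hit ("
        " spectrum_id INTEGER, engine TEXT, peptide TEXT)"
    )
    # Hypothetical results: three search engines, two spectra.
    rows = [
        (1, "engine_a", "LVNELTEFAK"),
        (1, "engine_b", "LVNELTEFAK"),
        (1, "engine_c", "LVNELTEFAK"),
        (2, "engine_a", "AEFVEVTK"),
        (2, "engine_b", "AEFVEVTK"),
        (2, "engine_c", "YLYEIAR"),
    ]
    conn.executemany("INSERT INTO peptide_hit VALUES (?, ?, ?)", rows)
    return conn


def consensus(conn):
    """Return (spectrum_id, peptide, score) tuples.

    The score is the fraction of engines that report the same peptide
    for a spectrum (an assumed, deliberately simple consensus measure).
    """
    n_engines = conn.execute(
        "SELECT COUNT(DISTINCT engine) FROM peptide_hit"
    ).fetchone()[0]
    query = """
        SELECT spectrum_id, peptide, COUNT(DISTINCT engine) AS votes
        FROM peptide_hit
        GROUP BY spectrum_id, peptide
        ORDER BY spectrum_id, votes DESC, peptide
    """
    return [(s, p, v / n_engines) for s, p, v in conn.execute(query)]


if __name__ == "__main__":
    for spectrum_id, peptide, score in consensus(build_demo_db()):
        print(spectrum_id, peptide, round(score, 2))
```

Keeping the hits in a relational table means that the grouping and counting are expressed as a single query, which is the kind of complex query the database approach is meant to support.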
The system is capable of keeping an integrated protein sequence database up to date. This database is fully non-redundant and provides a solution for the alias problem of protein names and accession identifiers. It can export sequence database files suitable as a basis for protein identifications by mass spectrometry.
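A non-redundant sequence collection with alias resolution can be sketched as follows. This is only one possible approach under stated assumptions, not the implementation used in paOla: sequences are keyed by a hash of their residues, every accession is stored as an alias of that key, and the class name SequenceStore, its methods and the accessions in the usage example are hypothetical.

```python
import hashlib


class SequenceStore:
    """Stores each distinct protein sequence once; accessions are aliases."""

    def __init__(self):
        self._sequences = {}  # digest -> sequence
        self._aliases = {}    # accession -> digest

    def add(self, accession, sequence):
        """Register an accession; identical sequences share one entry."""
        seq = "".join(sequence.split()).upper()  # normalize whitespace and case
        digest = hashlib.sha1(seq.encode("ascii")).hexdigest()
        self._sequences.setdefault(digest, seq)
        self._aliases[accession] = digest
        return digest

    def aliases_of(self, accession):
        """Return all accessions that refer to the same sequence."""
        digest = self._aliases[accession]
        return sorted(a for a, d in self._aliases.items() if d == digest)

    def to_fasta(self):
        """Export one FASTA record per distinct sequence."""
        lines = []
        for digest, seq in self._sequences.items():
            names = sorted(a for a, d in self._aliases.items() if d == digest)
            lines.append(">" + "|".join(names))
            lines.append(seq)
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    store = SequenceStore()
    store.add("ACC_1", "MKTAY")  # hypothetical accessions and sequences
    store.add("ACC_2", "mktay")  # same protein under another name
    store.add("ACC_3", "MSLLK")
    print(store.aliases_of("ACC_2"))
    print(store.to_fasta(), end="")
```

Because each sequence is stored once and all names point to it, the exported file contains no duplicate entries, and a lookup by any alias reaches the same record.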
As a major part of this work, the system paOla was a finalist in the European Academic Software Award 2004 of the European Knowledge Media Association. All relevant results have been published in peer-reviewed scientific journals (Boehm, A. M. et al., 2004, Bioinformatics, 20, 2889-2891; Boehm, A. M. et al., 2004, BMC Bioinformatics, 5, 162; Boehm, A. M. et al., 2005, FEBS Journal, 272, A3-011P; Grosse-Coosmann, F. et al., 2005, BMC Bioinformatics, 6, 290; Zahedi, R. P. et al., 2006, Molecular Biology of the Cell, 17 (3), 1436-1450; Boehm, A. M. et al., 2006, Proteomics, 6 (15), 4223-4226). In addition, selected results have been presented at the meetings Beyond Genome 2004, GCB 2004, GBM Herbsttagung 2004, EASA/EKMA 2004, Igler MS-Tage 2005, FEBS Congress 2005, GBM Herbsttagung 2005, GCB 2005 and the GvD Workshop of the Gesellschaft für Informatik 2006.