
Bibliographic information of the Deutsche Nationalbibliothek:

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2013 Dr. Volker Egelhofer

Production and publisher:

BoD – Books on Demand GmbH, Norderstedt

ISBN: 978-3-7357-3254-5

Table of contents

Introduction

In the past, bioinformatics was mostly classified as a bridge discipline between informatics and biology rather than as an independent scientific discipline. With the comprehensive accumulation of biological data and the resulting challenges, however, bioinformaticians now concentrate more on their own research rather than simply serving as technologists for others. Today the focus in bioinformatics is mainly on developing sophisticated algorithms that extract useful knowledge from large data sets by combining methods from statistics and artificial intelligence. Computational proteomics and metabolomics are among the fastest-emerging fields of bioinformatics. Metabolomics is the study of the small molecules (metabolites) in biological samples such as cells or tissues. It covers their identification and quantification as well as the interactions between them.

Proteomics is the study of the entire set of proteins of a given cell type, cell compartment or specific tissue under defined conditions at a specific time. Combined with high-resolution mass spectrometry (MS), the technology allows the identification and quantification of thousands of proteins. The classical method for protein identification by mass spectrometry is protein separation by 2D gel electrophoresis (2D-PAGE) followed by MS or tandem mass spectrometry (MS/MS). In the currently predominant shotgun proteomics strategy, a proteolytic digest of the protein sample is analysed by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS). In such a pipeline, one MS1 (full scan) spectrum is acquired roughly every second, and a set of the most abundant ions from the MS1 scan is selected for fragmentation and recording of MS2 (MS/MS) spectra.
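The selection of the most abundant MS1 ions can be sketched in a few lines of Python. This is a minimal illustration only: the function name select_top_n, the peak representation as (m/z, intensity) tuples and the peak values are assumptions made for this example, and real instruments additionally apply rules such as dynamic exclusion and charge-state filtering that are not modelled here.

```python
def select_top_n(ms1_peaks, n=10):
    """Return the n most intense peaks of one MS1 scan, most intense first.

    ms1_peaks is a list of (mz, intensity) tuples (hypothetical format).
    """
    return sorted(ms1_peaks, key=lambda peak: peak[1], reverse=True)[:n]


# Invented example scan: (m/z, intensity)
ms1_scan = [
    (400.21, 1200.0),
    (512.77, 8500.0),
    (623.30, 300.0),
    (745.41, 4100.0),
    (803.95, 950.0),
]

# The three most abundant ions would be passed on for fragmentation (MS2).
precursors = select_top_n(ms1_scan, n=3)
print(precursors)
```

With n=3, the three peaks with the highest intensity are returned in descending order of intensity; the weaker ions are not fragmented in this cycle.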

Peptides are identified from the acquired MS/MS spectra either by a database search or by a spectrum-to-spectrum search. In the first case, the experimental m/z values are compared with calculated m/z values derived from peptides produced by an in silico digestion of a protein sequence library. This approach presupposes a robust and reliable functional protein sequence annotation. In the second case, peptides are identified by matching the spectra of an unknown shotgun analysis against the reference spectra of a spectral library. Advantages of this approach are a simpler visual discrimination of false-positive identifications and a shorter search time, because the library holds fewer candidate peptide sequences.
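The database search idea can be illustrated with a small Python sketch. It assumes trypsin as the protease (cleavage after K or R, but not before P), uses standard monoisotopic residue masses, and compares only the precursor m/z within a tolerance given in ppm. The function names, the toy protein sequence and the tolerance are assumptions chosen for this example; real search engines also score the fragment ions of each MS/MS spectrum, which is omitted here.

```python
# Monoisotopic masses of the amino acid residues in Da
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of one proton


def tryptic_digest(sequence):
    """In silico digest: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):          # C-terminal peptide without K/R
        peptides.append(sequence[start:])
    return peptides


def peptide_mz(peptide, charge=1):
    """Calculated m/z of a peptide carrying `charge` protons."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (mass + charge * PROTON) / charge


def match_precursor(observed_mz, peptides, charge=1, tol_ppm=10.0):
    """Return all peptides whose calculated m/z lies within tol_ppm."""
    hits = []
    for peptide in peptides:
        calculated = peptide_mz(peptide, charge)
        if abs(observed_mz - calculated) / calculated * 1e6 <= tol_ppm:
            hits.append(peptide)
    return hits


# Toy protein sequence, invented for illustration
peptides = tryptic_digest("MKRPAGKLLR")
print(peptides)
print(match_precursor(401.2871, peptides, charge=1))
```

The digest keeps "RPAGK" intact because the arginine is followed by proline. An observed singly charged ion at m/z 401.2871 then matches only the peptide "LLR" within 10 ppm.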

***

Aim of the book

The aim of this book is to give the reader a basic insight into mass spectrometry-based bioinformatics. This beginner's guide illustrates some of the common algorithmic problems that occur in typical high-throughput mass spectrometry protein identification experiments. It also provides a general introduction to the Python programming language, including standard programming techniques and their role in problem solving.

***

Book organization

***

Programming languages

A programming language is used to control the behavior of a computer. Like human languages, programming languages have syntactic and semantic rules that define meaning.

Possible Classification