BITS - Protein inference from mass spectrometry data

This is the fifth presentation of the BITS training on 'Mass spec data processing'.

It reviews the problems of determining protein sequences from mass spec data and how to deal with them, and gives an overview of useful tools.

Thanks to the Compomics Lab of the VIB for their contribution.

Published in: Technology

BITS - Protein inference from mass spectrometry data

  1. http://www.bits.vib.be/training
  2. Peptide validation and protein inference. Kenny Helsens (kenny.helsens@ugent.be), Computational Omics and Systems Biology Group, Department of Medical Protein Research, VIB, and Department of Biochemistry, Ghent University, Ghent, Belgium. Lennart Martens (lennart.martens@ebi.ac.uk), Proteomics Services Group, European Bioinformatics Institute, Hinxton, Cambridge, United Kingdom, www.ebi.ac.uk. Slide footer: Kenny Helsens, kenny.helsens@UGent.be, BITS MS Data Processing – Protein Inference, UGent, Gent, Belgium – 16 December 2011.
  3. Data processing and information ambiguity. Raw data, peaklists, peptide sequences, protein accession numbers (figure annotated with ambiguity and data size). See: Martens and Hermjakob, Molecular BioSystems, 2007
  4. PEPTIDE IDENTIFICATION VALIDATION
  5. Populations and individuals. 10,000 peptide-to-spectrum matches, 5% decoy hits.
  6. Eliminating false positives. Suspect peptide identifications happen. The problem is that finding them requires detailed analysis of a single spectrum and its identifications, amongst thousands of other spectra…
  7. Automated interpretation. The Netherlands??
  8. Manual interpretation. Tyrosine phosphorylation. See: Ghesquière and Helsens, Proteomics, 2010
  9. Peptizer expert system. Confident peptide identifications are inspected by agents a to e, which each cast a vote (votes shown: +1, +1, 0, -1, +1). Aggregation of the votes splits the identifications into a suspicious subset and a trusted subset. See: Helsens et al, Molecular and Cellular Proteomics, 2008
  10. Peptizer expert system. See: Helsens et al, Molecular and Cellular Proteomics, 2008
  11. Peptizer expert system. See: Helsens et al, Molecular and Cellular Proteomics, 2008
  12. PROTEIN INFERENCE
  13. Not all peptides are created equal. [Figure: a gene with exons 1a, 1b, 2, 3, 4, 5, 6a and 6b, its transcripts, their translations, and the resulting peptides. Peptide labels: matching all transcripts, redundant, matching a transcript subset, matching exactly 1 translation. Legend: intron, exon UTR, exon CDS, peptide.]
  14. Sample preparation consequences. See: Nesvizhskii AI et al, Molecular and Cellular Proteomics, 2005
  15. Sample preparation consequences. See: Nesvizhskii AI et al, Molecular and Cellular Proteomics, 2005
  16. Protein inference: a question of conviction. [Figure: three peptide-to-protein matrices for peptides a, b, c, d and proteins X, Y, Z. Panel 1: minimal set (Occam). Panel 2: maximal set (anti-Occam). Panel 3: minimal set with maximal annotation, with proteins marked prot X (-), prot Y (+), prot Z (0) (true Occam?).] See: Martens and Hermjakob, Molecular BioSystems, 2007
  17. ALGORITHMS FOR THE PROTEIN INFERENCE PROBLEM
  18. A few algorithms for protein inference. IDPicker: Zhang et al, Journal of Proteome Research, 2007. ProteinProphet: Nesvizhskii AI et al, Analytical Chemistry, 2003. DBToolkit: Martens et al, Bioinformatics, 2005, http://genesis.UGent.be/dbtoolkit
  19. IDPicker parsimonious protein assembly (I): Initialize. See: Zhang et al, Journal of Proteome Research, 2007
  20. IDPicker parsimonious protein assembly (II): Collapse. See: Zhang et al, Journal of Proteome Research, 2007
  21. IDPicker parsimonious protein assembly (III): Separate. See: Zhang et al, Journal of Proteome Research, 2007
  22. IDPicker parsimonious protein assembly (IV): Reduce. See: Zhang et al, Journal of Proteome Research, 2007
  23. ProteinProphet: the simplified view. [Formula with labelled terms: protein probability, peptide weight, peptide probability.] In iteration 1, all weights w start off as 1/n, with n the degeneracy count for the peptide. See: Nesvizhskii AI et al., Analytical Chemistry, 2003
  24. DBToolkit protein inference. [Figure: peptide-to-protein matrix for peptides a, b, c, d and proteins prot X (-), prot Y (+), prot Z (0): minimal set with maximal annotation.]
  25. Some indications from the HUPO BPP. [Figure: peptide-to-protein matrix for peptides a, b, c, d and proteins prot X (-), prot Y (+), prot Z (0).]
  26. PROTEIN INFERENCE AND QUANTIFICATION
  27. Some inference examples (i). http://genesis.ugent.be/rover/ Nice and easy, 1/1, only unique peptides (blue) and a narrow distribution. See: Colaert et al, Proteomics, 2010
  28. Some inference examples (ii). http://genesis.ugent.be/rover/ Nice and easy, down-regulated. See: Colaert et al, Proteomics, 2010
  29. Some inference examples (iii). http://genesis.ugent.be/rover/ A little less easy, up-regulated. See: Colaert et al, Proteomics, 2010
  30. Some inference examples (iv). http://genesis.ugent.be/rover/ A nice example of the mess of degenerate peptides. See: Colaert et al, Proteomics, 2010
  31. Some inference examples (v). http://genesis.ugent.be/rover/ A bit of chaos, but a defined core distribution. See: Colaert et al, Proteomics, 2010
  32. Thank you! Questions?
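
Slide 5 quotes 10,000 peptide-to-spectrum matches with 5% decoy hits. As a rough illustration of what such a number says about the population of matches, the sketch below turns it into a false discovery rate estimate. It assumes a concatenated target-decoy search and reads "5% decoy hits" as 5% of all matches being decoys; neither is stated on the slide, and the function name `estimate_fdr` is hypothetical.

```python
def estimate_fdr(n_psms: int, decoy_fraction: float) -> float:
    """Estimate the FDR among target matches as decoy hits / target hits.

    Assumption: targets and decoys were searched together, so every decoy
    hit stands in for roughly one undetected false target hit.
    """
    n_decoy = round(n_psms * decoy_fraction)
    n_target = n_psms - n_decoy
    if n_target <= 0:
        raise ValueError("no target matches left to estimate an FDR for")
    return n_decoy / n_target


# Numbers from slide 5: 10,000 matches, 5% of them decoys.
fdr = estimate_fdr(10_000, 0.05)
print(f"estimated FDR among target matches: {fdr:.2%}")
```

The estimate describes the whole set of matches. It does not say which individual identifications are the wrong ones, which is the gap that inspecting single spectra is meant to fill.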

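The "minimal set (Occam)" idea from slide 16 and the collapse and reduce steps named on slides 20 and 22 can be sketched with a greedy set cover. This is a common heuristic for parsimonious protein lists, shown here under assumptions: it is not claimed to reproduce IDPicker or DBToolkit exactly, the peptide-to-protein mapping is made up for illustration (the slide's matrix did not survive extraction), and `collapse` and `greedy_minimal_set` are hypothetical names.

```python
from collections import defaultdict


def collapse(protein_to_peptides):
    """Merge proteins with identical peptide sets into one group.

    Such proteins cannot be told apart by the observed peptides.
    """
    by_peptides = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        by_peptides[frozenset(peptides)].append(protein)
    return {tuple(sorted(names)): set(peps) for peps, names in by_peptides.items()}


def greedy_minimal_set(groups):
    """Pick protein groups until every peptide is explained (greedy set cover)."""
    uncovered = set().union(*groups.values())
    chosen = []
    while uncovered:
        # Take the group explaining the most still-unexplained peptides;
        # iterating over sorted keys makes ties deterministic.
        best = max(sorted(groups), key=lambda g: len(groups[g] & uncovered))
        chosen.append(best)
        uncovered -= groups[best]
    return chosen


# Hypothetical mapping, for illustration only.
example = {
    "prot X": {"a", "b"},
    "prot Y": {"c"},
    "prot Z": {"b", "c", "d"},
    "prot W": {"b", "c", "d"},
}
print(greedy_minimal_set(collapse(example)))
```

In this made-up example prot Z and prot W collapse into one group, and prot Y drops out because its only peptide is already explained by that group. A maximal (anti-Occam) list would instead keep every protein that matches at least one peptide.
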
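Slide 23 names three quantities (protein probability, peptide weight, peptide probability) and states that all weights start at 1/n in iteration 1. The slide's formula itself is missing from this transcript, so the sketch below uses the simplified model commonly described for ProteinProphet: a protein is present unless all of its weighted peptide identifications are wrong, and each shared peptide's weights are then redistributed in proportion to the current protein probabilities. The update rule, the fixed iteration count, and the name `protein_probabilities` are assumptions, not taken from the slide.

```python
def protein_probabilities(peptide_probs, peptide_to_proteins, n_iter=10):
    """Iteratively estimate protein probabilities from peptide probabilities.

    peptide_probs:        peptide -> probability the identification is correct
    peptide_to_proteins:  peptide -> list of proteins containing that peptide
    """
    proteins = sorted({p for prots in peptide_to_proteins.values() for p in prots})
    # Iteration 1: every weight is 1/n, n being the peptide's degeneracy count.
    weights = {
        (pep, prot): 1.0 / len(prots)
        for pep, prots in peptide_to_proteins.items()
        for prot in prots
    }
    probs = {}
    for _ in range(n_iter):
        # Protein probability: 1 - product over its peptides of (1 - w * p).
        for prot in proteins:
            all_wrong = 1.0
            for pep, prots in peptide_to_proteins.items():
                if prot in prots:
                    all_wrong *= 1.0 - weights[(pep, prot)] * peptide_probs[pep]
            probs[prot] = 1.0 - all_wrong
        # Assumed update: share each peptide among its proteins in proportion
        # to the current protein probabilities.
        for pep, prots in peptide_to_proteins.items():
            total = sum(probs[p] for p in prots)
            if total > 0:
                for prot in prots:
                    weights[(pep, prot)] = probs[prot] / total
    return probs


# Hypothetical data: peptide "a" is unique to prot X, peptide "b" is shared.
peptide_probs = {"a": 0.9, "b": 0.8}
peptide_to_proteins = {"a": ["prot X"], "b": ["prot X", "prot Y"]}
print(protein_probabilities(peptide_probs, peptide_to_proteins, n_iter=1))
print(protein_probabilities(peptide_probs, peptide_to_proteins, n_iter=10))
```

With these made-up inputs the shared peptide is gradually assigned to prot X, which also has unique evidence, so prot Y's probability falls as the iterations proceed.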