www.elixir-europe.org
ELIXIR All Hands 2017, 21-23 March, Rome, Italy
Enabling automated processing and
analysis of large-scale proteomics data
Juan Antonio Vizcaíno, EMBL-EBI
Hinxton, Cambridge
One-slide intro to MS-based proteomics
Hein et al., Handbook of Systems Biology, 2012
Kickoff meeting for Proteomics activities in ELIXIR
• It took place on 1-2 March 2017 in Tuebingen, organised by ELIXIR-Germany.
• ~25 attendees, representing 11 ELIXIR Nodes.
• Outcome: White paper outlining possible future proteomics activities in the
context of ELIXIR. This paper will be submitted to F1000Research.
• Expected to be available by the end of May or the beginning of June.
ELIXIR Implementation Project
• 1-year project just started. Led by EMBL-EBI (Vizcaíno) and ELIXIR-Germany
(Kohlbacher, Eisenacher).
• Aim: Development of reproducible data analysis pipelines for shotgun
proteomics approaches using the OpenMS framework.
• Deployment of the pipelines in the EMBL-EBI “Embassy cloud” as proof of
concept:
• Facilitate future deployment in other cloud environments.
• Direct connection with public datasets in the PRIDE database.
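One way to make a pipeline run reproducible and traceable is to record the input dataset and every step's parameters, then derive a deterministic fingerprint from that record. The following is a minimal sketch of that idea only, not the project's implementation: the `Step` and `PipelineRun` classes, the step labels, and the `PXD000000` accession are hypothetical placeholders, and no OpenMS or PRIDE API is called.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Step:
    """One pipeline step. The name is a placeholder label, not an OpenMS tool name."""
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class PipelineRun:
    """A dataset identifier plus the ordered steps applied to it."""
    dataset: str   # placeholder accession for a public dataset
    steps: tuple   # ordered Step objects

    def fingerprint(self) -> str:
        """Return a SHA-256 digest of the dataset id and all steps, in order."""
        payload = {
            "dataset": self.dataset,
            "steps": [asdict(step) for step in self.steps],
        }
        # Sorting keys makes the digest independent of parameter insertion order.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    run = PipelineRun(
        dataset="PXD000000",  # hypothetical accession
        steps=(
            Step("convert", {"format": "mzML"}),
            Step("identify", {"fdr": 0.01}),
        ),
    )
    print(run.fingerprint())
```

Two runs with the same dataset, steps, and parameters yield the same fingerprint wherever they are executed, while any parameter change yields a different one, so the value can be stored alongside the results to show how they were produced.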
PRIDE (PRoteomics IDEntifications) Archive
http://www.ebi.ac.uk/pride/archive
• PRIDE is the world-leading mass spectrometry (MS)-based proteomics data
repository.
• It stores:
• Peptide and protein expression data (identification
and quantification)
• Post-translational modifications
• Mass spectra (raw data and peak lists)
• Technical and biological metadata
• Any other related information
• Any data workflow is now supported.
• PRIDE leads the global ProteomeXchange Consortium.
Martens et al., Proteomics, 2005
Vizcaíno et al., NAR, 2016
Why is this project timely?
[Figure: PRIDE data downloads per year, 2013-2016, in TBs. Data download from PRIDE in 2016: 243 TB. Martens & Vizcaíno, Trends Bioch Sci, 2017]
• Open, reproducible, traceable and scalable analysis pipelines are
needed, as the size of proteomics datasets keeps growing.
• Reuse of public proteomics data is flourishing.
Acknowledgements: People
Mathias Walzer
Yasset Perez-Riverol
EMBL-EBI cloud team (led by Steven Newhouse)
Oliver Kohlbacher (Tuebingen University)
Martin Eisenacher (Bochum University)
Everyone who attended the workshop in Tuebingen (March 1-2)
Do you want to get involved?
www.hupo2017.ie
Abstract Deadline: 5th April
Early Registration Deadline: 14th June
Dublin, 17-21 September
