Deep proteome and transcriptome mapping of human cervical cancer cell line



Published in: Technology
  • To understand something, we sometimes need to look at the pieces in order to see the bigger picture. A system-wide understanding of a biological concept requires an inventory of the system’s building blocks.
  • To understand human genes, the human genome sequence was elucidated.
  • According to the Central Dogma of Molecular Biology, information flows from DNA, which is transcribed into RNA, which is then translated into functional proteins.
  • RNA-Seq, also called "Whole Transcriptome Shotgun Sequencing" [1] ("WTSS") and dubbed "a revolutionary tool for transcriptomics" [2], refers to the use of high-throughput sequencing technologies to sequence cDNA in order to get information about a sample's RNA content. Related topics: next-generation sequencing techniques, transcriptome alignment, direct RNA sequencing.
  • iBAQ (intensity-based absolute quantification): uses the total ion intensity of all peptides observed for a protein, normalized against the total theoretical number of observable peptides. FPKM (fragments per kilobase of exon per million fragments mapped): the expression unit used for RNA-Seq data.
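As a rough illustration of the two measures defined in these notes, the sketch below computes an iBAQ value and an FPKM value from toy numbers. The function names and all input values are hypothetical, and this is not the MaxQuant or RNA-Seq pipeline used in the paper; it only follows the definitions as worded above.

```python
def ibaq(peptide_intensities, n_theoretical_peptides):
    """Summed peptide intensity divided by the number of
    theoretically observable peptides of the protein."""
    if n_theoretical_peptides <= 0:
        raise ValueError("n_theoretical_peptides must be positive")
    return sum(peptide_intensities) / n_theoretical_peptides


def fpkm(fragments_mapped_to_gene, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments_mapped_to_gene / (kilobases * millions)


# Toy values, invented for illustration only.
example_ibaq = ibaq([2.0e6, 3.0e6, 1.0e6], n_theoretical_peptides=12)
example_fpkm = fpkm(500, exon_length_bp=2_000, total_mapped_fragments=20_000_000)
print(example_ibaq, example_fpkm)
```

Both measures divide a raw signal by a size term (observable peptides, or exon length and sequencing depth) so that long proteins or long genes do not look more abundant merely because they yield more peptides or fragments.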
  • Transcript

    • 1. Deep proteome and transcriptome mapping of a human cancer cell line. BARBA | BIADOMANG | CUA | DAYANAN | LOPEZ. AUTHORS: N. Nagaraj, J. Wisniewski, T. Geiger, J. Cox, M. Kircher, J. Kelso, S. Paabo, M. Mann. JOURNAL: Molecular Systems Biology. DATE PUBLISHED: October 29, 2011
    • 2. The human genome comprises a mere 20,000 protein-coding genes.
    • 3. RNA-Seq transcriptomics: detects transcripts of between 8000 and 16000 protein-coding genes
    • 4. High-res MS-based proteomics: essentially complete proteome of a model organism (yeast); limited to 4000-6000 protein groups in mammalian systems
    • 5. “…explore a human proteome in the depth achievable with current technology and to compare it with the corresponding transcriptome.”
    • 7. HeLa cell lysate preparation (flowchart labels): cell lysis; sonication; centrifugation; protein content determination; flash freezing; store at -80 °C
    • 8. Protein Fractionation by Gel Filtration: 0.1 mL cell lysate; load onto GL column; elution with buffer
    • 9. Protein Digestion and Peptide Fractionation: removal of detergent; protein digestion (trypsin, Lys-C, Glu-C)
    • 10. Mass Spectrometry: C18 purification; RP chromatography; MS
    • 11. RNA sequencing: RNA extraction; quantification; enrichment; library preparation; fragmentation; RNA fragments copied into DNA; blunt-end conversion; addition of deoxyadenosine; ligation of forked adaptors; amplification; sequencing
    • 12. Data analysis: gene and transcript quantification; data availability
    • 14. Sample: The HeLa cells • HeLa cells – Human cervical carcinoma cell line • “Immortal cells”: can grow indefinitely and be frozen for decades • Standardized the field of tissue culture – Named after Henrietta Lacks by a scientist at Johns Hopkins Hospital • A piece of her tumor was taken • Her cells never died – Prolific growth may be due to HPV18 • HPV18 viral proteins E6 and E7 suppress the p53 and pRb gene products, respectively.
    • 15. Proteome Coverage Study • Objective: to achieve maximum proteome coverage in a reasonable measurement time – Procedure: investigate the effects of protein fractionation, proteolytic digestion, peptide fractionation, and reverse-phase chromatography
    • 16. Proteome Coverage Study • Protein fractionation – Gel filtration: separation based on size and shape • Proteolytic digestion – Trypsin: cleaves at the C-terminal side of lysine and arginine – Glu-C: cleaves at the C-terminal side of glutamic acid residues – Lys-C: cleaves at the carboxyl side of lysine residues – Note: the choice of protein digestion strongly affects protein characterization and identification by mass spectrometry • Overlapping fragments = larger sequence coverage
    • 17. Proteome Coverage Study • Pipette-based prefractionation – Strong anion exchange resin • Independent of pH • Mostly used for deep coverage of the composition of the sample or when specific peptides should be enriched • Reverse-phase chromatography – Reduces the complexity of the peptide mixture by separating peptides for tandem mass spectrometry according to their polarity
    • 18. Proteome Coverage Study • LC-MS/MS analysis – Peptide MS spectra • Interpreted by comparison with lists generated from theoretical digestion of proteins – Fragment MS/MS spectra • Interpreted by comparison with theoretical fragmentation of peptides – Elution time of a peptide depends on its polarity – Repeated extensively to make the peptide mixture less complex, thereby increasing the number of peptides for which tandem mass spectra are acquired
    • 19. Proteome Coverage Study • The procedure is referred to as “shotgun” proteomics – Most successful strategy for achieving extensive proteome coverage – Summary: proteins are extracted from their biological source and subjected to enzymatic digestion, and the resulting peptide mixtures are analysed by LC-MS/MS • Additional fractionation steps for proteins/peptides can also be conducted
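The cleavage rules listed for trypsin, Lys-C and Glu-C can be sketched as a simple in-silico digestion. This is a minimal illustration under stated assumptions: the sequence is made up, the `digest` function and `CLEAVAGE_RESIDUES` table are hypothetical names, and the code ignores missed cleavages and the proline exception that real search engines account for.

```python
# Residues after which each protease cleaves (C-terminal side), as on the slide.
# Assumption: no proline rule and no missed cleavages.
CLEAVAGE_RESIDUES = {
    "trypsin": "KR",
    "lys-c": "K",
    "glu-c": "E",
}


def digest(sequence, enzyme):
    """Split a protein sequence after every cleavage residue of the enzyme."""
    residues = CLEAVAGE_RESIDUES[enzyme]
    peptides, current = [], ""
    for amino_acid in sequence:
        current += amino_acid
        if amino_acid in residues:
            peptides.append(current)
            current = ""
    if current:  # trailing peptide without a cleavage residue
        peptides.append(current)
    return peptides


toy_sequence = "MKTAYEAKRQISFVK"  # invented sequence, for illustration only
for enzyme in CLEAVAGE_RESIDUES:
    print(enzyme, digest(toy_sequence, enzyme))
```

Because each enzyme cuts at different positions, the three peptide sets overlap along the sequence, which is why combining digests can raise sequence coverage.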
    • 20. MaxQuant Computational Proteomics Environment • MaxQuant – Quantitative proteomics software package designed for analyzing large mass spectrometric data sets – Has an integrated search engine, Andromeda – Supported instrument: LTQ-Orbitrap • Orbitrap: ions circulate around a central, spindle-shaped electrode • Highly accurate: the frequency of axial oscillation, determined with high precision, is inversely proportional to the square root of m/z.
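For reference, the standard relation between the axial oscillation frequency and m/z in an Orbitrap analyzer can be written as follows. The symbol k stands for an instrument constant (the field curvature); it is not given in the slides and is introduced here only for illustration.

```latex
\omega = \sqrt{\frac{k}{m/z}}
\quad\Longrightarrow\quad
\frac{m}{z} = \frac{k}{\omega^{2}}
```

Since frequencies can be measured very precisely, m/z values derived this way inherit that precision.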
    • 21. MaxQuant Computational Proteomics Environment • The runs yielded 2 337 336 high-resolution fragmentation spectra with high-accuracy precursor masses • Search engine: Andromeda – Algorithm: uses a probability-based approach to match tandem mass (MS/MS) spectra to peptide sequences in databases – Median peptide score: 121; 6% of peptides scored below 60 • Each score corresponds to the sum of the highest ion scores for each distinct sequence
    • 22. MaxQuant Computational Proteomics Environment • Average identification rate of fragmentation spectra: 43% • Average absolute mass deviation: 1.2 p.p.m. for the precursors and 4.8 p.p.m. for the matched fragment masses
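A mass deviation in p.p.m. is the difference between measured and theoretical m/z, relative to the theoretical value and scaled by one million. The sketch below shows how an average absolute deviation could be computed; the function names and the m/z pairs are hypothetical and are not taken from the paper.

```python
def mass_deviation_ppm(measured_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mean_absolute_ppm(pairs):
    """Average absolute deviation over (measured, theoretical) m/z pairs."""
    return sum(abs(mass_deviation_ppm(m, t)) for m, t in pairs) / len(pairs)


# Invented (measured, theoretical) m/z pairs, for illustration only.
toy_pairs = [(1000.001, 1000.0), (500.001, 500.0), (799.9992, 800.0)]
print(round(mean_absolute_ppm(toy_pairs), 2))
```

At m/z 1000, a deviation of 1 p.p.m. corresponds to an absolute error of only 0.001.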
    • 23. MaxQuant Computational Proteomics Environment • Result of analysis – Number of peptides identified and quantified • 163 784 peptides – FDR (false discovery rate): 1% – Of the 163 784 peptides • 84 051 from tryptic digestion • 52 108 from Lys-C digestion • 44 704 from Glu-C digestion • The per-enzyme counts sum to more than the total because some peptides were found in more than one digest
    • 24. MaxQuant Computational Proteomics Environment • Result of analysis – From the data obtained, MaxQuant identified 10 255 proteins with 99% confidence • Lower bound on the number of proteins expressed in HeLa cells – Overlap between the enzymatic digests was observed • Tryptic digestion: yielded the highest number of identifications • Lys-C digestion: 85% overlapped with trypsin • Glu-C digestion: 5.2% novel identifications • <5% of all proteins were identified by only one peptide • Taken together, >25% median sequence coverage
    • 25. ENSEMBL database and GENSCAN predictions • ENSEMBL-annotated human protein-coding genes – MS/MS spectra: searched against the ENSEMBL database together with GENSCAN predictions – The 10 255 proteins were mapped to 9207 human protein-coding genes • Largest number of identified genes on chromosome 1 • Smallest number of identified genes on chromosome 21 – GENSCAN predictions: >1900 peptides not assigned to known ENSEMBL genes
    • 26. ZAB
    • 27. Completeness of Detected Proteome • Inspecting the macrocomplexes that are functionally necessary • Proteasome, spliceosome, histone-modifying complexes and respiratory chain complexes • CORUM protein complex database
    • 28. CORUM protein complex database • Collection of experimentally verified mammalian protein complexes – Protein complex function – Localization – Subunit composition – Literature references
    • 29. • Mean proteome coverage of all CORUM protein complexes was >95% • Transcriptome coverage: 96.5% • Complexes with lower coverage, which is due to cell type specificity, are listed on the next slide
    • 30. Complex | Normally expressed in | % coverage
         Sarcoglycan-sarcospan | Muscle | 20
         SNARE (Soluble N-ethylmaleimide-sensitive factor Attachment protein Receptor) | Neuronal tissue | 40
         ITGA2b-ITGB3 | Platelets | 50
      • Sarcoglycan-sarcospan provides structural integrity in muscle tissues • SNARE mediates neurotransmitter release at synapses • ITGA2b-ITGB3 is a fibronectin receptor that plays a crucial role in coagulation
    • 31. • 5% of the HeLa cell population was in mitosis – 61 of 63 proteins in a reference set of cell cycle-specific proteins were detected – High coverage of most metabolic pathways pertaining to basic cellular function • Comprehensiveness of the proteome is hard to determine by comparison with pathway databases because they contain cell type-specific proteins
    • 32. Quantitative Analysis • Deep-sequencing transcriptomics – Proteomics data >90% complete • Transcriptome + proteome data – 10,000-12,000 genes expressed in HeLa cells • iBAQ (intensity-based absolute quantification) – Sums the individual peptide signals in MS and normalizes by the number of observable peptides of the protein – Estimates the absolute amount of each protein
    • 33. Quantitative Analysis • The 40 most abundant proteins comprised 25% of the proteome – Filamin A, pyruvate kinase, enolase, vimentin, Hsp60 • 600 proteins account for 75% of the HeLa cell proteome mass
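Statements such as "the 40 most abundant proteins make up 25% of the proteome" come from ranking proteins by abundance and accumulating their share of the total. A minimal sketch of that bookkeeping follows; the function name and the abundance list are invented for illustration and do not reproduce the paper's data.

```python
def proteins_needed_for_fraction(abundances, fraction):
    """Number of top-ranked proteins whose summed abundance
    reaches the given fraction of the total."""
    ranked = sorted(abundances, reverse=True)
    target = fraction * sum(ranked)
    running = 0.0
    for count, value in enumerate(ranked, start=1):
        running += value
        if running >= target:
            return count
    return len(ranked)


# Invented abundance values (arbitrary units), for illustration only.
toy_abundances = [50, 20, 10, 10, 5, 5]
print(proteins_needed_for_fraction(toy_abundances, 0.25))
print(proteins_needed_for_fraction(toy_abundances, 0.75))
```

A strongly skewed distribution shows up as a very small count for the first quarter of the mass compared with the total number of proteins.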
    • 34. Quantitative Analysis • The contribution of each protein to the total mass, combined with knowledge of the number of cells in the initial sample – Gives a rough estimate of the absolute copy number of each protein in HeLa cells
    • 35. Quantitative Analysis • Ranked distribution of proteins – 90% of proteins are within a factor of 60 above or below the median protein copy number of 18,000 molecules per cell – The lower half accounts for <2% of the total mass
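The copy-number estimate described on these two slides can be sketched as a unit conversion: a protein's share of the total protein mass, divided by the number of cells and by its molecular weight, times Avogadro's number. The function names and the sample values (150 µg of protein from one million cells, a 50 kDa protein at 0.1% of the mass) are assumptions for illustration, not figures from the paper. Only the median of 18,000 copies and the factor of 60 come from the slide.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(mass_fraction, total_protein_g, n_cells, molecular_weight_da):
    """Rough absolute copy number of one protein per cell."""
    protein_mass_per_cell_g = mass_fraction * total_protein_g / n_cells
    moles_per_cell = protein_mass_per_cell_g / molecular_weight_da  # Da equals g/mol
    return moles_per_cell * AVOGADRO


def copy_number_range(median_copies, factor):
    """Bounds that lie a given factor below and above the median."""
    return median_copies / factor, median_copies * factor


# Hypothetical sample: 150 micrograms of protein from one million cells.
toy_copies = copies_per_cell(0.001, 150e-6, 1_000_000, 50_000)
# Median and factor as stated on the slide.
low, high = copy_number_range(18_000, 60)
print(f"{toy_copies:.3g} copies per cell; range {low:.0f} to {high:.0f}")
```

With the slide's numbers, the stated range runs from 300 to 1,080,000 molecules per cell, a span of 3,600-fold.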
    • 36. Quantitative Analysis • Protein abundance values – Used to estimate the proportional contribution of any individual protein, protein complex or protein class to the total proteome
    • 37. Quantitative Analysis • Ribosomes (encoded by only 1% of human genes) – 195 proteins contributed 6% to total protein mass • The actin cytoskeleton contributes four-fold more to the proteome mass than expected from the number of genes and proteins
    • 38. Quantitative Analysis • Integral membrane proteins – 25% of the genome – 7.6% of protein mass
    • 39. Quantitative Analysis • Protein folding – 2% of the identified proteome by number – 8% of proteome mass
    • 40. Quantitative Analysis: percentage of the total
         Category | Integral membrane proteins | “Protein folding” proteins
         Human genome | 25 | 2
         Protein mass | 7.6 | 8
      • Differences are due to cell-type specific functions of these proteins
    • 41. Structural proteins and proteins in basic machineries > regulatory proteins
    • 42. Ribosomal proteins form a tight cluster at the top end
    • 43. The proteasome is also abundant, but not its regulatory subunits (a factor of 100 less)
    • 44. Cytoskeletal and metabolic proteins extend over a broad range
    • 45. Enolase – highest expression value. Glycogen phosphorylase – 100,000-fold less at the protein level and 10,000-fold less at the transcript level
    • 46. Regulatory proteins such as protein kinases and transcription factors have, on average, lower expression than the structural proteins. Each category spans a large expression range. Expression levels can provide starting points for systems biological modeling of the cell
    • 47. TRANSCRIPTOME: RNA-Seq. PROTEOME: High-res MS. “Given the rapid technological progress in both fields, we predict that the required depth of 10,000–12,000 genes will be routinely reachable soon.”
    • 48. THANK YOU