Deep proteome and transcriptome mapping of a human cervical cancer cell line
Published in: Technology
  • To understand something, we sometimes need to look at the pieces in order to see the bigger picture. A system-wide understanding of a biological concept requires an inventory of the system’s building blocks.
  • To understand human genes, the human genome sequence was elucidated.
  • According to the Central Dogma of Molecular Biology, information flows from DNA, which is transcribed into RNA, which is in turn translated into functional proteins.
  • RNA-Seq, also called "Whole Transcriptome Shotgun Sequencing" [1] ("WTSS") and dubbed "a revolutionary tool for transcriptomics",[2] refers to the use of high-throughput sequencing technologies to sequence cDNA in order to get information about a sample's RNA content. Next-generation sequencing techniques; transcriptome alignment; direct RNA sequencing.
  • iBAQ (intensity-based absolute quantification): uses the total ion intensity of all observed peptides of a protein, normalized against the total theoretical number of observable peptides. FPKM (fragments per kilobase of exon per million fragments mapped).
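The FPKM definition in the note above can be written out as a short calculation. This is a minimal sketch, not the pipeline used in the paper; the function name and the example counts are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    kilobases = exon_length_bp / 1_000                 # exon length in kb
    millions = total_mapped_fragments / 1_000_000      # library size in millions
    return fragments / (kilobases * millions)


# Hypothetical gene: 500 fragments on 2,000 bp of exon,
# in a library of 20 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 20_000_000)
print(example_fpkm)
```

Normalizing by both exon length and library size is what makes values comparable across genes and across sequencing runs.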
  • Transcript

    • 1. Deep proteome and transcriptome mapping of a human cancer cell line. BARBA | BIADOMANG | CUA | DAYANAN | LOPEZ. AUTHORS: N. Nagaraj, J. Wisniewski, T. Geiger, J. Cox, M. Kircher, J. Kelso, S. Paabo, M. Mann. JOURNAL: Molecular Systems Biology. DATE PUBLISHED: October 29, 2011
    • 2. The human genome comprises a mere 20,000 protein-coding genes.
    • 3. RNA-Seq transcriptomics: detects transcripts of between 8,000 and 16,000 protein-coding genes
    • 4. High-res MS-based proteomics: essentially complete proteome of a model organism (yeast); limited to 4,000–6,000 protein groups in mammalian systems
    • 5. “…explore a human proteome in the depth achievable with current technology and to compare it with the corresponding transcriptome.”
    • 6. METHODOLOGY
    • 7. HeLa cells: cell lysis; sonication; centrifugation; protein content determination; flash freezing; store lysate at -80 °C
    • 8. Protein Fractionation by Gel Filtration: 0.1 mL cell lysate; load onto GL column; elution with buffer
    • 9. Protein Digestion and Peptide Fractionation: removal of detergent; protein digestion (trypsin, Lys-C, Glu-C)
    • 10. Mass Spectrometry: C18 purification; RP chromatography; MS
    • 11. RNA sequencing: RNA extraction; quantification; enrichment; library preparation; fragmentation; RNA fragments copied into DNA; blunt-end conversion; addition of deoxyadenosine; ligation of forked adaptors; amplification; sequencing
    • 12. Data analysis; gene and transcript quantification; data availability
    • 13. RESULTS AND DISCUSSION
    • 14. Sample: The HeLa cells • HeLa cells – Human cervical carcinoma cell line • “Immortal cells”: can grow indefinitely and be frozen for decades • Standardized the field of tissue culture – Named after Henrietta Lacks by a scientist at Johns Hopkins Hospital • A piece of her tumor was taken • Her cells never died – Prolific growth may be due to HPV18 • The HPV18 viral proteins E6 and E7 inactivate the p53 and pRb gene products, respectively.
    • 15. Proteome Coverage Study • Objective: to achieve maximum proteome coverage in a reasonable measurement time – Procedure: investigate the effects of protein fractionation, proteolytic digestion, peptide fractionation, and reverse-phase chromatography
    • 16. Proteome Coverage Study • Protein fractionation – Gel filtration: separation based on size and shape • Proteolytic digestion – Trypsin: cleaves on the C-terminal side of lysine and arginine – Glu-C: cleaves on the C-terminal side of glutamic acid residues – Lys-C: cleaves on the carboxyl side of lysine residues – Note: protein digestion heavily affects protein characterization and identification by mass spectrometry • Overlapping fragments = larger sequence coverage
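The cleavage rules on the slide above can be illustrated with a small in-silico digestion. This is a minimal sketch under simplified assumptions: each protease cuts after every listed residue, exceptions such as missed cleavages or the proline rule for trypsin are ignored, and the protein sequence is made up for illustration.

```python
# Simplified cleavage rules (assumption): cut on the C-terminal side
# of every listed residue, with no exceptions.
CLEAVAGE_RESIDUES = {"trypsin": "KR", "lys-c": "K", "glu-c": "E"}


def digest(sequence, protease):
    """Return the peptides produced by cutting after each cleavage residue."""
    residues = CLEAVAGE_RESIDUES[protease]
    peptides, start = [], 0
    for i, amino_acid in enumerate(sequence):
        if amino_acid in residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):          # trailing peptide without a cleavage site
        peptides.append(sequence[start:])
    return peptides


protein = "MAEKLVRDEGK"  # hypothetical sequence
for protease in CLEAVAGE_RESIDUES:
    print(protease, digest(protein, protease))
```

Because the three proteases cut at different positions, their peptides overlap, which is why combining digests raises sequence coverage.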
    • 17. Proteome Coverage Study • Pipette-based prefractionation – Strong anion exchange resin • Independent of pH • Mostly used for deep coverage of the composition of the sample or when specific peptides should be enriched • Reverse-phase chromatography – Reduces the complexity of the peptide mixture by selecting peptides for tandem mass spectrometry according to their polarity
    • 18. Proteome Coverage Study • LC-MS/MS analysis – Peptide MS spectra • Interpreted by comparison with lists generated from the theoretical digestion of proteins – Fragment MS/MS spectra • Interpreted by comparison with the theoretical fragmentation of peptides – The elution time of a peptide depends on its polarity – Fractionation is repeated extensively to make the peptide mixture less complex, thereby increasing the number of peptides for which tandem mass spectra are acquired
    • 19. Proteome Coverage Study • The procedure is referred to as “shotgun” proteomics – Most successful strategy for achieving extensive proteome coverage – Summary: proteins are extracted from their biological source and subjected to enzymatic digestion, and the resulting peptide mixtures are analysed by LC-MS/MS • Additional fractionation steps for proteins/peptides can also be added
    • 20. MaxQuant Computational Proteomics Environment • MaxQuant – Quantitative proteomics software package designed for analyzing large mass spectrometric data sets – Has an integrated search engine, Andromeda – Supported instrument: LTQ-Orbitrap • Orbitrap: ions circulate around a central, spindle-shaped electrode • Highly accurate: the frequency of axial oscillation, determined with high precision, is inversely proportional to the square root of m/z.
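The Orbitrap relation can be written out explicitly. In the standard textbook form, the axial oscillation frequency falls as the inverse square root of m/z; the symbol k below is an assumption introduced here to lump together the instrument's field constants.

```latex
\omega = \sqrt{\frac{k}{m/z}}
\quad\Longrightarrow\quad
\frac{m}{z} = \frac{k}{\omega^{2}}
```

Since m/z follows directly from a frequency, and frequencies can be measured very precisely, the mass measurement is correspondingly accurate.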
    • 21. MaxQuant Computational Proteomics Environment • Data acquired: 2 337 336 high-resolution fragmentation spectra with high-accuracy precursor masses • Search engine: Andromeda – Algorithm: uses a probability-based approach to match tandem mass (MS/MS) spectra to peptide sequences in databases – Median peptide score: 121; 6% of peptides score below 60 • Each score corresponds to the sum of the highest ion scores for each distinct sequence
    • 22. MaxQuant Computational Proteomics Environment • Average identification rate of fragmentation spectra: 43% • Average absolute mass deviation of the precursors and of the matched fragment masses: 1.2 and 4.8 p.p.m., respectively
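A mass deviation in p.p.m. is the measurement error relative to the theoretical mass, scaled by one million. The sketch below shows the arithmetic; the function name and the m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical precursor: theoretical m/z 1000.0000, observed m/z 1000.0012.
deviation = ppm_error(1000.0012, 1000.0000)
print(round(deviation, 2))
```

At m/z 1000, a deviation of 1.2 p.p.m. corresponds to an error of only about 0.0012 in m/z.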
    • 23. MaxQuant Computational Proteomics Environment • Result of analysis – Number of peptides identified and quantified: 163 784 – FDR (false discovery rate): 1% – Of the 163 784 peptides: • 84 051 from tryptic digestion • 52 108 from Lys-C digestion • 44 704 from Glu-C digestion
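One common way to apply a 1% FDR is the target-decoy approach: spectra are searched against real (target) and reversed or shuffled (decoy) sequences, and the score cutoff is lowered as long as the ratio of decoy to target hits stays within the limit. The sketch below illustrates that idea only; it is not MaxQuant's implementation, and the function name and toy scores are hypothetical.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """psms: iterable of (score, is_decoy) pairs.

    Walks down the score-sorted list and returns the lowest score at which
    the estimated FDR (decoy hits / target hits so far) is still <= max_fdr.
    Returns None if no cutoff satisfies the limit.
    """
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy data: 200 target hits with scores 101..300 and three decoy hits.
toy_psms = [(float(s), False) for s in range(101, 301)]
toy_psms += [(150.5, True), (110.5, True), (105.5, True)]
cutoff = score_cutoff_at_fdr(toy_psms)
print(cutoff)
```

In the toy data the second decoy pushes the estimate above 1%, so the cutoff stays at the last target score accepted before it.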
    • 24. MaxQuant Computational Proteomics Environment • Result of analysis – From the data obtained, MaxQuant identified 10 255 proteins with 99% confidence • Lower bound of the number of proteins expressed in HeLa cells – Overlapping fragments from the different enzymatic cleavages were observed • Tryptic digestion: yielded the highest number of identifications • Lys-C digestion: 85% overlapped with trypsin • Glu-C digestion: 5.2% novel identifications • <5% of all proteins were identified by only one peptide • Taken together: >25% median sequence coverage
    • 25. ENSEMBL database and GENSCAN predictions • ENSEMBL-annotated human protein-coding genes – MS/MS spectra: searched against the ENSEMBL database with GENSCAN predictions – 10 255 proteins were mapped to 9207 human protein-coding genes • Largest number of identified genes on chromosome 1 • Smallest number of identified genes on chromosome 21 – GENSCAN predictions: >1900 peptides not matching known ENSEMBL genes
    • 26. ZAB
    • 27. Completeness of Detected Proteome • Inspecting the macrocomplexes that are functionally necessary • Proteasome, spliceosome, histone-modifying complexes and respiratory chain complexes • CORUM protein complex database
    • 28. CORUM protein complex database • Collection of experimentally verified mammalian protein complexes – Protein complex function – Localization – Subunit composition – Literature references
    • 29. • Mean proteome coverage of all CORUM protein complexes was >95% • Transcriptome coverage: 96.5% • Complexes with lower coverage, owing to cell-type specificity, are listed on the next slide
    • 30. Complex | Normally expressed in | % coverage
          Sarcoglycan-sarcospan | Muscle | 20
          SNARE (Soluble N-ethylmaleimide sensitive factor Attachment protein REceptor) | Neuronal tissue | 40
          ITGA2b-ITGB3 | Platelets | 50
      • Sarcoglycan-sarcospan provides structural integrity in muscle tissues
      • SNARE mediates neurotransmitter release in synapses
      • ITGA2b-ITGB3 is a fibronectin receptor that plays a crucial role in coagulation
    • 31. • 5% of the HeLa cell population was in mitosis – 61/63 proteins in a reference set of cell cycle-specific proteins were detected – High coverage of most metabolic pathways pertaining to basic cellular function • Comprehensiveness of the proteome is hard to determine by comparison with pathway databases because they contain cell type-specific proteins
    • 32. Quantitative Analysis • Deep-sequencing transcriptomics – Proteomics data: >90% complete • Transcriptome + proteome data – 10,000–12,000 genes expressed in HeLa cells • iBAQ (intensity-based absolute quantification) – Sums the individual peptide signals in MS and normalizes by the number of observable peptides of the protein – Estimates the absolute amount of each protein
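The iBAQ definition on the slide above reduces to one division per protein. This is a minimal sketch of that arithmetic, not MaxQuant's implementation; the function name and the intensities are hypothetical.

```python
def ibaq(peptide_intensities, n_theoretical_peptides):
    """Summed peptide intensity divided by the number of theoretically
    observable peptides of the protein."""
    return sum(peptide_intensities) / n_theoretical_peptides


# Hypothetical protein: three observed peptides, 16 theoretically observable ones.
ibaq_value = ibaq([4.0e6, 2.5e6, 1.5e6], 16)
print(ibaq_value)
```

Dividing by the number of observable peptides corrects for the fact that larger proteins yield more peptides, and therefore more total signal, per molecule.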
    • 33. Quantitative Analysis • The 40 most abundant proteins comprise 25% of the proteome – Filamin A, pyruvate kinase, enolase, vimentin, Hsp60 • 600 proteins account for 75% of the HeLa cell proteome mass
    • 34. Quantitative Analysis • The contribution of each protein to the total mass, combined with the known number of cells in the initial sample – Gives a rough estimate of the absolute copy number of each protein in HeLa cells
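The copy-number estimate described on the slide above can be sketched as a unit conversion: a protein's share of the total protein mass, divided across the cells in the sample, is converted to molecules through its molecular weight and Avogadro's number. All input values below are hypothetical and chosen only to show the arithmetic.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(mass_fraction, sample_protein_g, n_cells, mol_weight_g_per_mol):
    """Rough absolute copy number of one protein per cell."""
    protein_mass_per_cell = sample_protein_g / n_cells      # g of total protein per cell
    this_protein_mass = mass_fraction * protein_mass_per_cell
    moles = this_protein_mass / mol_weight_g_per_mol
    return moles * AVOGADRO


# Hypothetical inputs: a 50 kDa protein making up 0.1% of the proteome mass,
# in a sample of 200 micrograms of protein from one million cells.
copies = copies_per_cell(0.001, 200e-6, 1_000_000, 50_000)
print(f"{copies:.3g}")
```

The estimate is only as good as the protein mass and cell count going in, which is why the slide calls it rough.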
    • 35. Quantitative Analysis • Ranked distribution of proteins – 90% of proteins lie within a factor of 60 above or below the median protein copy number of 18,000 molecules per cell – The lower-abundance half of the proteins accounts for <2% of the total protein mass
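The range quoted on the slide above follows directly from the median and the factor of 60. A short worked calculation using only the slide's two numbers:

```python
median_copies = 18_000   # median protein copy number per cell (from the slide)
factor = 60              # 90% of proteins lie within this factor of the median

lower = median_copies / factor    # bottom of the 90% band
upper = median_copies * factor    # top of the 90% band
fold_range = upper / lower        # equals factor squared

print(lower, upper, fold_range)
```

So 90% of proteins fall between roughly 300 and about 1.1 million copies per cell, a 3,600-fold span.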
    • 36. Quantitative Analysis • Protein abundance values – Used to estimate the proportional contribution of any: • individual protein, • protein complex and • protein class to the total proteome
    • 37. Quantitative Analysis • Ribosomes (encoded by only 1% of human genes) – 195 proteins contributed 6% of total protein mass • The actin cytoskeleton contributes four-fold more to the proteome mass than expected from the number of genes and proteins
    • 38. Quantitative Analysis • Integral membrane proteins – 25% of the genome – 7.6% of protein mass
    • 39. Quantitative Analysis • Protein folding – 2% of the identified proteome by number – 8% of proteome mass
    • 40. Quantitative Analysis: Percentage of the Total
          Category | Human genome (%) | Protein mass (%)
          Integral membrane proteins | 25 | 7.6
          “Protein folding” proteins | 2 | 8
      • Differences are due to cell-type specific functions of these proteins
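Dividing each class's share of protein mass by its share of the genome shows how over- or under-represented it is by mass. The sketch below uses only the percentages from the slide above; the variable names are illustrative.

```python
# (share of human genome in %, share of protein mass in %), from the slide
shares = {
    "integral membrane proteins": (25.0, 7.6),
    "protein folding proteins": (2.0, 8.0),
}

# Ratio > 1: the class contributes more mass than its gene count suggests.
mass_to_genome_ratio = {
    name: mass / genome for name, (genome, mass) in shares.items()
}

for name, ratio in mass_to_genome_ratio.items():
    print(f"{name}: {ratio:.2f}x")
```

By this measure protein-folding proteins are about four-fold over-represented by mass, while integral membrane proteins contribute under a third of what their gene count would suggest.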
    • 41. Structural proteins and proteins in basic machineries > Regulatory proteins
    • 42. Ribosome proteins form a tight cluster at the top end
    • 43. Proteasome also abundant, but not its regulatory subunits (factor of 100 less)
    • 44. Cytoskeletal and metabolic proteins extend over a broad range
    • 45. Enolase – highest expression value; Glycogen phosphorylase – 100,000-fold less at protein level and 10,000-fold less at transcript level
    • 46. Regulatory proteins such as protein kinases and transcription factors have, on average, lower expression than the structural proteins. Each category spans a large expression range. Expression levels can provide starting points for systems biological modeling of the cell.
    • 47. TRANSCRIPTOME: RNA-Seq. PROTEOME: High-res MS. “Given the rapid technological progress in both fields, we predict that the required depth of 10,000–12,000 genes will be routinely reachable soon.”
    • 48. THANK YOU
