RNA-seq for DE analysis: the biology behind observed changes - part 6
Part 6 of the training session 'RNA-seq for differential expression analysis' considers gene set analysis for inferring biology from RNA-seq data. See http://www.bits.vib.be

Published in: Education, Technology
Transcript

  • 1. The biology behind expression differences RNA-seq for DE analysis training Joachim Jacob 20 and 27 January 2014 This presentation is available under the Creative Commons Attribution-ShareAlike 3.0 Unported License. Please refer to http://www.bits.vib.be/ if you use this presentation or parts hereof.
  • 2. Overview http://www.nature.com/nprot/journal/v8/n9/full/nprot.2013.099.html
  • 3. Analyzing the DE analysis results The 'detect differential expression' tool gives you four results. The first is the report, including graphs. The other three are: only the genes with a p-value lower than the cut-off, with independent filtering applied; all genes, with independent filtering applied; and the complete DESeq results, without independent filtering applied.
  • 4. Analyzing the DE analysis results The three remaining results: only the genes with a p-value lower than the cut-off, with independent filtering applied; all genes, with independent filtering applied; and the complete DESeq results, without independent filtering applied.
  • 5. Setting a cut-off You choose a cut-off! You can go over the genes one by one, look for 'interesting' genes, and try to link them to the experimental conditions. Alternative: we can take all genes, ranked by their p-value (which stands for a 'level of surprise'). Pro: we don't need an arbitrary cut-off.
  • 6. Analysis of the list of DE genes All genes (6666 yeast genes). Genes sensible to test, after filtering out the 10% lowest expressed genes (5830 yeast genes). DE genes with a p-value cut-off of 0.01 (637 genes).
  • 7. Gene set enrichment ● We use the knowledge already available on biology. We construct lists of genes for: ● Pathways ● Biological processes ● Cellular components ● Molecular functions ● Transcription factor binding sites ● ... http://wiki.bits.vib.be/index.php/Gene_set_enrichment_analysis
  • 8. Getting lists of genes ● Gene Ontology consortium ● Reactome
  • 9. A many-to-many relation Linking gene IDs to molecular function. … to binding partners ... to transcription factor binding sites.
  • 10. Biomart can help you fetch sets
  • 11. Biomart can help you
  • 12. Contingency approach DE results: 637/5830. Gene set 1: 15/56. Equal? (hypergeometric test)
  • 13. Contingency approach DE results: 637/5830. Gene set 2: 5/30.
  • 14. Contingency approach DE results: 637/5830. Gene set 3: 34/78. Gene set enriched!
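The comparison on slides 12 to 14 can be sketched as a one-sided hypergeometric test. This is a minimal illustration, not the tool used in the training: the function name is made up for this example, and it assumes every gene in a set is among the 5830 tested genes.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n genes without replacement from N genes,
    of which K are differentially expressed (DE)."""
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

N_TESTED = 5830  # genes that passed filtering (slide 6)
N_DE = 637       # DE genes at the chosen cut-off (slide 6)

# (DE genes in the set, set size) as shown on slides 12 to 14.
# Assumption: all genes of each set are part of the 5830 tested genes.
gene_sets = {"Gene set 1": (15, 56), "Gene set 2": (5, 30), "Gene set 3": (34, 78)}

for name, (k, n) in gene_sets.items():
    expected = n * N_DE / N_TESTED  # DE genes expected in the set by chance
    p = hypergeom_upper_tail(k, N_DE, n, N_TESTED)
    print(f"{name}: observed {k}/{n}, expected {expected:.1f}, p = {p:.3g}")
```

A small p-value means the set contains more DE genes than the background proportion 637/5830 would predict. When many gene sets are tested, the p-values would still need a multiple-testing correction.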
  • 15. Artificial? But the cut-off remains artificial, arbitrarily chosen. Rerun with a different cut-off: you will detect other significant sets! The background needs to be carefully chosen. This approach favors gene sets with genes whose expression differs a lot ('high level of surprise', p-value).
  • 16. Contingency table approach tools http://wiki.bits.vib.be/index.php/Gene_set_enrichment_analysis
  • 17. Cut-off free approach No cut-off needs to be chosen when using GSEA and derived methods! We take into account all genes for which we get a reliable p-value (see the p-value histogram chart). The genes are sorted/ranked according to 'level of surprise', i.e. by their p-value. Other options for ranking are test statistics (T, ...).
  • 18. Intuition of GSEA Gene set 1. Running sum: every occurrence increases the sum, every absence decreases the sum. The maximum is the MES (maximum enrichment score), the final score. (Figure axis: p-value, from 0 to 1.) Mootha et al. http://www.nature.com/ng/journal/v34/n3/full/ng1180.html
  • 19. Intuition of GSEA Gene set 2: higher running sum MES. Gene set 3: median running sum MES. Gene set 4: low running sum MES. The scores are compared to those of permuted/shuffled gene sets (sample label versus gene label permutation). (Figure axis: p-value, from 0 to 1.)
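The running sum of slides 18 and 19 can be sketched as follows. The step sizes follow the scheme described by Mootha et al., chosen so that the sum returns to zero at the end of the list. Everything else is an assumption made for illustration: the gene names and the two gene sets are hypothetical, and significance is estimated with gene label permutation (random gene sets of the same size), the simpler of the two permutation options named on slide 19.

```python
import math
import random

def max_enrichment_score(ranked_genes, gene_set):
    """Maximum of the running sum over a ranked gene list (most significant first)."""
    n_total = len(ranked_genes)
    n_hits = sum(1 for g in ranked_genes if g in gene_set)
    if n_hits == 0 or n_hits == n_total:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    step_up = math.sqrt((n_total - n_hits) / n_hits)    # gene is in the set
    step_down = math.sqrt(n_hits / (n_total - n_hits))  # gene is not in the set
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += step_up if gene in gene_set else -step_down
        best = max(best, running)
    return best

def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Share of random gene sets of equal size that score at least as high."""
    rng = random.Random(seed)
    set_size = sum(1 for g in ranked_genes if g in gene_set)
    observed = max_enrichment_score(ranked_genes, gene_set)
    at_least_as_high = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, set_size))
        if max_enrichment_score(ranked_genes, random_set) >= observed:
            at_least_as_high += 1
    return (at_least_as_high + 1) / (n_perm + 1)

# Hypothetical ranked list: gene000 has the smallest p-value, gene199 the largest.
ranked = [f"gene{i:03d}" for i in range(200)]
top_heavy_set = set(ranked[:40:2])  # 20 genes, all near the top of the ranking
scattered_set = set(ranked[::10])   # 20 genes spread evenly over the ranking

for name, gs in [("top-heavy", top_heavy_set), ("scattered", scattered_set)]:
    mes = max_enrichment_score(ranked, gs)
    p = permutation_p_value(ranked, gs)
    print(f"{name}: MES = {mes:.2f}, permutation p = {p:.3f}")
```

A set whose members cluster at the top of the ranking reaches a high MES that random sets rarely match, while an evenly scattered set does not.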
  • 20. Cut-off free approach The advantages: ● Robustness against mapping errors that influence counts. ● The set can be detected even if some genes are not present. ● Tolerance if the gene set contains incorrect genes. ● Strong signal even if all genes in the set are only slightly overexpressed.
  • 21. With cut-off applied Genes involved in oxidative phosphorylation. Significant DE genes (p-value < 0.05). Mootha et al. http://www.nature.com/ng/journal/v34/n3/full/ng1180.html
  • 22. Cut-off free approach Genes involved in oxidative phosphorylation are nearly all slightly overexpressed. This can be detected by gene set analysis. Mootha et al. http://www.nature.com/ng/journal/v34/n3/full/ng1180.html
  • 23. GSEA has inspired others. Different methods exist to rank the genes, to calculate the running sum, and to check significance of the running sum. In addition, directionality of the changes can be incorporated. Varemo et al. http://nar.oxfordjournals.org/content/early/2013/02/26/nar.gkt111
  • 24. GSEA has inspired many Piano SPIA
  • 25. Piano provides a consensus output Piano combines different methods and calculates a consensus score. It does this for 5 different types of 'directionality classes'. The main output is a heatmap of the gene sets that are significantly enriched, depleted or just changed. The heatmap shows the sets with their ranks: lower is 'more important'.
  • 26. Piano provides a consensus output 1) distinct-directional down: the gene set as a whole is downregulated. 2) mixed-directional down: a subset of the set is significantly downregulated. 3) non-directional: the set is enriched in significant DE genes, without taking directionality into account. 4) mixed-directional up: a subset of the set is significantly upregulated. 5) distinct-directional up: the gene set as a whole is upregulated.
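The idea of a consensus rank, as described on slide 25, can be illustrated with a small rank aggregation. This is only a sketch of the principle and not Piano's implementation: the method names, gene set names and p-values are hypothetical, the median rank is one possible aggregation among several, and ties are assumed not to occur.

```python
from statistics import median

def ranks_from_p_values(p_values):
    """Map each gene set to its rank (1 = smallest p-value). Assumes no ties."""
    ordered = sorted(p_values, key=p_values.get)
    return {gene_set: i + 1 for i, gene_set in enumerate(ordered)}

def consensus_ranks(p_values_by_method):
    """Median rank of each gene set across methods; lower is 'more important'."""
    per_method = [ranks_from_p_values(p) for p in p_values_by_method.values()]
    gene_sets = per_method[0].keys()
    return {gs: median(r[gs] for r in per_method) for gs in gene_sets}

# Hypothetical p-values for one directionality class, from three hypothetical methods.
p_values_by_method = {
    "method_a": {"set1": 0.001, "set2": 0.20, "set3": 0.03, "set4": 0.60},
    "method_b": {"set1": 0.004, "set2": 0.05, "set3": 0.01, "set4": 0.90},
    "method_c": {"set1": 0.020, "set2": 0.30, "set3": 0.10, "set4": 0.25},
}

consensus = consensus_ranks(p_values_by_method)
for gs in sorted(consensus, key=consensus.get):
    print(gs, consensus[gs])
```

In a full analysis this aggregation would be repeated for each of the five directionality classes, giving one column of ranks per class in the heatmap.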
  • 27. Keywords Gene set, contingency approach, T-statistic, p-value histogram, GSEA, heatmap, directionality of expression changes. Write in your own words what the terms mean.
  • 28. Break
