SlideShare a Scribd company logo
Introduction
                  Normalization
          Differential Expression




   Differential Expression in RNA-Seq


                      Sonia Tarazona
                     starazona@cipf.es




Data analysis workshop for massive sequencing data
                Granada, June 2011

                Sonia Tarazona     Differential Expression in RNA-Seq
Introduction
                                  Normalization
                          Differential Expression



Outline



  1   Introduction



  2   Normalization



  3   Differential Expression




                                Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Some questions
                                  Normalization    Some definitions
                          Differential Expression   RNA-seq expression data



Outline

  1   Introduction

        Some questions

        Some definitions

        RNA-seq expression data



  2   Normalization



  3   Differential Expression




                                Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Some questions
                               Normalization    Some definitions
                       Differential Expression   RNA-seq expression data



Some questions


      How do we measure expression?


      What is differential expression?


      Experimental design in RNA-Seq


      Can we used the same statistics as in microarrays?


      Do I need any “normalization”?




                             Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Some questions
                                  Normalization    Some definitions
                          Differential Expression   RNA-seq expression data



Some definitions
   Expression level
         RNA-Seq: The number of reads (counts) mapping to the biological feature of
                  interest (gene, transcript, exon, etc.) is considered to be linearly
                  related to the abundance of the target feature.
      Microarrays: The abundance of each sequence is a function of the fluorescence
                   level recovered after the hybridization process.




                                Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Some questions
                              Normalization    Some definitions
                      Differential Expression   RNA-seq expression data



Some definitions
      Sequencing depth: Total number of reads mapped to the genome. Library size.


      Gene length: Number of bases.


      Gene counts: Number of reads mapping to that gene (expression measurement).




                            Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Some questions
                                  Normalization    Some definitions
                          Differential Expression   RNA-seq expression data



Some definitions



   What is differential expression?
         A gene is declared differentially expressed if an observed difference or change in
         read counts between two experimental conditions is statistically significant, i.e.
         whether it is greater than what would be expected just due to natural random
         variation.

         Statistical tools are needed to make such a decision by studying counts
         probability distributions.




                                Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Some questions
                                  Normalization    Some definitions
                          Differential Expression   RNA-seq expression data



RNA-Seq expression data


   Experimental design
         Pairwise comparisons: Only two experimental conditions or groups are to be
         compared.
         Multiple comparisons: More than two conditions or groups.

   Replication
         Biological replicates. To draw general conclusions: from samples to population.
         Technical replicates. Conclusions are only valid for compared samples.




                                Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Why?
                                  Normalization    Methods
                          Differential Expression   Examples



Outline

  1   Introduction



  2   Normalization

        Why?

        Methods

        Examples



  3   Differential Expression




                                Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Why?
                               Normalization    Methods
                       Differential Expression   Examples



Why Normalization?
  RNA-seq biases
       Influence of sequencing depth: The higher sequencing depth, the higher counts.




                             Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Why?
                                Normalization    Methods
                        Differential Expression   Examples



Why Normalization?
  RNA-seq biases


       Dependence on gene length: Counts are proportional to the transcript length
       times the mRNA expression level.




                              Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Why?
                               Normalization    Methods
                       Differential Expression   Examples



Why Normalization?

  RNA-seq biases




       Differences on the counts distribution among samples.




                             Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Why?
                                 Normalization    Methods
                         Differential Expression   Examples



Why Normalization?

  RNA-seq biases
        Influence of sequencing depth: The higher sequencing depth, the higher counts.
        Dependence on gene length: Counts are proportional to the transcript length
        times the mRNA expression level.
        Differences on the counts distribution among samples.



  Options
    1   Normalization: Counts should be previously corrected in order to minimize these
        biases.
    2   Statistical model should take them into account.




                               Sonia Tarazona     Differential Expression in RNA-Seq
Introduction    Why?
                                Normalization     Methods
                        Differential Expression    Examples



Some Normalization Methods
      RPKM (Mortazavi et al., 2008): Counts are divided by the transcript length
      (kb) times the total number of millions of mapped reads.

                                        number of reads of the region
                           RPKM =
                                         total reads × region length
                                            1000000            1000


      Upper-quartile (Bullard et al., 2010): Counts are divided by upper-quartile of
      counts for transcripts with at least one read.

      TMM (Robinson and Oshlack, 2010): Trimmed Mean of M values.

      Quantiles, as in microarray normalization (Irizarry et al., 2003).

      FPKM (Trapnell et al., 2010): Instead of counts, Cufflinks software generates
      FPKM values (Fragments Per Kilobase of exon per Million fragments mapped)
      to estimate gene expression, which are analogous to RPKM.



                              Sonia Tarazona      Differential Expression in RNA-Seq
Introduction   Why?
                       Normalization    Methods
               Differential Expression   Examples



Example on RPKM normalization


                                  Sequencing depth
                                  Marioni et al., 2008
                                        Kidney: 9.293.530 reads
                                        Liver: 8.361.601 reads

                                  RPKM normalization
                                        Length: 1500 bases.
                                                                   620
                                        Kidney: RPKM =        9293530 × 1500   = 44,48
                                                                106     1000
                                                                 746
                                        Liver: RPKM =       8361601 × 1500   = 59,48
                                                              106     1000




                     Sonia Tarazona     Differential Expression in RNA-Seq
Introduction   Why?
                  Normalization    Methods
          Differential Expression   Examples



Example




                Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                    Introduction
                                                   NOISeq
                                  Normalization
                                                   Exercises
                          Differential Expression
                                                   Concluding


Outline

  1   Introduction



  2   Normalization



  3   Differential Expression

        Methods

        NOISeq

        Exercises

        Concluding


                                Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                     Introduction
                                                    NOISeq
                                   Normalization
                                                    Exercises
                           Differential Expression
                                                    Concluding


Differential Expression

   Parametric approaches
   Counts are modeled using known probability distributions such as Binomial, Poisson,
   Negative Binomial, etc.

   R packages in Bioconductor:
         edgeR (Robinson et al., 2010): Exact test based on Negative Binomial
         distribution.
         DESeq (Anders and Huber, 2010): Exact test based on Negative Binomial
         distribution.
         DEGseq (Wang et al., 2010): MA-plots based methods (MATR and MARS),
         assuming Normal distribution for M|A.
         baySeq (Hardcastle et al., 2010): Estimation of the posterior likelihood of
         differential expression (or more complex hypotheses) via empirical Bayesian
         methods using Poisson or NB distributions.




                                 Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                   Introduction
                                                    NOISeq
                                 Normalization
                                                    Exercises
                         Differential Expression
                                                    Concluding


Differential Expression

   Non-parametric approaches
   No assumptions about data distribution are made.


        Fisher’s exact test (better with normalized counts).




        cuffdiff (Trapnell et al., 2010): Based on entropy divergence for relative
        transcript abundances. Divergence is a measurement of the ”distance”between
        the relative abundances of transcripts in two difference conditions.
        NOISeq (Tarazona et al.,   coming soon)   −→ http://bioinfo.cipf.es/noiseq/




                               Sonia Tarazona       Differential Expression in RNA-Seq
Methods
                                   Introduction
                                                  NOISeq
                                 Normalization
                                                  Exercises
                         Differential Expression
                                                  Concluding


Differential Expression


   Drawbacks of differential expression methods
        Parametric assumptions: Are they fulfilled?
        Need of replicates.
        Problems to detect differential expression in genes with low counts.




                               Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                   Introduction
                                                  NOISeq
                                 Normalization
                                                  Exercises
                         Differential Expression
                                                  Concluding


Differential Expression


   Drawbacks of differential expression methods
        Parametric assumptions: Are they fulfilled?
        Need of replicates.
        Problems to detect differential expression in genes with low counts.

   NOISeq
        Non-parametric method.
        No need of replicates.
        Less influenced by sequencing depth or number of counts.




                                 Sonia Tarazona   Differential Expression in RNA-Seq
Methods
                                  Introduction
                                                 NOISeq
                                Normalization
                                                 Exercises
                        Differential Expression
                                                 Concluding


NOISeq
  Outline
       Signal distribution. Computing changes in expression of each gene between the
       two experimental conditions.
       Noise distribution. Distribution of changes in expression values when comparing
       replicates within the same condition.
       Differential expression. Comparing signal and noise distributions to determine
       differentially expressed genes.




                              Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                   Introduction
                                                  NOISeq
                                 Normalization
                                                  Exercises
                         Differential Expression
                                                  Concluding


NOISeq

  NOISeq-real
       Replicates are available for each condition.
       Compute M-D in noise by comparing each pair of replicates within the same
       condition.




                               Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                   Introduction
                                                  NOISeq
                                 Normalization
                                                  Exercises
                         Differential Expression
                                                  Concluding


NOISeq

  NOISeq-real
       Replicates are available for each condition.
       Compute M-D in noise by comparing each pair of replicates within the same
       condition.

  NOISeq-sim
       No replicates are available at all.
       NOISeq simulates technical replicates for each condition. The replicates are
       generated from a multinomial distribution taking the counts in the only sample
       as the probabilities for the distribution.
       Compute M-D in noise by comparing each pair of simulated replicates within the
       same condition.




                               Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                    Introduction
                                                   NOISeq
                                  Normalization
                                                   Exercises
                          Differential Expression
                                                   Concluding


NOISeq: Signal distribution
     1   Calculate the expression of each gene at each experimental condition
         (expressioni , for i = 1, 2).
               Technical replicates: expressioni = sum(valuesi )
               Biological replicates: expressioni = mean(valuesi )
               No replicates: expressioni = valuei
     2   Compute for each gene the statistics measuring changes in expression:
                  expression
         M = log2 expression1
                             2
         D = |expression1 –expression2 |




                                Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                 Introduction
                                                NOISeq
                               Normalization
                                                Exercises
                       Differential Expression
                                                Concluding


NOISeq: Noise distribution
      With replicates (NOISeq-real): For each condition, all the possible comparisons
      among the available replicates are used to compute M and D values. All the
      M-D values for all the comparisons, genes and for both conditions are pooled
      together to create the noise distribution.
      If the number of pairwise comparisons for a certain condition is higher than 30,
      only 30 randomly selected comparisons are made.
      Without replicates (NOISeq-sim): Replicates are simulated and the same
      procedure is used to derive noise distribution.




                             Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                  Introduction
                                                 NOISeq
                                Normalization
                                                 Exercises
                        Differential Expression
                                                 Concluding


NOISeq: Differential expression
      Probability for each gene of being differentially expressed: It is obtained by
      comparing M-D values of that gene against noise distribution and computing the
      number of cases in which the values in noise are lower than the values for signal.
      A gene is declared as differentially expressed if this probability is higher than q.
      The threshold q is set to 0.8 by default, since this value is equivalent to an odds
      of 4:1 (the gene in 4 times more likely to be differentially expressed than not).




                              Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                      Introduction
                                                     NOISeq
                                    Normalization
                                                     Exercises
                            Differential Expression
                                                     Concluding


NOISeq


  Input
          Data: datos1, datos2
          Features length: long (only if length correction is to be applied)
          Normalization: norm = {“rpkm”, “uqua”, “tmm”, “none”}; lc = “length
          correction”
          Replicates: repl = {“tech”, “bio”}
          Simulation: nss = “number of replicates to be simulated”; pnr = “total counts
          in each simulated replicate”; v = variability for pnr
          Probability cutoff: q (≥ 0,8)
          Others: k = 0,5




                                  Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                   Introduction
                                                  NOISeq
                                 Normalization
                                                  Exercises
                         Differential Expression
                                                  Concluding


NOISeq



  Output
       Differential expression probability. For each feature, probability of being
       differentially expressed.
       Differentially expressed features. List of features names which are differentially
       expressed according to q cutoff.
       M-D values. For signal (between conditions and for each feature) and for noise
       (among replicates within the same condition, pooled).




                               Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                     Introduction
                                                    NOISeq
                                   Normalization
                                                    Exercises
                           Differential Expression
                                                    Concluding


Exercises
   Execute in an R terminal the code provided in exerciseDEngs.r

   Reading data
   simCount <- readData(file = "simCount.txt", cond1 = c(2:6),
   cond2 = c(7:11), header = TRUE)
   lapply(simCount, head)
   depth <- as.numeric(sapply(simCount, colSums))

   Differential Expression by NOISeq
   NOISeq-real
   res1noiseq <- noiseq(simCount[[1]], simCount[[2]], nss = 0, q = 0.8,
   repl = ‘‘tech’’)
   NOISeq-sim
   res2noiseq <- noiseq(as.matrix(rowSums(simCount[[1]])),
   as.matrix(rowSums(simCount[[2]])), q = 0.8, pnr = 0.2, nss = 5)


   See the tutorial in http://bioinfo.cipf.es/noiseq/ for more information.

                                 Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                 Introduction
                                                NOISeq
                               Normalization
                                                Exercises
                       Differential Expression
                                                Concluding


Some remarks



      Experimental design is decisive to answer correctly your biological questions.


      Differential expression methods for RNA-Seq data must be different to
      microarray methods.


      Normalization should be applied to raw counts, at least a library size correction.




                             Sonia Tarazona     Differential Expression in RNA-Seq
Methods
                                  Introduction
                                                 NOISeq
                                Normalization
                                                 Exercises
                        Differential Expression
                                                 Concluding


For Further Reading


      Oshlack, A., Robinson, M., and Young, M. (2010) From RNA-seq reads to
      differential expression results. Genome Biology , 11, 220+.
      Review on RNA-Seq, including differential expression.
      Auer, P.L., and Doerge R.W. (2010) Statistical Design and Analysis of RNA
      Sequencing Data. Genetics, 185, 405-416.
      Bullard, J. H., Purdom, E., Hansen, K. D., and Dudoit, S. (2010) Evaluation of
      statistical methods for normalization and differential expression in mRNA-Seq
      experiments. BMC Bioinformatics, 11, 94+.
      Normalization methods (including Upper Quartile) and differential expression.
      Tarazona, S., Garc´
                        ıa-Alcalde, F., Dopazo, J., Ferrer A., and Conesa, A.
      (submitted) Differential expression in RNA-seq: a matter of depth.




                              Sonia Tarazona     Differential Expression in RNA-Seq

More Related Content

What's hot

Overview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seqOverview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seq
Alireza Doustmohammadi
 
So you want to do a: RNAseq experiment, Differential Gene Expression Analysis
So you want to do a: RNAseq experiment, Differential Gene Expression AnalysisSo you want to do a: RNAseq experiment, Differential Gene Expression Analysis
So you want to do a: RNAseq experiment, Differential Gene Expression Analysis
University of California, Davis
 
RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3
BITS
 
Introduction to NGS
Introduction to NGSIntroduction to NGS
Introduction to NGS
cursoNGS
 
Kogo 2013 RNA-seq analysis
Kogo 2013 RNA-seq analysisKogo 2013 RNA-seq analysis
Kogo 2013 RNA-seq analysis
Junsu Ko
 
RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5
BITS
 
Single-cell RNA-seq tutorial
Single-cell RNA-seq tutorialSingle-cell RNA-seq tutorial
Single-cell RNA-seq tutorial
Aaron Diaz
 
NGS: Mapping and de novo assembly
NGS: Mapping and de novo assemblyNGS: Mapping and de novo assembly
NGS: Mapping and de novo assembly
Bioinformatics and Computational Biosciences Branch
 
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential Expression
Jatinder Singh
 
Clinical Applications of Next Generation Sequencing
Clinical Applications of Next Generation SequencingClinical Applications of Next Generation Sequencing
Clinical Applications of Next Generation Sequencing
Bell Symposium &amp; MSP Seminar
 
sequencing of genome
sequencing of genomesequencing of genome
sequencing of genomeNaveen Gupta
 
Workshop NGS data analysis - 1
Workshop NGS data analysis - 1Workshop NGS data analysis - 1
Workshop NGS data analysis - 1
Maté Ongenaert
 
Introduction to Single-cell RNA-seq
Introduction to Single-cell RNA-seqIntroduction to Single-cell RNA-seq
Introduction to Single-cell RNA-seq
Timothy Tickle
 
Applications of Single Cell Analysis
Applications of Single  Cell AnalysisApplications of Single  Cell Analysis
Applications of Single Cell Analysis
QIAGEN
 
Next generation sequencing technologies for crop improvement
Next generation sequencing technologies for crop improvementNext generation sequencing technologies for crop improvement
Next generation sequencing technologies for crop improvement
anjaligoud
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
Bioinformatics and Computational Biosciences Branch
 
Part 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goalPart 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goal
Joachim Jacob
 

What's hot (20)

Overview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seqOverview of Single-Cell RNA-seq
Overview of Single-Cell RNA-seq
 
RNAseq Analysis
RNAseq AnalysisRNAseq Analysis
RNAseq Analysis
 
So you want to do a: RNAseq experiment, Differential Gene Expression Analysis
So you want to do a: RNAseq experiment, Differential Gene Expression AnalysisSo you want to do a: RNAseq experiment, Differential Gene Expression Analysis
So you want to do a: RNAseq experiment, Differential Gene Expression Analysis
 
RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3
 
Introduction to NGS
Introduction to NGSIntroduction to NGS
Introduction to NGS
 
Kogo 2013 RNA-seq analysis
Kogo 2013 RNA-seq analysisKogo 2013 RNA-seq analysis
Kogo 2013 RNA-seq analysis
 
RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5RNA-seq for DE analysis: detecting differential expression - part 5
RNA-seq for DE analysis: detecting differential expression - part 5
 
Single-cell RNA-seq tutorial
Single-cell RNA-seq tutorialSingle-cell RNA-seq tutorial
Single-cell RNA-seq tutorial
 
NGS: Mapping and de novo assembly
NGS: Mapping and de novo assemblyNGS: Mapping and de novo assembly
NGS: Mapping and de novo assembly
 
RNA-Seq
RNA-SeqRNA-Seq
RNA-Seq
 
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential Expression
 
Clinical Applications of Next Generation Sequencing
Clinical Applications of Next Generation SequencingClinical Applications of Next Generation Sequencing
Clinical Applications of Next Generation Sequencing
 
sequencing of genome
sequencing of genomesequencing of genome
sequencing of genome
 
Workshop NGS data analysis - 1
Workshop NGS data analysis - 1Workshop NGS data analysis - 1
Workshop NGS data analysis - 1
 
Introduction to Single-cell RNA-seq
Introduction to Single-cell RNA-seqIntroduction to Single-cell RNA-seq
Introduction to Single-cell RNA-seq
 
Applications of Single Cell Analysis
Applications of Single  Cell AnalysisApplications of Single  Cell Analysis
Applications of Single Cell Analysis
 
ChipSeq Data Analysis
ChipSeq Data AnalysisChipSeq Data Analysis
ChipSeq Data Analysis
 
Next generation sequencing technologies for crop improvement
Next generation sequencing technologies for crop improvementNext generation sequencing technologies for crop improvement
Next generation sequencing technologies for crop improvement
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
 
Part 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goalPart 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goal
 

Similar to Differential expression in RNA-Seq

RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1
BITS
 
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
fruitbreedomics
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
ARUNDHATI MEHTA
 
nextgenerationsequencing-170606100132.pdf
nextgenerationsequencing-170606100132.pdfnextgenerationsequencing-170606100132.pdf
nextgenerationsequencing-170606100132.pdf
AkhileshPathak33
 
Catalyzing Plant Science Research with RNA-seq
Catalyzing Plant Science Research with RNA-seqCatalyzing Plant Science Research with RNA-seq
Catalyzing Plant Science Research with RNA-seq
Manjappa Ganiger
 
Gene expression profiling
Gene expression profilingGene expression profiling
Gene expression profiling
PriyankaPriyanka63
 
16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence Analysis16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence AnalysisAbdulrahman Muhammad
 
Bacterial rna sequencing
Bacterial rna sequencingBacterial rna sequencing
Bacterial rna sequencing
Dynah Perry
 
RNA-Seq_Presentation
RNA-Seq_PresentationRNA-Seq_Presentation
RNA-Seq_PresentationToyin23
 
Molecular marker by anil bl gather
Molecular marker by anil bl gatherMolecular marker by anil bl gather
Molecular marker by anil bl gather
ANIL BL GATHER
 
Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02t7260678
 
RNA-seq Data Analysis Overview
RNA-seq Data Analysis OverviewRNA-seq Data Analysis Overview
RNA-seq Data Analysis Overview
Sean Davis
 
DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...
DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...
DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...
Shiv Kalia
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencingDayananda Salam
 
NARASIMHA MURTHY. SNPs.pptx
NARASIMHA MURTHY. SNPs.pptxNARASIMHA MURTHY. SNPs.pptx
NARASIMHA MURTHY. SNPs.pptx
NARASIMHAMURTHY167
 
Nucleic acid and Molecular markers
Nucleic acid and Molecular markersNucleic acid and Molecular markers
Nucleic acid and Molecular markers
Sayali Magar
 
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
VHIR Vall d’Hebron Institut de Recerca
 
Microarray @ujjwal sirohi
Microarray @ujjwal sirohiMicroarray @ujjwal sirohi
Microarray @ujjwal sirohi
ujjwal sirohi
 
Transcriptomics approaches
Transcriptomics approachesTranscriptomics approaches
Transcriptomics approaches
CharupriyaChauhan1
 

Similar to Differential expression in RNA-Seq (20)

RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1
 
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
nextgenerationsequencing-170606100132.pdf
nextgenerationsequencing-170606100132.pdfnextgenerationsequencing-170606100132.pdf
nextgenerationsequencing-170606100132.pdf
 
Catalyzing Plant Science Research with RNA-seq
Catalyzing Plant Science Research with RNA-seqCatalyzing Plant Science Research with RNA-seq
Catalyzing Plant Science Research with RNA-seq
 
Gene expression profiling
Gene expression profilingGene expression profiling
Gene expression profiling
 
16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence Analysis16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence Analysis
 
Bacterial rna sequencing
Bacterial rna sequencingBacterial rna sequencing
Bacterial rna sequencing
 
RNA-Seq_Presentation
RNA-Seq_PresentationRNA-Seq_Presentation
RNA-Seq_Presentation
 
Molecular marker by anil bl gather
Molecular marker by anil bl gatherMolecular marker by anil bl gather
Molecular marker by anil bl gather
 
Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02
 
RNA-seq Data Analysis Overview
RNA-seq Data Analysis OverviewRNA-seq Data Analysis Overview
RNA-seq Data Analysis Overview
 
DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...
DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...
DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
Rna seq
Rna seqRna seq
Rna seq
 
NARASIMHA MURTHY. SNPs.pptx
NARASIMHA MURTHY. SNPs.pptxNARASIMHA MURTHY. SNPs.pptx
NARASIMHA MURTHY. SNPs.pptx
 
Nucleic acid and Molecular markers
Nucleic acid and Molecular markersNucleic acid and Molecular markers
Nucleic acid and Molecular markers
 
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
 
Microarray @ujjwal sirohi
Microarray @ujjwal sirohiMicroarray @ujjwal sirohi
Microarray @ujjwal sirohi
 
Transcriptomics approaches
Transcriptomics approachesTranscriptomics approaches
Transcriptomics approaches
 

More from cursoNGS

Towards an understanding of diversity in biological and biomedical systems
Towards an understanding of diversity in biological and biomedical systemsTowards an understanding of diversity in biological and biomedical systems
Towards an understanding of diversity in biological and biomedical systems
cursoNGS
 
Utilidad de la genómica en la salud humana
Utilidad de la genómica en la salud humanaUtilidad de la genómica en la salud humana
Utilidad de la genómica en la salud humana
cursoNGS
 
NGS analysis of micro-RNA
NGS analysis of micro-RNANGS analysis of micro-RNA
NGS analysis of micro-RNA
cursoNGS
 
Discovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGSDiscovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGS
cursoNGS
 
NGS Data Preprocessing
NGS Data PreprocessingNGS Data Preprocessing
NGS Data Preprocessing
cursoNGS
 
Computational infrastructure for NGS data analysis
Computational infrastructure for NGS data analysisComputational infrastructure for NGS data analysis
Computational infrastructure for NGS data analysis
cursoNGS
 
Linux for bioinformatics
Linux for bioinformaticsLinux for bioinformatics
Linux for bioinformatics
cursoNGS
 
Introduccion a la bioinformatica
Introduccion a la bioinformaticaIntroduccion a la bioinformatica
Introduccion a la bioinformatica
cursoNGS
 

More from cursoNGS (8)

Towards an understanding of diversity in biological and biomedical systems
Towards an understanding of diversity in biological and biomedical systemsTowards an understanding of diversity in biological and biomedical systems
Towards an understanding of diversity in biological and biomedical systems
 
Utilidad de la genómica en la salud humana
Utilidad de la genómica en la salud humanaUtilidad de la genómica en la salud humana
Utilidad de la genómica en la salud humana
 
NGS analysis of micro-RNA
NGS analysis of micro-RNANGS analysis of micro-RNA
NGS analysis of micro-RNA
 
Discovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGSDiscovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGS
 
NGS Data Preprocessing
NGS Data PreprocessingNGS Data Preprocessing
NGS Data Preprocessing
 
Computational infrastructure for NGS data analysis
Computational infrastructure for NGS data analysisComputational infrastructure for NGS data analysis
Computational infrastructure for NGS data analysis
 
Linux for bioinformatics
Linux for bioinformaticsLinux for bioinformatics
Linux for bioinformatics
 
Introduccion a la bioinformatica
Introduccion a la bioinformaticaIntroduccion a la bioinformatica
Introduccion a la bioinformatica
 

Recently uploaded

Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 

Recently uploaded (20)

Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 

Differential expression in RNA-Seq

  • 1. Introduction Normalization Differential Expression Differential Expression in RNA-Seq Sonia Tarazona starazona@cipf.es Data analysis workshop for massive sequencing data Granada, June 2011 Sonia Tarazona Differential Expression in RNA-Seq
  • 2. Introduction Normalization Differential Expression Outline 1 Introduction 2 Normalization 3 Differential Expression Sonia Tarazona Differential Expression in RNA-Seq
  • 3. Introduction Some questions Normalization Some definitions Differential Expression RNA-seq expression data Outline 1 Introduction Some questions Some definitions RNA-seq expression data 2 Normalization 3 Differential Expression Sonia Tarazona Differential Expression in RNA-Seq
  • 4. Introduction Some questions Normalization Some definitions Differential Expression RNA-seq expression data Some questions How do we measure expression? What is differential expression? Experimental design in RNA-Seq Can we used the same statistics as in microarrays? Do I need any “normalization”? Sonia Tarazona Differential Expression in RNA-Seq
  • 5. Introduction Some questions Normalization Some definitions Differential Expression RNA-seq expression data Some definitions Expression level RNA-Seq: The number of reads (counts) mapping to the biological feature of interest (gene, transcript, exon, etc.) is considered to be linearly related to the abundance of the target feature. Microarrays: The abundance of each sequence is a function of the fluorescence level recovered after the hybridization process. Sonia Tarazona Differential Expression in RNA-Seq
  • 6. Introduction Some questions Normalization Some definitions Differential Expression RNA-seq expression data Some definitions Sequencing depth: Total number of reads mapped to the genome. Library size. Gene length: Number of bases. Gene counts: Number of reads mapping to that gene (expression measurement). Sonia Tarazona Differential Expression in RNA-Seq
  • 7. Introduction Some questions Normalization Some definitions Differential Expression RNA-seq expression data Some definitions What is differential expression? A gene is declared differentially expressed if an observed difference or change in read counts between two experimental conditions is statistically significant, i.e. whether it is greater than what would be expected just due to natural random variation. Statistical tools are needed to make such a decision by studying counts probability distributions. Sonia Tarazona Differential Expression in RNA-Seq
  • 8. Introduction Some questions Normalization Some definitions Differential Expression RNA-seq expression data RNA-Seq expression data Experimental design Pairwise comparisons: Only two experimental conditions or groups are to be compared. Multiple comparisons: More than two conditions or groups. Replication Biological replicates. To draw general conclusions: from samples to population. Technical replicates. Conclusions are only valid for compared samples. Sonia Tarazona Differential Expression in RNA-Seq
  • 9. Introduction Why? Normalization Methods Differential Expression Examples Outline 1 Introduction 2 Normalization Why? Methods Examples 3 Differential Expression Sonia Tarazona Differential Expression in RNA-Seq
  • 10. Introduction Why? Normalization Methods Differential Expression Examples Why Normalization? RNA-seq biases Influence of sequencing depth: The higher sequencing depth, the higher counts. Sonia Tarazona Differential Expression in RNA-Seq
  • 11. Introduction Why? Normalization Methods Differential Expression Examples Why Normalization? RNA-seq biases Dependence on gene length: Counts are proportional to the transcript length times the mRNA expression level. Sonia Tarazona Differential Expression in RNA-Seq
  • 12. Introduction Why? Normalization Methods Differential Expression Examples Why Normalization? RNA-seq biases Differences on the counts distribution among samples. Sonia Tarazona Differential Expression in RNA-Seq
  • 13. Introduction Why? Normalization Methods Differential Expression Examples Why Normalization? RNA-seq biases Influence of sequencing depth: The higher sequencing depth, the higher counts. Dependence on gene length: Counts are proportional to the transcript length times the mRNA expression level. Differences on the counts distribution among samples. Options 1 Normalization: Counts should be previously corrected in order to minimize these biases. 2 Statistical model should take them into account. Sonia Tarazona Differential Expression in RNA-Seq
  • 14. Introduction Why? Normalization Methods Differential Expression Examples Some Normalization Methods RPKM (Mortazavi et al., 2008): Counts are divided by the transcript length (kb) times the total number of millions of mapped reads. number of reads of the region RPKM = total reads × region length 1000000 1000 Upper-quartile (Bullard et al., 2010): Counts are divided by upper-quartile of counts for transcripts with at least one read. TMM (Robinson and Oshlack, 2010): Trimmed Mean of M values. Quantiles, as in microarray normalization (Irizarry et al., 2003). FPKM (Trapnell et al., 2010): Instead of counts, Cufflinks software generates FPKM values (Fragments Per Kilobase of exon per Million fragments mapped) to estimate gene expression, which are analogous to RPKM. Sonia Tarazona Differential Expression in RNA-Seq
  • 15. Introduction Why? Normalization Methods Differential Expression Examples Example on RPKM normalization Sequencing depth Marioni et al., 2008 Kidney: 9.293.530 reads Liver: 8.361.601 reads RPKM normalization Length: 1500 bases. 620 Kidney: RPKM = 9293530 × 1500 = 44,48 106 1000 746 Liver: RPKM = 8361601 × 1500 = 59,48 106 1000 Sonia Tarazona Differential Expression in RNA-Seq
  • 16. Introduction Why? Normalization Methods Differential Expression Examples Example Sonia Tarazona Differential Expression in RNA-Seq
  • 17. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding Outline 1 Introduction 2 Normalization 3 Differential Expression Methods NOISeq Exercises Concluding Sonia Tarazona Differential Expression in RNA-Seq
  • 18. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding Differential Expression Parametric approaches Counts are modeled using known probability distributions such as Binomial, Poisson, Negative Binomial, etc. R packages in Bioconductor: edgeR (Robinson et al., 2010): Exact test based on Negative Binomial distribution. DESeq (Anders and Huber, 2010): Exact test based on Negative Binomial distribution. DEGseq (Wang et al., 2010): MA-plots based methods (MATR and MARS), assuming Normal distribution for M|A. baySeq (Hardcastle et al., 2010): Estimation of the posterior likelihood of differential expression (or more complex hypotheses) via empirical Bayesian methods using Poisson or NB distributions. Sonia Tarazona Differential Expression in RNA-Seq
  • 19. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding Differential Expression Non-parametric approaches No assumptions about data distribution are made. Fisher’s exact test (better with normalized counts). cuffdiff (Trapnell et al., 2010): Based on entropy divergence for relative transcript abundances. Divergence is a measurement of the ”distance”between the relative abundances of transcripts in two difference conditions. NOISeq (Tarazona et al., coming soon) −→ http://bioinfo.cipf.es/noiseq/ Sonia Tarazona Differential Expression in RNA-Seq
  • 20. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding Differential Expression Drawbacks of differential expression methods Parametric assumptions: Are they fulfilled? Need of replicates. Problems to detect differential expression in genes with low counts. Sonia Tarazona Differential Expression in RNA-Seq
  • 21. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding Differential Expression Drawbacks of differential expression methods Parametric assumptions: Are they fulfilled? Need of replicates. Problems to detect differential expression in genes with low counts. NOISeq Non-parametric method. No need of replicates. Less influenced by sequencing depth or number of counts. Sonia Tarazona Differential Expression in RNA-Seq
  • 22. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq Outline Signal distribution. Computing changes in expression of each gene between the two experimental conditions. Noise distribution. Distribution of changes in expression values when comparing replicates within the same condition. Differential expression. Comparing signal and noise distributions to determine differentially expressed genes. Sonia Tarazona Differential Expression in RNA-Seq
  • 23. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq NOISeq-real Replicates are available for each condition. Compute M-D in noise by comparing each pair of replicates within the same condition. Sonia Tarazona Differential Expression in RNA-Seq
  • 24. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq NOISeq-real Replicates are available for each condition. Compute M-D in noise by comparing each pair of replicates within the same condition. NOISeq-sim No replicates are available at all. NOISeq simulates technical replicates for each condition. The replicates are generated from a multinomial distribution taking the counts in the only sample as the probabilities for the distribution. Compute M-D in noise by comparing each pair of simulated replicates within the same condition. Sonia Tarazona Differential Expression in RNA-Seq
  • 25. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq: Signal distribution 1 Calculate the expression of each gene at each experimental condition (expressioni , for i = 1, 2). Technical replicates: expressioni = sum(valuesi ) Biological replicates: expressioni = mean(valuesi ) No replicates: expressioni = valuei 2 Compute for each gene the statistics measuring changes in expression: expression M = log2 expression1 2 D = |expression1 –expression2 | Sonia Tarazona Differential Expression in RNA-Seq
  • 26. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq: Noise distribution With replicates (NOISeq-real): For each condition, all the possible comparisons among the available replicates are used to compute M and D values. All the M-D values for all the comparisons, genes and for both conditions are pooled together to create the noise distribution. If the number of pairwise comparisons for a certain condition is higher than 30, only 30 randomly selected comparisons are made. Without replicates (NOISeq-sim): Replicates are simulated and the same procedure is used to derive noise distribution. Sonia Tarazona Differential Expression in RNA-Seq
  • 27. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq: Differential expression Probability for each gene of being differentially expressed: It is obtained by comparing M-D values of that gene against noise distribution and computing the number of cases in which the values in noise are lower than the values for signal. A gene is declared as differentially expressed if this probability is higher than q. The threshold q is set to 0.8 by default, since this value is equivalent to an odds of 4:1 (the gene in 4 times more likely to be differentially expressed than not). Sonia Tarazona Differential Expression in RNA-Seq
  • 28. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq Input Data: datos1, datos2 Features length: long (only if length correction is to be applied) Normalization: norm = {“rpkm”, “uqua”, “tmm”, “none”}; lc = “length correction” Replicates: repl = {“tech”, “bio”} Simulation: nss = “number of replicates to be simulated”; pnr = “total counts in each simulated replicate”; v = variability for pnr Probability cutoff: q (≥ 0,8) Others: k = 0,5 Sonia Tarazona Differential Expression in RNA-Seq
  • 29. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding NOISeq Output Differential expression probability. For each feature, probability of being differentially expressed. Differentially expressed features. List of features names which are differentially expressed according to q cutoff. M-D values. For signal (between conditions and for each feature) and for noise (among replicates within the same condition, pooled). Sonia Tarazona Differential Expression in RNA-Seq
  • 30. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding Exercises Execute in an R terminal the code provided in exerciseDEngs.r Reading data simCount <- readData(file = "simCount.txt", cond1 = c(2:6), cond2 = c(7:11), header = TRUE) lapply(simCount, head) depth <- as.numeric(sapply(simCount, colSums)) Differential Expression by NOISeq NOISeq-real res1noiseq <- noiseq(simCount[[1]], simCount[[2]], nss = 0, q = 0.8, repl = ‘‘tech’’) NOISeq-sim res2noiseq <- noiseq(as.matrix(rowSums(simCount[[1]])), as.matrix(rowSums(simCount[[2]])), q = 0.8, pnr = 0.2, nss = 5) See the tutorial in http://bioinfo.cipf.es/noiseq/ for more information. Sonia Tarazona Differential Expression in RNA-Seq
  • 31. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding Some remarks Experimental design is decisive to answer correctly your biological questions. Differential expression methods for RNA-Seq data must be different to microarray methods. Normalization should be applied to raw counts, at least a library size correction. Sonia Tarazona Differential Expression in RNA-Seq
  • 32. Methods Introduction NOISeq Normalization Exercises Differential Expression Concluding For Further Reading Oshlack, A., Robinson, M., and Young, M. (2010) From RNA-seq reads to differential expression results. Genome Biology , 11, 220+. Review on RNA-Seq, including differential expression. Auer, P.L., and Doerge R.W. (2010) Statistical Design and Analysis of RNA Sequencing Data. Genetics, 185, 405-416. Bullard, J. H., Purdom, E., Hansen, K. D., and Dudoit, S. (2010) Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics, 11, 94+. Normalization methods (including Upper Quartile) and differential expression. Tarazona, S., Garc´ ıa-Alcalde, F., Dopazo, J., Ferrer A., and Conesa, A. (submitted) Differential expression in RNA-seq: a matter of depth. Sonia Tarazona Differential Expression in RNA-Seq