Differential gene expression
This session follows up on transcript quantification of RNAseq data and discusses statistical means of identifying differentially regulated transcripts and isoforms, contrasting these with microarray analysis approaches.

  • http://2.bp.blogspot.com/_BPr6hpMG0tg/TSZdkYDcRvI/AAAAAAAAAjY/ReScIkWNySg/s1600/drink.jpg
    http://www.sciencemag.org/content/291/5507/1260.full?sid=23d07e07-ccc5-4b15-8e6d-934a02e9580c
    http://biostar.stackexchange.com/questions/6638/rna-seq-analysis

Differential gene expression: Presentation Transcript

  • [Pink Sherbet Photography]
    RNAseq analysis: Differential gene expression (2/2)
    Hopscotch and isoforms
    August 25, 2011
  • Quick recap: Mapping and transcript assembly
    Reads -> alignment to reference genome -> transcript assembly
    Resulting file types: BAM, GFF/BED
    “What transcripts are in my samples?”
    [Diagram labels: Projects, Fastq, Mapping, Transcript assembly]
    Garber M, Grabherr MG, Guttman M, Trapnell C. Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods. 2011 PMID: 21623353.
  • RNAseq analysis question
    Is there a difference between the transcriptomes of two different conditions?
    Quantify expression
    Quantify difference
    [Figure labels: Condition 1, Condition 2]
  • RNAseq vs. Expression Array
    RNAseq can capture a larger dynamic range
    RNAseq can handle degraded samples
    Gain additional information
    New transcripts
    (New) isoforms
    Variants
    [Figure labels: Array (flattening out), RNA-seq]
    Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009 PMID: 19015660
  • Challenges
    Strand-specific methods are still biased
    Number of reads does not necessarily correlate with transcript abundance
    Longer transcripts have more reads (fragmentation)
    Technical variability between runs causes different total read counts
    Low abundance does not mean non-functional
    How to quantify expression of isoforms?
    Ozsolak F, Milos PM. RNA sequencing: advances, challenges and opportunities. Nat Rev Genet. 2011 PMID: 21191423
    Garber M, Grabherr MG, Guttman M, Trapnell C. Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods. 2011 PMID: 21623353.
  • Production Informatics and Bioinformatics
    Basic Production Informatics: produce raw sequence reads
    Advanced Production Informatics: map to genome and generate raw genomic features (e.g. SNPs)
    Bioinformatics Research: analyze the data; uncover the biological meaning
    Per one-flowcell project
  • Quantifying expression in RNAseq
    Long genes get more reads
    Normalize: fragments per kilobase of transcript per million mapped reads (FPKM)
    FPKM accounts for the dependency between paired-end reads
    Garber M, Grabherr MG, Guttman M, Trapnell C. Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods. 2011 PMID: 21623353.
    Oshlack A, Wakefield MJ. Transcript length bias in RNA-seq data confounds systems biology. Biol Direct. 2009 PMID: 19371405
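
    The FPKM normalization named on this slide can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular tool: the function name `fpkm` and all example numbers are made up, and the library size is taken as the total number of mapped fragments.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Hypothetical helper for illustration; real tools also correct for
    effective length and sequencing bias.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragment_count / (kilobases * millions)


# Made-up example: the long gene collects four times as many fragments
# only because it is four times as long, so both get the same FPKM.
library_size = 20_000_000
short_gene = fpkm(500, 1_000, library_size)
long_gene = fpkm(2_000, 4_000, library_size)
print(short_gene, long_gene)
```

    Dividing by length removes the "long genes get more reads" effect, and dividing by library size makes runs with different total read counts comparable.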
  • Quantifying expression of overlapping isoforms
    We do not know which isoform the reads from overlapping isoforms actually belong to
    ALEXA-Seq
    counts only the reads that map uniquely to a single isoform
    Isoform-expression methods (Cufflinks)
    likelihood function modeling the sequencing process (not very accurate for lowly expressed transcripts)
    'Exon intersection method' (analogous to expression microarrays)
    counts reads mapped to a gene's constitutive exons (reduces power for differential expression analysis)
    'Exon union method'
    counts all reads mapped to any exon in any of the gene's isoforms (underestimates expression for alternatively spliced genes)
    Garber M, Grabherr MG, Guttman M, Trapnell C. Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods. 2011 PMID: 21623353.
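
    The difference between the exon intersection and exon union methods can be shown with a toy example. This is a simplified sketch under strong assumptions: each read is already assigned to exactly one exon, exons are plain labels rather than genomic intervals, and all names and data are hypothetical.

```python
def constitutive_exons(isoforms):
    """Exons shared by every isoform of the gene (exon intersection)."""
    exon_sets = [set(exons) for exons in isoforms.values()]
    return set.intersection(*exon_sets)


def all_exons(isoforms):
    """Exons present in at least one isoform of the gene (exon union)."""
    exon_sets = [set(exons) for exons in isoforms.values()]
    return set.union(*exon_sets)


def count_reads(read_exons, allowed_exons):
    """Count reads whose assigned exon is in the allowed set."""
    return sum(1 for exon in read_exons if exon in allowed_exons)


# Hypothetical gene with two isoforms; exon e2 is skipped in isoform_B.
isoforms = {
    "isoform_A": ["e1", "e2", "e3"],
    "isoform_B": ["e1", "e3"],
}
# Hypothetical reads, each labelled with the exon it maps to.
read_exons = ["e1", "e1", "e2", "e3", "e2", "e3", "e1"]

intersection_count = count_reads(read_exons, constitutive_exons(isoforms))
union_count = count_reads(read_exons, all_exons(isoforms))
print(intersection_count, union_count)
```

    The intersection count discards the reads on the alternatively spliced exon, which is why that method loses power, while the union count keeps every read but cannot say which isoform produced it.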
  • Differential expression
    What is a statistically significant difference between a set of measurements (expression of a gene) in two populations (conditions)?
    First, estimate variability
    Observe biological variability (needs large numbers of replicates to sample the population)
    Model biological variability
    Model the count variance across replicates as a nonlinear function of the mean counts using parametric approaches (such as the normal and negative binomial distributions) (edgeR, DESeq, Cuffdiff)
    Garber M, Grabherr MG, Guttman M, Trapnell C. Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods. 2011 PMID: 21623353.
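
    One common way to write "variance as a nonlinear function of the mean" is the negative binomial form, variance = mean + dispersion × mean². The sketch below estimates the dispersion for a single gene by the method of moments. It is an illustration only: the function names and replicate counts are hypothetical, and edgeR, DESeq and Cuffdiff use more elaborate estimators that share information across genes.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Negative binomial variance: Poisson noise plus a quadratic term."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate from replicate counts.

    Returns 0.0 when the sample variance does not exceed the mean,
    i.e. when the counts look no more variable than Poisson.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    sample_var = variance(counts)  # unbiased sample variance (n - 1)
    return max((sample_var - mu) / mu ** 2, 0.0)


# Hypothetical counts for one gene across five replicates.
counts = [90, 100, 110, 140, 60]
dispersion = moment_dispersion(counts)
modelled_variance = nb_variance(mean(counts), dispersion)
print(dispersion, modelled_variance)
```

    With only a handful of replicates, a per-gene estimate like this is noisy, which is the reason the tools named above model the mean-variance relationship across many genes instead.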
  • Three things to remember
    RNAseq captures larger dynamic range (more sensitive)
    Additional information compared to arrays (e.g. isoforms)
    Need to make assumptions/compromises (quantification, few replicates)
    [cabbit]
  • Next Weeks: NGS Discussion group Jake’s topic
    Two Weeks:
    Abstract: This session will focus on identifying SNPs from whole genome, exome capture or targeted resequencing data. The approaches of mapping, local realignment, recalibration, SNP calling, and SNP recalibration will be introduced and quality metrics discussed.