Maarten Leerkes PhD
Genome Analysis Specialist
Bioinformatics and Computational Biosciences Branch
Office of Cyber Infrastructure and Computational Biology
RNA-seq with R-bioconductor
Part 1.
BCBB: A Branch Devoted to Bioinformatics and
Computational Biosciences
§  Researchers’ time is increasingly important
§  BCBB saves our collaborators time and effort
§  Researchers speed projects to completion using
BCBB consultation and development services
§  No need to hire extra post docs or use external
consultants or developers
BCBB Staff
§  Bioinformatics Software Developers
§  Computational Biologists
§  Project Managers and Analysts
Contact BCBB…
§  NIH Users: Access a menu of BCBB services on
the NIAID Intranet:
•  http://bioinformatics.niaid.nih.gov/
§  Outside of NIH –
•  search “BCBB” on the NIAID Public Internet Page:
www.niaid.nih.gov
– or – use this direct link
§  Email us at:
•  ScienceApps@niaid.nih.gov
Seminar Follow-Up Site
§  For access to past recordings, handouts, and slides, visit this site from the
NIH network: http://collab.niaid.nih.gov/sites/research/SIG/Bioinformatics/
1. Select a Subject Matter
2. Select a Topic
View:
•  Seminar Details
•  Handout and Reference Docs
•  Relevant Links
•  Seminar Recording Links
Recommended Browsers:
•  IE for Windows
•  Safari for Mac (Firefox on a Mac is incompatible with NIH Authentication technology)
Login
•  If prompted to log in, use “NIH” in front of your username
BCBB: A Branch Devoted to Bioinformatics and
Computational Biosciences
§  Structural Biology
§  Phylogenetics
§  Statistics
§  Sequence Analysis
§  Molecular Dynamics
§  Microarray Analysis
ScienceApps@niaid.nih.gov
https://bioinformatics.niaid.nih.gov (NIAID intranet)
Topics
§  What is R
§  What is Bioconductor
§  What is RNAseq
What is R
§  R is a programming language and software
environment for statistical computing and graphics.
The R language is widely used among statisticians
and data miners for developing statistical software
and for data analysis.
§  R is an implementation of the S programming
language combined with lexical scoping semantics
inspired by Scheme. S was created by John
Chambers while at Bell Labs. There are some
important differences, but much of the code written for
S runs unaltered.
§  R is a GNU project. The source code for the R
software environment is written primarily in C, Fortran,
and R. R is freely available under the GNU General
Public License, and pre-compiled binary versions are
provided for various operating systems. R uses a
command line interface; there are also several
graphical front-ends for it.
DOWNLOAD R FROM CRAN:
http://cran.r-project.org/
What is Bioconductor
What is RNAseq
§  RNA-seq (RNA Sequencing), also called Whole
Transcriptome Shotgun Sequencing (WTSS), is a
technology that uses the capabilities of
next-generation sequencing to reveal a snapshot of
RNA presence and quantity from a genome at a
given moment in time.
Topics
§  What is R
§  What is Bioconductor
§  What is RNAseq
§  Comes together in: RNA-seq with R-bioconductor
Different kinds of objects in R
§  The following data objects exist in R:
•  vectors
•  lists
•  arrays
•  matrices
•  tables
•  data frames
§  Some of these are more important than others, and
there are more.
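As a minimal sketch of the object types listed above, the following base R snippet builds one small instance of each. All gene names, sample names, and values are made up for illustration.

```r
# Illustrative values only; gene and sample names are hypothetical.
counts_vec <- c(geneA = 10, geneB = 0, geneC = 25)            # vector: one type, optional names
sample_info <- list(id = "s1", treated = TRUE, reads = 1e6)   # list: elements may differ in type
count_mat <- matrix(1:6, nrow = 3, ncol = 2,
                    dimnames = list(c("geneA", "geneB", "geneC"),
                                    c("s1", "s2")))           # matrix: two dimensions
count_arr <- array(1:12, dim = c(3, 2, 2))                    # array: here three dimensions
cond_tab <- table(c("treated", "untreated", "treated"))       # table: counts per category
samples_df <- data.frame(sample = c("s1", "s2"),
                         condition = c("treated", "untreated"),
                         stringsAsFactors = FALSE)            # data frame: columns of equal length

# Inspect the structure of any object with str()
str(samples_df)
```

Calling `class()` or `str()` on each object is a quick way to check which kind of object a function has returned.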
A data frame is used for storing data
tables. It is a list of vectors of equal length.
§  A data frame is a table, or two-dimensional array-like
structure, in which each column contains
measurements on one variable, and each row
contains one case. As we shall see, a "case" is not
necessarily the same as an experimental subject or
unit, although they are often the same.
Combine a list of data frames into a single data frame, adding a
column with the list index. Each data frame is itself a list of vectors of equal length.
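One possible way to do this in base R is sketched below. The two small data frames and the column name `list_index` are assumptions made for the example, not part of the original demo.

```r
# Two made-up data frames with the same columns
df_list <- list(
  data.frame(gene = c("geneA", "geneB"), count = c(10, 0)),
  data.frame(gene = c("geneA", "geneB"), count = c(12, 3))
)

# Add the position in the list as a new column of each data frame
with_index <- lapply(seq_along(df_list), function(i) {
  cbind(list_index = i, df_list[[i]])
})

# Stack the pieces row-wise into a single data frame
combined <- do.call(rbind, with_index)
print(combined)

# A data frame is a list whose elements (columns) all have the same length
sapply(df_list[[1]], length)
```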
Methods: Software Carpentry:
http://swcarpentry.github.io/r-novice-inflammation/01-starting-with-data.html
RNA-seq with R
Demo: easyRNASeq
source("c:/windows/myname/rna_seq_tutorial.R")
source("/vol/maarten/rna_seq_tutorial2.R")
http://bioscholar.com/genomics/bioconductor-packages-analysis-rna-seq-data/
Current working directory cwd
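The relationship between the working directory and `source()` can be sketched as follows. The script name `rna_seq_demo.R` and its one-line content are hypothetical, and the file is written to a temporary directory so that nothing on disk is overwritten.

```r
# Create a tiny, hypothetical script in a temporary directory
script_dir <- tempdir()
script_path <- file.path(script_dir, "rna_seq_demo.R")
writeLines("demo_value <- 6 * 7", script_path)

getwd()                        # show the current working directory
old_wd <- setwd(script_dir)    # change it; setwd() returns the previous directory
source("rna_seq_demo.R")       # a relative path is resolved against the working directory
setwd(old_wd)                  # restore the original working directory

demo_value                     # object created by the sourced script
```

Using forward slashes or `file.path()` avoids the problem of backslashes being treated as escape characters in Windows paths.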
Topics: start R
Topics: use R console and R command line
Sequencing by synthesis
§  Intro to Sequencing by Synthesis:
§  https://www.youtube.com/watch?v=HMyCqWhwB8E
FASTQ read with 50nt in Illumina format (ASCII_BASE=33).
There are always four lines per read.
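To make the four-line layout and the ASCII_BASE=33 encoding concrete, here is a small base R sketch. The read below is made up and much shorter than 50 nt; the variable names are illustrative only.

```r
# A made-up read: header, sequence, separator, quality string
fastq_lines <- c(
  "@read1",
  "GATTACA",
  "+",
  "IIII#5+"
)
stopifnot(length(fastq_lines) %% 4 == 0)   # four lines per read

record <- list(
  id       = sub("^@", "", fastq_lines[1]),
  sequence = fastq_lines[2],
  quality  = fastq_lines[4]
)

# Phred+33: quality score = ASCII code of the character minus 33
phred <- utf8ToInt(record$quality) - 33L

# Error probability implied by each score: 10^(-Q/10)
p_error <- 10^(-phred / 10)

data.frame(base = strsplit(record$sequence, "")[[1]], phred = phred, p_error = p_error)
```

For real files, Bioconductor packages such as ShortRead are the usual choice; this sketch only shows what the encoding means.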
Paired end: read 1 in one fastq file
Paired end: read 2 in another fastq file
Numerous possible analysis strategies
§  There is no one ‘correct’ way to analyze RNA-seq data
§  Two major branches
•  Direct alignment of reads (spliced or unspliced) to genome or transcriptome
•  Assembly of reads followed by alignment*
*Assembly is the only option when working with a creature with no genome sequence;
alignment of contigs may be to ESTs, cDNAs, etc.
Image from Haas & Zody, 2010
Illumina clonal expansion followed by image processing
Pile up sequences to reference genome
SAM format: what are SAM/BAM files
http://biobits.org/samtools_primer.html
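As a rough illustration of what one alignment record in a SAM file contains, the base R sketch below splits a single made-up line into the 11 mandatory tab-separated fields and decodes two bits of the FLAG field. The read name, position, and sequence are invented for the example.

```r
# A made-up alignment line with the 11 mandatory SAM fields
sam_line <- paste("read1", "16", "chr1", "100", "60", "7M", "*", "0", "0",
                  "TGTAATC", "IIIIIII", sep = "\t")

field_names <- c("QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
                 "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL")

fields <- strsplit(sam_line, "\t", fixed = TRUE)[[1]]
aln <- setNames(as.list(fields[1:11]), field_names)

# The FLAG is a bit field; test individual bits with bitwAnd()
flag <- as.integer(aln$FLAG)
is_unmapped <- bitwAnd(flag, 4L) != 0    # 0x4: read unmapped
is_reverse  <- bitwAnd(flag, 16L) != 0   # 0x10: read aligned to the reverse strand

str(aln)
```

BAM is the compressed binary form of the same records; in practice it is read with tools such as samtools or the Bioconductor package Rsamtools rather than parsed by hand.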
RNA sequencing: abundance comparisons between two or more conditions / phenotypes
Samples of interest
Condition 1 (normal tissue)
Condition 2 (diseased tissue)
Isolate RNAs
Generate cDNA, fragment, size select, add linkers
Sequence ends
100s of millions of paired reads
10s of billions bases of sequence
Map to genome, transcriptome, and predicted exon junctions
Downstream analysis
Compare two samples for abundance differences
Transcript abundances differ in pile-up
Genes have ‘structure’, solve by mapping
§  This leads, for example, to analysis of intron-exon
structure
Genes and transcripts
Current paradigm: “cuff-suite”
Common analysis goals of RNA-Seq analysis (what can you ask of the data?)
§  Gene expression and differential expression
§  Alternative expression analysis
§  Transcript discovery and annotation
§  Allele specific expression
•  Relating to SNPs or mutations
§  Mutation discovery
§  Fusion detection
§  RNA editing
Back to the demo
§  Introduction to RNA sequencing
§  Rationale for RNA sequencing (versus DNA sequencing)
§  Hands on tutorial
DESeq and DESeq2
§  Method based on the negative binomial distribution,
with variance and mean linked by local regression
§  DESeq2:
§  No demo scripts available yet:
§  http://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf
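The reason for using the negative binomial rather than the Poisson distribution can be sketched in base R: with a dispersion parameter, the variance grows as mean + dispersion × mean², whereas for Poisson it equals the mean. The values of `mu` and `dispersion` below are made up, and the dispersion is fixed by hand rather than estimated by local regression as DESeq does.

```r
mu <- 100           # mean count (made-up value)
dispersion <- 0.1   # made-up dispersion; DESeq estimates this from the data
size <- 1 / dispersion   # R parameterises the negative binomial by size = 1 / dispersion

# Theoretical variances
pois_var <- mu                        # Poisson: variance equals the mean
nb_var <- mu + dispersion * mu^2      # negative binomial: extra quadratic term

# Numerical check of the negative binomial variance from its probability mass function
k <- 0:5000
pmf <- dnbinom(k, size = size, mu = mu)
num_mean <- sum(k * pmf)
num_var <- sum((k - num_mean)^2 * pmf)

c(poisson = pois_var, neg_binomial = nb_var, numerical = num_var)
```

The gap between the two variances widens for highly expressed genes, which is why count data with biological replicates are usually overdispersed relative to Poisson.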
The empirical frequency distribution of the hybridization signal intensity values for
Affymetrix microarray hybridization data for normal yeast cell genes/ORFs (Jelinsky
and Samson 1999).
Kuznetsov V A et al. Genetics 2002;161:1321-1332
Copyright © 2002 by the Genetics Society of America
Empirical relative frequency distributions of the gene expression levels.
Kuznetsov V A et al. Genetics 2002;161:1321-1332
Copyright © 2002 by the Genetics Society of America
Empirical (black dots) and fitted (red lines)
dispersion values plotted against the mean of the
normalised counts.
Plot of normalised mean versus log2 fold change
for the contrast untreated versus treated.
Histogram of p-values from the call to
nbinomTest.
MvA plot for the contrast “treated” vs. “untreated”, using two
treated and only one untreated sample.
Heatmaps showing the expression data of
the 30 most highly expressed genes
Heatmap showing the Euclidean distances between the
samples as calculated from the variance stabilising
transformation of the count data.
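A sample-to-sample distance matrix of this kind can be sketched in base R as follows. The count matrix is made up, and `log2(counts + 1)` is used here only as a simple stand-in for the variance stabilising transformation mentioned in the caption.

```r
# Made-up counts: rows are genes, columns are samples
counts <- matrix(c(10, 12, 100,
                    0,  1,  50,
                   25, 30,   5),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("s1", "s2", "s3")))

# Stand-in transformation (an assumption, not the DESeq transformation)
log_counts <- log2(counts + 1)

# dist() computes distances between rows, so transpose to compare samples
sample_dist <- dist(t(log_counts), method = "euclidean")
dist_mat <- as.matrix(sample_dist)
print(round(dist_mat, 2))

# heatmap(dist_mat, symm = TRUE)   # uncomment to draw the heatmap
```

In this toy example s1 and s2 have similar profiles, so their distance is much smaller than the distance of either to s3.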
Biological effects of condition and libType
Mean expression versus log2 fold change
plot. Significant hits (at padj<0.1) are
coloured in red.
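The `padj` threshold refers to p-values adjusted for multiple testing. A minimal sketch with base R's `p.adjust()` is shown below, assuming the Benjamini-Hochberg method (the default in DESeq2); the p-values are made up.

```r
# Made-up raw p-values for five genes
pvals <- c(geneA = 0.001, geneB = 0.008, geneC = 0.039, geneD = 0.041, geneE = 0.60)

# Benjamini-Hochberg adjustment controls the false discovery rate
padj <- p.adjust(pvals, method = "BH")

# Genes that would be highlighted at the padj < 0.1 threshold
significant <- padj < 0.1

data.frame(pvalue = pvals, padj = padj, significant = significant)
```

Adjusted values are never smaller than the raw p-values, so fewer genes pass the threshold than with an unadjusted cutoff of the same size.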
Per-gene dispersion estimates (shown by
points) and the fitted mean-dispersion
function (red line).
Differential exon usage
§  Detecting spliced isoform usage by exon-level
expression analysis
Types of splicing
Expression estimates from a call to testForDEU.
Shown in red is the exon that showed significant
differential exon usage.
Normalized counts. As in the previous figure,
with normalized count values of each exon
in each of the samples.
Estimated effects, but after subtraction of
overall changes in gene expression.
Dependence of dispersion on the mean
Distributions of fold changes of exon usage
Resources: RNA-Seq workflow, gene-level
exploratory analysis and differential expression
Outline
§  Introduction to RNA sequencing
§  Rationale for RNA sequencing (versus DNA sequencing)
§  Hands on tutorial
§  http://swcarpentry.github.io/r-novice-inflammation/
§  http://swcarpentry.github.io/r-novice-inflammation/02-func-R.html
§  http://www.bioconductor.org/help/workflows/
§  http://www.bioconductor.org/packages/release/data/experiment/html/parathyroidSE.html
§  http://www.bioconductor.org/help/workflows/rnaseqGene/
About Bioconductor
High-throughput sequence analysis with R and Bioconductor:
http://www.bioconductor.org/help/course-materials/2013/useR2013/Bioconductor-tutorial.pdf
http://bioconductor.org/packages/2.13/data/experiment/vignettes/RnaSeqTutorial/inst/doc/RnaSeqTutorial.pdf
Also helpful: http://www.bioconductor.org/help/course-materials/2002/Summer02Course/Labs/basics.pdf
http://www.nature.com/nprot/journal/v8/n9/pdf/nprot.2013.099.pdf
The End

More Related Content

What's hot

Swaati algorithm of alignment ppt
Swaati algorithm of alignment pptSwaati algorithm of alignment ppt
Swaati algorithm of alignment ppt
Swati Kumari
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
Bioinformatics and Computational Biosciences Branch
 
Pathway analysis 2012
Pathway analysis 2012Pathway analysis 2012
Pathway analysis 2012
Stephen Turner
 
Primary and secondary database
Primary and secondary databasePrimary and secondary database
Primary and secondary database
KAUSHAL SAHU
 
BITS: Basics of Sequence similarity
BITS: Basics of Sequence similarityBITS: Basics of Sequence similarity
BITS: Basics of Sequence similarity
BITS
 
SWISS-PROT
SWISS-PROTSWISS-PROT
Biological databases
Biological databasesBiological databases
Biological databases
Afra Fathima
 
Sequence file formats
Sequence file formatsSequence file formats
Sequence file formats
Alphonsa Joseph
 
Biotechnology information system in india (btis net)
Biotechnology information system in india (btis net)Biotechnology information system in india (btis net)
Biotechnology information system in india (btis net)
KAUSHAL SAHU
 
(Expasy)
(Expasy)(Expasy)
(Expasy)
Mazhar Khan
 
Protein database
Protein databaseProtein database
Protein database
Khalid Hakeem
 
Workshop NGS data analysis - 1
Workshop NGS data analysis - 1Workshop NGS data analysis - 1
Workshop NGS data analysis - 1
Maté Ongenaert
 
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
SELF-EXPLANATORY
 
Blast
BlastBlast
EMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology LaboratoryEMBL- European Molecular Biology Laboratory
Swiss prot
Swiss protSwiss prot
Swiss prot
Shikha Thakur
 
Ddbj
DdbjDdbj
Sequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsSequence Alignment In Bioinformatics
Sequence Alignment In Bioinformatics
Nikesh Narayanan
 
Protein databases
Protein databasesProtein databases
Protein databases
bansalaman80
 

What's hot (20)

Swaati algorithm of alignment ppt
Swaati algorithm of alignment pptSwaati algorithm of alignment ppt
Swaati algorithm of alignment ppt
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
 
Pathway analysis 2012
Pathway analysis 2012Pathway analysis 2012
Pathway analysis 2012
 
Intro to databases
Intro to databasesIntro to databases
Intro to databases
 
Primary and secondary database
Primary and secondary databasePrimary and secondary database
Primary and secondary database
 
BITS: Basics of Sequence similarity
BITS: Basics of Sequence similarityBITS: Basics of Sequence similarity
BITS: Basics of Sequence similarity
 
SWISS-PROT
SWISS-PROTSWISS-PROT
SWISS-PROT
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Sequence file formats
Sequence file formatsSequence file formats
Sequence file formats
 
Biotechnology information system in india (btis net)
Biotechnology information system in india (btis net)Biotechnology information system in india (btis net)
Biotechnology information system in india (btis net)
 
(Expasy)
(Expasy)(Expasy)
(Expasy)
 
Protein database
Protein databaseProtein database
Protein database
 
Workshop NGS data analysis - 1
Workshop NGS data analysis - 1Workshop NGS data analysis - 1
Workshop NGS data analysis - 1
 
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
 
Blast
BlastBlast
Blast
 
EMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology LaboratoryEMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology Laboratory
 
Swiss prot
Swiss protSwiss prot
Swiss prot
 
Ddbj
DdbjDdbj
Ddbj
 
Sequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsSequence Alignment In Bioinformatics
Sequence Alignment In Bioinformatics
 
Protein databases
Protein databasesProtein databases
Protein databases
 

Viewers also liked

Network components and biological network construction methods
Network components and biological network construction methodsNetwork components and biological network construction methods
Network components and biological network construction methods
Bioinformatics and Computational Biosciences Branch
 
Biological networks
Biological networksBiological networks
Biological networks - building and visualizing
Biological networks - building and visualizingBiological networks - building and visualizing
Biological networks - building and visualizing
Bioinformatics and Computational Biosciences Branch
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
Bioinformatics and Computational Biosciences Branch
 

Viewers also liked (8)

Design of experiments
Design of experiments Design of experiments
Design of experiments
 
Network components and biological network construction methods
Network components and biological network construction methodsNetwork components and biological network construction methods
Network components and biological network construction methods
 
Cytoscape
CytoscapeCytoscape
Cytoscape
 
Biological networks
Biological networksBiological networks
Biological networks
 
Biological networks - building and visualizing
Biological networks - building and visualizingBiological networks - building and visualizing
Biological networks - building and visualizing
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
 
ChIP-seq Theory
ChIP-seq TheoryChIP-seq Theory
ChIP-seq Theory
 
RNA-Seq
RNA-SeqRNA-Seq
RNA-Seq
 

Similar to RNA-Seq with R-Bioconductor

NGS: Mapping and de novo assembly
NGS: Mapping and de novo assemblyNGS: Mapping and de novo assembly
NGS: Mapping and de novo assembly
Bioinformatics and Computational Biosciences Branch
 
RNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGSRNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGS
HAMNAHAMNA8
 
Dgaston dec-06-2012
Dgaston dec-06-2012Dgaston dec-06-2012
Dgaston dec-06-2012
Dan Gaston
 
rnaseq2015-02-18-170327193409.pdf
rnaseq2015-02-18-170327193409.pdfrnaseq2015-02-18-170327193409.pdf
rnaseq2015-02-18-170327193409.pdf
Pushpendra83
 
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential Expression
Jatinder Singh
 
Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)
Gunnar Rätsch
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
rashabakkour
 
Bioinformatics class ppt arifuzzaman
Bioinformatics class ppt arifuzzamanBioinformatics class ppt arifuzzaman
Bioinformatics class ppt arifuzzaman
Sardar Arifuzzaman
 
Enabling Large Scale Sequencing Studies through Science as a Service
Enabling Large Scale Sequencing Studies through Science as a ServiceEnabling Large Scale Sequencing Studies through Science as a Service
Enabling Large Scale Sequencing Studies through Science as a Service
Justin Johnson
 
Molecular Biology Software Links
Molecular Biology Software LinksMolecular Biology Software Links
Molecular Biology Software Links
university of education,Lahore
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
GenomeInABottle
 
exRNA Data Analysis Tools in the Genboree Workbench
exRNA Data Analysis Tools in the Genboree WorkbenchexRNA Data Analysis Tools in the Genboree Workbench
exRNA Data Analysis Tools in the Genboree Workbench
exrna
 
Introduction to Apollo for i5k
Introduction to Apollo for i5kIntroduction to Apollo for i5k
Introduction to Apollo for i5k
Monica Munoz-Torres
 
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Elia Brodsky
 
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing TracesDetecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Thermo Fisher Scientific
 
The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...
The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...
The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...
Mackenna Galicia
 
Introduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) TechnologyIntroduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) Technology
QIAGEN
 
OVium Bioinformatic Solutions
OVium Bioinformatic SolutionsOVium Bioinformatic Solutions
OVium Bioinformatic Solutions
OVium Solutions
 
Exploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVS
Exploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVSExploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVS
Exploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVS
Golden Helix Inc
 

Similar to RNA-Seq with R-Bioconductor (20)

NGS: Mapping and de novo assembly
NGS: Mapping and de novo assemblyNGS: Mapping and de novo assembly
NGS: Mapping and de novo assembly
 
RNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGSRNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGS
 
Dgaston dec-06-2012
Dgaston dec-06-2012Dgaston dec-06-2012
Dgaston dec-06-2012
 
rnaseq2015-02-18-170327193409.pdf
rnaseq2015-02-18-170327193409.pdfrnaseq2015-02-18-170327193409.pdf
rnaseq2015-02-18-170327193409.pdf
 
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential Expression
 
Use of NCBI Databases in qPCR Assay Design
Use of NCBI Databases in qPCR Assay DesignUse of NCBI Databases in qPCR Assay Design
Use of NCBI Databases in qPCR Assay Design
 
Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Bioinformatics class ppt arifuzzaman
Bioinformatics class ppt arifuzzamanBioinformatics class ppt arifuzzaman
Bioinformatics class ppt arifuzzaman
 
Enabling Large Scale Sequencing Studies through Science as a Service
Enabling Large Scale Sequencing Studies through Science as a ServiceEnabling Large Scale Sequencing Studies through Science as a Service
Enabling Large Scale Sequencing Studies through Science as a Service
 
Molecular Biology Software Links
Molecular Biology Software LinksMolecular Biology Software Links
Molecular Biology Software Links
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
 
exRNA Data Analysis Tools in the Genboree Workbench
exRNA Data Analysis Tools in the Genboree WorkbenchexRNA Data Analysis Tools in the Genboree Workbench
exRNA Data Analysis Tools in the Genboree Workbench
 
Introduction to Apollo for i5k
Introduction to Apollo for i5kIntroduction to Apollo for i5k
Introduction to Apollo for i5k
 
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
 
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing TracesDetecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
 
The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...
The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...
The Use of K-mer Minimizers to Identify Bacterium Genomes in High Throughput ...
 
Introduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) TechnologyIntroduction to Next-Generation Sequencing (NGS) Technology
Introduction to Next-Generation Sequencing (NGS) Technology
 
OVium Bioinformatic Solutions
OVium Bioinformatic SolutionsOVium Bioinformatic Solutions
OVium Bioinformatic Solutions
 
Exploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVS
Exploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVSExploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVS
Exploring DNA/RNA-Seq Analysis Results with Golden Helix GenomeBrowse and SVS
 

More from Bioinformatics and Computational Biosciences Branch

Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019
Bioinformatics and Computational Biosciences Branch
 
Nephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele resultsNephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele results
Bioinformatics and Computational Biosciences Branch
 
Introduction to METAGENOTE
Introduction to METAGENOTE Introduction to METAGENOTE
Intro to homology modeling
Intro to homology modelingIntro to homology modeling
Protein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modelingProtein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modeling
Bioinformatics and Computational Biosciences Branch
 
Homology modeling: Modeller
Homology modeling: ModellerHomology modeling: Modeller
Protein function prediction
Protein function predictionProtein function prediction
Protein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on RosettaProtein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on Rosetta
Bioinformatics and Computational Biosciences Branch
 
UNIX Basics and Cluster Computing
UNIX Basics and Cluster ComputingUNIX Basics and Cluster Computing
UNIX Basics and Cluster Computing
Bioinformatics and Computational Biosciences Branch
 
Statistical applications in GraphPad Prism
Statistical applications in GraphPad PrismStatistical applications in GraphPad Prism
Statistical applications in GraphPad Prism
Bioinformatics and Computational Biosciences Branch
 
Intro to JMP for statistics
Intro to JMP for statisticsIntro to JMP for statistics
Categorical models
Categorical modelsCategorical models
Automating biostatistics workflows using R-based webtools
Automating biostatistics workflows using R-based webtoolsAutomating biostatistics workflows using R-based webtools
Automating biostatistics workflows using R-based webtools
Bioinformatics and Computational Biosciences Branch
 
Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)
Bioinformatics and Computational Biosciences Branch
 
Overview of statistics: Statistical testing (Part I)
Overview of statistics: Statistical testing (Part I)Overview of statistics: Statistical testing (Part I)
Overview of statistics: Statistical testing (Part I)
Bioinformatics and Computational Biosciences Branch
 
GraphPad Prism: Curve fitting
GraphPad Prism: Curve fittingGraphPad Prism: Curve fitting
Appendix: Crash course in R and BioConductor
Appendix: Crash course in R and BioConductorAppendix: Crash course in R and BioConductor
Appendix: Crash course in R and BioConductor
Bioinformatics and Computational Biosciences Branch
 

More from Bioinformatics and Computational Biosciences Branch (20)

Hong_Celine_ES_workshop.pptx
Hong_Celine_ES_workshop.pptxHong_Celine_ES_workshop.pptx
Hong_Celine_ES_workshop.pptx
 
Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019
 
Nephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele resultsNephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele results
 
Introduction to METAGENOTE
Introduction to METAGENOTE Introduction to METAGENOTE
Introduction to METAGENOTE
 
Intro to homology modeling
Intro to homology modelingIntro to homology modeling
Intro to homology modeling
 
Protein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modelingProtein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modeling
 
Homology modeling: Modeller
Homology modeling: ModellerHomology modeling: Modeller
Homology modeling: Modeller
 
Protein docking
Protein dockingProtein docking
Protein docking
 
Protein function prediction
Protein function predictionProtein function prediction
Protein function prediction
 
Protein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on RosettaProtein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on Rosetta
 
UNIX Basics and Cluster Computing
UNIX Basics and Cluster ComputingUNIX Basics and Cluster Computing
UNIX Basics and Cluster Computing
 
Statistical applications in GraphPad Prism
Statistical applications in GraphPad PrismStatistical applications in GraphPad Prism
Statistical applications in GraphPad Prism
 
Intro to JMP for statistics
Intro to JMP for statisticsIntro to JMP for statistics
Intro to JMP for statistics
 
Categorical models
Categorical modelsCategorical models
Categorical models
 
Better graphics in R
Better graphics in RBetter graphics in R
Better graphics in R
 
Automating biostatistics workflows using R-based webtools
Automating biostatistics workflows using R-based webtoolsAutomating biostatistics workflows using R-based webtools
Automating biostatistics workflows using R-based webtools
 
Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)
 
Overview of statistics: Statistical testing (Part I)
Overview of statistics: Statistical testing (Part I)Overview of statistics: Statistical testing (Part I)
Overview of statistics: Statistical testing (Part I)
 
GraphPad Prism: Curve fitting
GraphPad Prism: Curve fittingGraphPad Prism: Curve fitting
GraphPad Prism: Curve fitting
 
Appendix: Crash course in R and BioConductor
Appendix: Crash course in R and BioConductorAppendix: Crash course in R and BioConductor
Appendix: Crash course in R and BioConductor
 

RNA-Seq with R-Bioconductor

  • 1. Date Maarten Leerkes PhD Genome Analysis Specialist Bioinformatics and Computational Biosciences Branch Office of Cyber Infrastructure and Computational Biology RNA-seq with R-bioconductor Part 1.
  • 2. BCBB: A Branch Devoted to Bioinformatics and Computational Biosciences § Researchers’ time is increasingly important § BCBB saves our collaborators time and effort § Researchers speed projects to completion using BCBB consultation and development services § No need to hire extra postdocs or use external consultants or developers
  • 4. Contact BCBB… § NIH users: access a menu of BCBB services on the NIAID Intranet: • http://bioinformatics.niaid.nih.gov/ § Outside of NIH: • search “BCBB” on the NIAID Public Internet Page: www.niaid.nih.gov – or – use this direct link § Email us at: • ScienceApps@niaid.nih.gov
  • 5. Seminar Follow-Up Site § For access to past recordings, handouts, and slides, visit this site from the NIH network: http://collab.niaid.nih.gov/sites/research/SIG/Bioinformatics/ 1. Select a Subject Matter 2. Select a Topic View: • Seminar Details • Handout and Reference Docs • Relevant Links • Seminar Recording Links Recommended Browsers: • IE for Windows • Safari for Mac (Firefox on a Mac is incompatible with NIH Authentication technology) Login • If prompted to log in, use “NIH” in front of your username
  • 6. ScienceApps@niaid.nih.gov https://bioinformatics.niaid.nih.gov (NIAID intranet) Structural Biology Phylogenetics Statistics Sequence Analysis Molecular Dynamics Microarray Analysis BCBB: A Branch Devoted to Bioinformatics and Computational Biosciences
  • 7. Topics § What is R § What is Bioconductor § What is RNAseq
  • 8. What is R § R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software[2][3] and data analysis.
  • 9. What is R § R is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. S was created by John Chambers while at Bell Labs. There are some important differences, but much of the code written for S runs unaltered.
  • 10. What is R § R is a GNU project. The source code for the R software environment is written primarily in C, Fortran, and R. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems. R uses a command line interface; there are also several graphical front-ends for it.
  • 11. Download R from CRAN: http://cran.r-project.org/
  • 13. Topics § What is R § What is Bioconductor § What is RNAseq
  • 15. Topics § What is R § What is Bioconductor § What is RNAseq
  • 16. What is RNAseq § RNA-seq (RNA Sequencing), also called Whole Transcriptome Shotgun Sequencing (WTSS), is a technology that uses the capabilities of next-generation sequencing to reveal a snapshot of RNA presence and quantity from a genome at a given moment in time.
  • 17. Topics § What is R § What is Bioconductor § What is RNAseq § Comes together in: RNA-seq with R-bioconductor
  • 18. Different kinds of objects in R § Objects. § The following data objects exist in R: § vectors § lists § arrays § matrices § tables § data frames § Some of these are more important than others. And there are more.
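The object types listed on this slide can be tried directly at the R console. The following is a minimal sketch in base R; all values and names (`geneA`, `s1`, and so on) are made up for illustration and do not come from the seminar.

```r
# One small example of each object type named on the slide.
v  <- c(2.5, 3.1, 4.8)                            # vector: elements share one type
l  <- list(gene = "geneA", counts = c(10L, 12L))  # list: elements may differ in type
m  <- matrix(1:6, nrow = 2, ncol = 3)             # matrix: two-dimensional
a  <- array(1:24, dim = c(2, 3, 4))               # array: any number of dimensions
tb <- table(c("ctrl", "treated", "treated"))      # table: counts per category
df <- data.frame(sample = c("s1", "s2"),
                 reads  = c(1200L, 950L))          # data frame: columns of equal length

# class() reports what kind of object each one is.
classes <- sapply(list(v = v, l = l, m = m, a = a, tb = tb, df = df),
                  function(x) class(x)[1])
print(classes)
```

Calling `str()` on any of these objects is a quick way to inspect its structure.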
  • 21. A data frame is used for storing data tables. It is a list of vectors of equal length. § A data frame is a table, or two-dimensional array-like structure, in which each column contains measurements on one variable, and each row contains one case. As we shall see, a "case" is not necessarily the same as an experimental subject or unit, although they are often the same.
  • 22. Combine a list of data frames into a single data frame and add a column with the list index. (Each data frame is a list of vectors of equal length.)
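Both data frame slides can be illustrated in a few lines of base R. This is a sketch under assumptions: the list `runs`, its column names, and the new column name `list_index` are hypothetical, and `do.call(rbind, ...)` is only one of several ways to stack data frames.

```r
# A data frame is a list whose elements are vectors of equal length.
df <- data.frame(gene = c("geneA", "geneB"), count = c(10L, 25L))
is.list(df)    # TRUE
lengths(df)    # every column has the same length

# Hypothetical input: a list of data frames that share the same columns.
runs <- list(
  data.frame(gene = c("geneA", "geneB"), count = c(10L, 25L)),
  data.frame(gene = c("geneA", "geneC"), count = c(7L, 3L))
)

# Tag each data frame with its position in the list, then stack the rows.
tagged   <- Map(function(d, i) cbind(d, list_index = i), runs, seq_along(runs))
combined <- do.call(rbind, tagged)
print(combined)
```

The `list_index` column records which element of the list each row came from, so the origin of every row is still known after combining.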
  • 24. RNA-seq with R Demo: easyRNAseq source("c:/windows/myname/rna_seq_tutorial.R") source("/vol/maarten/rna_seq_tutorial2.R") http://bioscholar.com/genomics/bioconductor-packages-analysis-rna-seq-data/
  • 27. Topics: use R console and R command line
  • 28. Topics: use R console and R command line
  • 29. Topics: use R console and R command line
  • 30. Topics: use R console and R command line
  • 31. Topics: use R console and R command line
  • 32. Topics § What is R § What is Bioconductor § What is RNAseq
  • 34. Sequencing by synthesis § Intro to Sequencing by Synthesis: § https://www.youtube.com/watch?v=HMyCqWhwB8E
  • 35. FASTQ read with 50nt in Illumina format (ASCII_BASE=33). There are always four lines per read.
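The four-line layout and the ASCII_BASE=33 quality encoding can be checked by hand in base R. The record below is invented and shortened to 8 nt for readability; it is not the 50 nt read shown on the slide.

```r
# A made-up FASTQ record (shortened to 8 nt for readability).
record <- c("@read1",
            "ACGTACGT",
            "+",
            "IIIIHH#!")

# The four lines are, in order: identifier, sequence, separator, qualities.
names(record) <- c("id", "sequence", "separator", "quality")

# With ASCII_BASE=33, Phred score = ASCII code of the character minus 33.
phred <- utf8ToInt(record[["quality"]]) - 33L
print(phred)

# Sequence and quality strings must have the same length.
stopifnot(nchar(record[["sequence"]]) == length(phred))
```

In this encoding `I` corresponds to a Phred score of 40 and `!` to 0.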
  • 37. Paired end: read 1 in one fastq file
  • 38. Paired end: read 2 in another fastq file
  • 39. Numerous possible analysis strategies § There is no one ‘correct’ way to analyze RNA-seq data § Two major branches • Direct alignment of reads (spliced or unspliced) to genome or transcriptome • Assembly of reads followed by alignment* *Assembly is the only option when working with a creature with no genome sequence; alignment of contigs may be to ESTs, cDNAs etc. or transcriptome. Image from Haas & Zody, 2010
  • 42. Pile up sequences to reference genome
  • 43. SAM format: what are sam/bam files http://biobits.org/samtools_primer.html
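To make the SAM format concrete, a single alignment line can be taken apart in base R. This is only a sketch: the alignment line is invented, and in practice a Bioconductor package would normally read SAM/BAM files. The 11 field names are the mandatory columns defined in the SAM specification.

```r
# A made-up alignment line; real files come from an aligner.
sam_line <- paste("read1", "99", "chr1", "1000", "60", "8M", "=", "1200", "208",
                  "ACGTACGT", "IIIIHH#!", sep = "\t")

# The SAM specification defines 11 mandatory tab-separated fields.
field_names <- c("QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
                 "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL")
fields <- strsplit(sam_line, "\t", fixed = TRUE)[[1]]
aln <- setNames(as.list(fields[1:11]), field_names)

# FLAG is a bit field; test individual bits with bitwAnd().
flag <- as.integer(aln$FLAG)
is_paired      <- bitwAnd(flag, 1L)  != 0   # 0x1: read is paired
is_proper_pair <- bitwAnd(flag, 2L)  != 0   # 0x2: mapped in a proper pair
is_reverse     <- bitwAnd(flag, 16L) != 0   # 0x10: read on the reverse strand
is_first       <- bitwAnd(flag, 64L) != 0   # 0x40: first read of the pair
print(c(paired = is_paired, proper_pair = is_proper_pair,
        reverse = is_reverse, first = is_first))
```

A BAM file holds the same records in compressed binary form.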
  • 45. RNA sequencing: abundance comparisons between two or more conditions / phenotypes. Workflow figure: samples of interest (Condition 1, normal tissue; Condition 2, diseased tissue) → isolate RNAs → generate cDNA, fragment, size select, add linkers → sequence ends (100s of millions of paired reads, 10s of billions of bases of sequence) → map to genome, transcriptome, and predicted exon junctions → downstream analysis
  • 46. Compare two samples for abundance differences
  • 48. Genes have ‘structure’, solve by mapping § This leads to, for example, analysis of intron-exon structure
  • 51. Common analysis goals of RNA-Seq analysis (what can you ask of the data?) § Gene expression and differential expression § Alternative expression analysis § Transcript discovery and annotation § Allele specific expression • Relating to SNPs or mutations § Mutation discovery § Fusion detection § RNA editing
  • 52. Back to the demo § Introduction to RNA sequencing § Rationale for RNA sequencing (versus DNA sequencing) § Hands on tutorial
  • 53. RNA-seq with R Demo: easyRNAseq source("c:/windows/myname/rna_seq_tutorial.R") source("/vol/maarten/rna_seq_tutorial2.R") http://bioscholar.com/genomics/bioconductor-packages-analysis-rna-seq-data/
  • 55. DESeq and DESeq2 § Method based on the negative binomial distribution, with variance and mean linked by local regression § DESeq2: § No demo scripts available yet: § http://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf
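The link between variance and mean in a negative binomial model can be shown with a short simulation. This sketch assumes the common dispersion parameterization, variance = mu + alpha * mu^2, and uses base R only. The values of `mu` and `alpha` are hypothetical, and the code does not reproduce the local regression that DESeq uses to estimate dispersion.

```r
# Negative binomial variance under the dispersion parameterization:
#   var = mu + alpha * mu^2
nb_variance <- function(mu, alpha) mu + alpha * mu^2

mu    <- 100    # hypothetical mean count for one gene
alpha <- 0.2    # hypothetical dispersion

# rnbinom() uses size = 1 / alpha for this parameterization.
set.seed(1)
counts <- rnbinom(10000, mu = mu, size = 1 / alpha)

print(c(expected = nb_variance(mu, alpha), simulated = var(counts)))
# A Poisson model would give a variance equal to the mean (100 here).
```

The simulated variance is far above the mean, which is the overdispersion that motivates the negative binomial model over a Poisson model for count data.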
  • 56. The empirical frequency distribution of the hybridization signal intensity values for Affymetrix microarray hybridization data for normal yeast cell genes/ORFs (Jelinsky and Samson 1999). Kuznetsov V A et al. Genetics 2002;161:1321-1332 Copyright © 2002 by the Genetics Society of America
  • 57. Empirical relative frequency distributions of the gene expression levels. Kuznetsov V A et al. Genetics 2002;161:1321-1332 Copyright © 2002 by the Genetics Society of America
  • 60. Empirical (black dots) and fitted (red lines) dispersion values plotted against the mean of the normalised counts.
  • 61. Plot of normalised mean versus log2 fold change for the contrast untreated versus treated.
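The two axes of such a plot can be computed by hand. The sketch below uses invented normalised counts and assumes the fold change is taken as treated over untreated; it is not the seminar's data or the DESeq plotting code.

```r
# Hypothetical normalised counts for four genes in two conditions.
untreated <- c(geneA = 100, geneB = 50, geneC = 8, geneD = 400)
treated   <- c(geneA = 200, geneB = 50, geneC = 2, geneD = 100)

# x axis: mean of the normalised counts; y axis: log2 fold change.
base_mean <- (untreated + treated) / 2
log2_fc   <- log2(treated / untreated)

print(data.frame(base_mean, log2_fc))

# plot(base_mean, log2_fc, log = "x") would draw the points.
```

A log2 fold change of 1 means a doubling, 0 means no change, and -2 means a four-fold decrease.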
  • 62. Histogram of p-values from the call to nbinomTest.
  • 63. MvA plot for the contrast “treated” vs. “untreated”, using two treated and only one untreated sample.
  • 64. Heatmaps showing the expression data of the 30 most highly expressed genes
  • 65. Heatmap showing the Euclidean distances between the samples as calculated from the variance stabilising transformation of the count data.
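Sample-to-sample Euclidean distances can be computed with `dist()` in base R. In this sketch the matrix `expr` is a small invented stand-in for variance-stabilised values; the transformation itself is not performed here.

```r
# Hypothetical transformed expression values: genes in rows, samples in columns.
expr <- matrix(c(5, 5, 9,
                 2, 2, 6,
                 7, 8, 7),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("untreated1", "untreated2", "treated1")))

# dist() compares rows, so transpose to compare samples.
sample_dist <- as.matrix(dist(t(expr), method = "euclidean"))
print(sample_dist)

# heatmap(sample_dist) would draw the distance matrix as a heatmap.
```

Here the two untreated samples are close to each other and both are far from the treated sample, which is the pattern a sample-distance heatmap is meant to reveal.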
  • 66. Biological effects of condition and libType
  • 67. Mean expression versus log2 fold change plot. Significant hits (at padj<0.1) are coloured in red.
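The `padj` threshold refers to p-values adjusted for multiple testing. The sketch below assumes the Benjamini-Hochberg adjustment, which is available in base R through `p.adjust()`; the p-values themselves are invented.

```r
# Hypothetical raw p-values for five genes.
pval <- c(geneA = 0.001, geneB = 0.01, geneC = 0.02, geneD = 0.2, geneE = 0.5)

# Benjamini-Hochberg adjustment controls the false discovery rate.
padj <- p.adjust(pval, method = "BH")

# Genes that would be coloured red at the padj < 0.1 threshold.
significant <- names(padj)[padj < 0.1]
print(significant)
```

Filtering on `padj` rather than the raw p-value keeps the expected fraction of false positives among the reported hits below the chosen threshold.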
  • 68. Per-gene dispersion estimates (shown by points) and the fitted mean-dispersion function (red line).
  • 69. Differential exon usage § Detecting spliced isoform usage by exon-level expression analysis
  • 71. Expression estimates from a call to testForDEU. Shown in red is the exon that showed significant differential exon usage.
  • 72. Normalized counts. As in previous Figure, with normalized count values of each exon in each of the samples.
  • 73. Estimated effects, but after subtraction of overall changes in gene expression.
  • 74. Dependence of dispersion on the mean
  • 76. Distributions of fold changes of exon usage
  • 78. Resources: RNA-Seq workflow, gene-level exploratory analysis and differential expression
  • 80. Outline § Introduction to RNA sequencing § Rationale for RNA sequencing (versus DNA sequencing) § Hands on tutorial § http://swcarpentry.github.io/r-novice-inflammation/ § http://swcarpentry.github.io/r-novice-inflammation/02-func-R.html § http://www.bioconductor.org/help/workflows/ § http://www.bioconductor.org/packages/release/data/experiment/html/parathyroidSE.html § http://www.bioconductor.org/help/workflows/rnaseqGene/
  • 81. About Bioconductor. High-throughput sequence analysis with R and Bioconductor: http://www.bioconductor.org/help/course-materials/2013/useR2013/Bioconductor-tutorial.pdf http://bioconductor.org/packages/2.13/data/experiment/vignettes/RnaSeqTutorial/inst/doc/RnaSeqTutorial.pdf Also helpful: http://www.bioconductor.org/help/course-materials/2002/Summer02Course/Labs/basics.pdf