Workshop on:
“Next Generation Sequencing: The Basic Principles and Data Analysis of RNA-Seq”
Why Transcriptome?
Why RNA-Seq?
ENCODE answers….
Mohammad Hossein Banabazi
Scientific Board
Department of Biotechnology
Animal Science Research Institute of IRAN (ASRI)
15-17 April 2014
Concepts
Systems Biology
Bioinformatics
Omics Data
20 June 2016
Systems Biology is about integrating data from different
sources to provide a more comprehensive answer to a
given biological question
-ome           Field of study (-omics)          Collection of
Exome          Exomics                          Exons in a genome
Genome         Genomics (Classical genetics)    Genes (DNA sequences/Chromosomes)
Interactome    Interactomics                    All interactions
Metabolome     Metabolomics                     Metabolites
Metagenome     Metagenomics                     Genetic material found in an environmental sample
Phenome        Phenomics                        Phenotypes
Proteome       Proteomics                       Proteins
Transcriptome  Transcriptomics                  mRNA transcripts
Omics Data
[Figure: Network Dynamics, susceptible vs. resistant]
[Figure: Network Dynamics, resistant vs. susceptible, 24 most offending]
The Encyclopedia of DNA Elements (ENCODE) is a public research project
launched by the US National Human Genome Research Institute (NHGRI) in
September 2003. Intended as a follow-up to the Human Genome
Project, the ENCODE project aims to identify all functional
elements in the human genome.
The project involves a worldwide consortium of research groups, and data
generated from this project can be accessed through public databases.
Why Transcriptome?
“ENCODE has played an important role in changing our
concept of the gene.”
A decade-long project, the Encyclopedia of DNA
Elements (ENCODE), has found that 80% of the human
genome serves some purpose, biochemically speaking.
Pennisi, E. (2012) ENCODE Project Writes Eulogy for Junk DNA, Science, 337, 1159-1161.
Why Transcriptome?
Pennisi, E. (2012) ENCODE Project Writes Eulogy for Junk DNA, Science, 337, 1159-1161.
The ENCODE effort has revealed that a gene’s regulation
is far more complex than previously thought, being
influenced by multiple stretches of regulatory DNA
located both near and far from the gene itself and by
strands of RNA not translated into proteins, so-called
noncoding RNA.
Why Transcriptome?
ENCODE has played an important role in changing our
concept of the gene. As a result of ENCODE, the
fundamental unit of the genome and the basic unit of
heredity should be the transcript—the piece of RNA
decoded from DNA—and not the gene.
Pennisi, E. (2012) ENCODE Project Writes Eulogy for Junk DNA, Science, 337, 1159-1161.
What’s RNA-Seq?
A digital measure of the presence and prevalence of
transcripts, in contrast to the analog-style signals obtained
from fluorescent dye-based microarrays.
(Nature Methods, Vol. 5, No. 7, July 2008)
What’s RNA-Seq?
RNA-Seq refers to the method of using Next-Generation Sequencing (NGS) technology to
measure RNA levels.
1. Extract all mRNA
2. Prepare a library of cDNA fragments
3. Sequence fragments
Example reads: AACGTT, CTAACG, TTAGCA, ACCGAC, ATGGCA, TTGTCA, CGCATG, GTCACT
Align Sequences to Genome
[Figure: the eight example reads aligned along Gene A, Gene B, Gene C, and Gene D]
Gene ID   Sample1
A         3
B         3
C         0
D         2
For a given gene, the number of reads aligned to the gene measures its expression
level.
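The counting idea on this slide can be sketched in a few lines of Python. The read-to-gene assignments below are hypothetical: the slide does not state which read aligns to which gene, so they are chosen only to reproduce the slide's totals (A 3, B 3, C 0, D 2). The function name is also illustrative.

```python
from collections import Counter

# Hypothetical alignment result: each read paired with the gene it aligned to.
# The slide does not say which read maps to which gene; this assignment is
# illustrative and only reproduces the slide's per-gene totals.
ALIGNMENTS = [
    ("TTGTCA", "A"), ("CGCATG", "A"), ("GTCACT", "A"),
    ("TTAGCA", "B"), ("ACCGAC", "B"), ("ATGGCA", "B"),
    ("AACGTT", "D"), ("CTAACG", "D"),
]
GENES = ["A", "B", "C", "D"]


def count_reads_per_gene(alignments, genes):
    """Return {gene: number of reads aligned to it}, keeping zero counts."""
    tally = Counter(gene for _read, gene in alignments)
    return {gene: tally.get(gene, 0) for gene in genes}


counts = count_reads_per_gene(ALIGNMENTS, GENES)

# Print the count table in the same layout as the slide.
print("Gene ID\tSample1")
for gene in GENES:
    print(f"{gene}\t{counts[gene]}")
```

Genes with no aligned reads (Gene C here) are kept in the table with a count of zero rather than dropped, so every sample reports the same set of genes.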
Some Advantages of RNA-Seq over Microarrays
• Microarrays measure only genes corresponding to
predetermined probes on a microarray, while RNA-Seq
measures any transcripts in a sample.
• With RNA-Seq, there is no need to identify probes prior to
measurement or to build a microarray.
• RNA-Seq provides count data, which may be closer, at least in
principle, to the amount of mRNA produced by a gene than
the fluorescence measures produced with microarray
technology.
Some Advantages of RNA-Seq over Microarrays (continued)
• RNA-Seq provides information about transcript sequence in
addition to information about transcript abundance.
• With RNA-Seq, it is possible to measure separately the
expression of different transcripts that would be difficult to
distinguish with microarray technology because of
cross-hybridization.
• Sequence information also permits the identification of
alternative splicing, allele-specific expression, single
nucleotide polymorphisms (SNPs), and other forms of
sequence variation.
Further Reading:
• Flintoft, L. (2008) Transcriptomics: Digging deep with RNA-Seq, Nat Rev Genet, 9, 568.
• Haas, B.J. and Zody, M.C. (2010) Advancing RNA-seq analysis, Nature Biotech., 28, 421-423.
• Levin, J., et al. (2010) Development and evaluation of RNA-seq methods, Genome Biology, 11, P26.
• Marguerat, S. and Bähler, J. (2010) RNA-seq: from technology to biology, Cell. Mol. Life Sci., 67, 569-579.
• Mortazavi, A., et al. (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq, Nat Methods, 5, 621-628.
• Wang, Z., Gerstein, M. and Snyder, M. (2009) RNA-seq: a revolutionary tool for transcriptomics, Nature Rev. Genet., 10, 57-63.
• Wilhelm, B.T., et al. (2010) Defining transcribed regions using RNA-seq, Nat Protoc, 5, 255-266.
RNA-Seq Databases
Converting SRA format to FASTQ files
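The slide text does not record which tool was used for this step. As one possible approach, and purely as an assumption, the sketch below uses `fasterq-dump` from the NCBI SRA Toolkit, wrapped in Python. The accession `SRRXXXXXXX` is a placeholder, and the function names and the `fastq` output directory are illustrative choices, not taken from the workshop.

```python
import shutil
import subprocess


def build_fasterq_dump_command(accession, outdir="fastq"):
    """Build the argument list for SRA Toolkit's fasterq-dump (assumed tool)."""
    # --split-files writes paired-end mates to separate FASTQ files;
    # --outdir sets the directory that receives the FASTQ output.
    return ["fasterq-dump", accession, "--split-files", "--outdir", outdir]


def convert_sra_to_fastq(accession, outdir="fastq"):
    """Run the conversion if fasterq-dump is installed; return True on success."""
    command = build_fasterq_dump_command(accession, outdir)
    if shutil.which(command[0]) is None:
        # Tool not installed: report the command instead of failing.
        print("fasterq-dump not found on PATH; would run:", " ".join(command))
        return False
    subprocess.run(command, check=True)
    return True


# "SRRXXXXXXX" is a placeholder accession, not one taken from the slides.
example_command = build_fasterq_dump_command("SRRXXXXXXX")
print(" ".join(example_command))
```

Calling `convert_sra_to_fastq` with a real run accession would perform the conversion on a machine where the SRA Toolkit is installed. Building the command separately from running it makes it easy to inspect before any download starts.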
20 June 201627
20 June 201628
20 June 201629
20 June 201630
20 June 201631
20 June 201632
20 June 201633
Reference Genomes
Special Thanks to:
Your questions are welcome.
