Differential Gene Expression in RNA-Seq Data for Oral Squamous Cell
Carcinoma Using Bioconductor
19-Apr-15 1
By:
Kasturi P Chandwadkar
BBI 8th sem
BI-12
Overview
• Introduction
• Methodology
• Results
• Conclusion
• References
INTRODUCTION
• Oral squamous cell carcinoma (OSCC) accounts for 90% of oral
cancers, and the risk increases with age.
• Techniques for assessing and quantifying RNA by high-throughput
sequencing are collectively known as “RNA-Seq”.
• RNA-Seq has been used to profile the complex transcriptomes of
mammalian samples, including human embryonic kidney and B-cells,
mouse embryonic stem cells, blastomeres, and different mouse tissues.
ADVANTAGES OF RNA-SEQ
• One advantage of RNA-Seq over other profiling technologies
such as microarrays is the ability to query all transcripts
without prior knowledge of the location and structure of genes.
• RNA-Seq is not limited to detecting transcripts that
correspond to existing genomic sequence.
• RNA-Seq has very low background signal because DNA
sequences can be unambiguously mapped to unique regions of
the genome.
R AND BIOCONDUCTOR PACKAGES
• R (http://cran.at.r-project.org) is a comprehensive statistical
environment and programming language for professional data
analysis and graphical display.
• Bioconductor (http://www.bioconductor.org/) provides many
additional R packages for statistical data analysis in different
life science areas, such as tools for microarray, sequence and
genome analysis.
• Packages used for differential gene expression:
• Biostrings
• biomaRt
• baySeq
• DESeq
• edgeR
Methodology
• RETRIEVAL OF NGS DATA
• The RNA-Seq data (FASTQ files) for oral squamous cell carcinoma were taken
from the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo),
accession number GSE20116.
• MAPPING OF GENOMIC READS
• The short reads are mapped (aligned) to the reference genome using Bowtie.
• GENERATING COUNT FILE
• A count file is a matrix in which each count is the number of reads
mapped to a genomic region of the reference genome, and each Id is the
annotation of that genomic region.
• GETTING DIFFERENTIALLY EXPRESSED GENES
• edgeR
• DESeq
• baySeq
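The count file described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure used in this work: the sample names and read-to-gene assignments are hypothetical, and each aligned read is assumed to have already been assigned to exactly one annotated region.

```python
from collections import Counter


def build_count_matrix(assignments_by_sample):
    """Return (gene_ids, sample_names, rows), where rows[i][j] is the number
    of reads from sample j that were assigned to gene i."""
    sample_names = sorted(assignments_by_sample)
    per_sample = {s: Counter(assignments_by_sample[s]) for s in sample_names}
    gene_ids = sorted({g for counts in per_sample.values() for g in counts})
    # A Counter returns 0 for genes that a sample never hit.
    rows = [[per_sample[s][g] for s in sample_names] for g in gene_ids]
    return gene_ids, sample_names, rows


# Hypothetical toy input: one list of gene ids per sample, one entry per read.
toy_assignments = {
    "tumor": ["KRT36", "CA3", "CA3", "TNNC2"],
    "normal": ["KRT36", "KRT36", "KRT36", "CA3"],
}

gene_ids, sample_names, rows = build_count_matrix(toy_assignments)
print("Id", *sample_names, sep="\t")
for gene, row in zip(gene_ids, rows):
    print(gene, *row, sep="\t")
```

Each row is one annotated region (the Id) and each column is one sample, which is the layout that count-based packages such as edgeR and DESeq take as input.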
RNA-Seq analysis pipeline for detecting DGE
SHORT READS
ALIGN READS TO REFERENCE GENOME
PREPARE COUNT FILE FROM SAM FILE
GET DIFFERENTIAL GENE EXPRESSION
edgeR, DESeq, baySeq
List of DEG from each package
Venn diagram of DEG from three packages
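The last step of the pipeline, comparing the three DEG lists, comes down to set operations. The sketch below assumes each package's output has been reduced to a mapping from gene id to a significance value; the gene ids `g1` to `g5` and their values are hypothetical, and the 0.01 cutoff is used only as an example threshold.

```python
def significant_genes(results, cutoff=0.01):
    """Return the set of gene ids whose significance value is below cutoff."""
    return {gene for gene, value in results.items() if value < cutoff}


def venn_regions(named_sets):
    """Split exactly three named gene sets into the seven Venn regions."""
    (n1, s1), (n2, s2), (n3, s3) = named_sets.items()
    return {
        (n1, n2, n3): s1 & s2 & s3,
        (n1, n2): (s1 & s2) - s3,
        (n1, n3): (s1 & s3) - s2,
        (n2, n3): (s2 & s3) - s1,
        (n1,): s1 - s2 - s3,
        (n2,): s2 - s1 - s3,
        (n3,): s3 - s1 - s2,
    }


# Hypothetical per-package results (gene id -> significance value).
toy_results = {
    "edgeR": {"g1": 0.001, "g2": 0.004, "g3": 0.2, "g4": 0.003},
    "DESeq": {"g1": 0.002, "g2": 0.03, "g4": 0.008, "g5": 0.009},
    "baySeq": {"g1": 0.0005, "g4": 0.5, "g5": 0.001},
}

deg_sets = {name: significant_genes(res) for name, res in toy_results.items()}
regions = venn_regions(deg_sets)
for packages, genes in regions.items():
    print(" & ".join(packages), sorted(genes))
```

Keeping the regions disjoint (each gene lands in exactly one region) matches how the areas of a Venn diagram are drawn.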
Results
• DATA EXPLORATION
Outlier in the data
edgeR
TOP 5 DIFFERENTIALLY EXPRESSED GENES
Gene id    logFC        P-value
KRT36      -8.103353    7.842049e-15
SFTPB      -8.120520    2.535246e-14
CA3        -6.443105    1.804193e-13
TNNC2      -6.431288    3.040273e-13
MAGEA11     8.881312    1.124744e-12
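To read the logFC column on a linear scale, the values can be converted back to fold changes. The sketch below assumes logFC is a base-2 logarithm, which is the edgeR convention; the slides do not state which group is the numerator, so the output only reports "higher" or "lower" relative to the reference group.

```python
def fold_change(log2_fc):
    """Convert a log2 fold change to a linear fold change."""
    return 2.0 ** log2_fc


# logFC values copied from the edgeR table above.
edger_top5 = {
    "KRT36": -8.103353,
    "SFTPB": -8.120520,
    "CA3": -6.443105,
    "TNNC2": -6.431288,
    "MAGEA11": 8.881312,
}

for gene, lfc in edger_top5.items():
    fc = fold_change(lfc)
    magnitude = fc if fc >= 1 else 1 / fc  # report changes as "N-fold"
    direction = "higher" if lfc > 0 else "lower"
    print(f"{gene}: about {magnitude:.0f}-fold {direction}")
```

A logFC of about -8.1, as for KRT36, therefore corresponds to expression lower by a factor of a few hundred.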
DESeq
TOP 5 UPREGULATED GENES
Gene id    logFC       p-value
FBP2       Infinite    1.576300e-05
TUSC5      Infinite    9.160142e-04
UTS2R      Infinite    1.520430e-03
ADIPOQ     7.231394    1.444721e-03
C6         7.162190    9.311805e-05
TOP 5 DOWNREGULATED GENES
Gene id    logFC        p-value
EMX1       -Infinite    1.765941e-03
VTCN1      7.289467     4.408178e-07
HOXD11     5.504204     2.041803e-04
HOXC8      5.503361     1.621344e-04
C5orf38    5.428227     9.407919e-05
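The "Infinite" entries in the logFC column most likely arise when a gene has a mean count of zero in one of the two conditions, so that the ratio inside the logarithm is undefined or unbounded. The sketch below illustrates that behaviour; the function name and the example means are hypothetical, and this is not the DESeq implementation.

```python
import math


def log2_fold_change(mean_a, mean_b):
    """Return log2(mean_b / mean_a), handling zero means explicitly."""
    if mean_a == 0 and mean_b == 0:
        return math.nan    # no expression in either condition
    if mean_a == 0:
        return math.inf    # expressed only in condition B
    if mean_b == 0:
        return -math.inf   # expressed only in condition A
    return math.log2(mean_b / mean_a)


print(log2_fold_change(2.0, 8.0))    # ordinary case: four-fold increase
print(log2_fold_change(0.0, 12.5))   # zero mean in A gives +infinity
print(log2_fold_change(8.0, 0.0))    # zero mean in B gives -infinity
```

Genes with an infinite logFC cannot be ranked by effect size against genes with finite values, so it is worth inspecting their raw counts as well as their p-values.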
baySeq
TOP 5 DIFFERENTIALLY EXPRESSED GENES
Gene id     Likelihood    FDR
RRAGD       0.9987850     0.001214965
TGFBR3      0.9981198     0.001547566
PYGM        0.9973711     0.001908003
SH3BGRL2    0.9973000     0.002106007
PLA2G2A     0.9972789     0.002229018
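The FDR column in the baySeq table appears consistent with a running mean of (1 - likelihood) over the ranked gene list. The sketch below checks that reading against the five rows shown; it assumes this is how the column was derived and that the listed likelihoods are rounded, so agreement is only expected to about six decimal places.

```python
def expected_fdr(likelihoods):
    """Running mean of (1 - likelihood) down a list ranked by likelihood."""
    fdrs = []
    running_total = 0.0
    for k, likelihood in enumerate(likelihoods, start=1):
        running_total += 1.0 - likelihood
        fdrs.append(running_total / k)
    return fdrs


# Values copied from the baySeq table above.
likelihoods = [0.9987850, 0.9981198, 0.9973711, 0.9973000, 0.9972789]
reported_fdr = [0.001214965, 0.001547566, 0.001908003, 0.002106007, 0.002229018]

for computed, reported in zip(expected_fdr(likelihoods), reported_fdr):
    print(f"computed {computed:.9f}   reported {reported:.9f}")
```

Under this reading, the FDR at rank k is the expected proportion of false discoveries among the top k genes.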
Venn Diagram Of DGE With P-value Less Than 0.01
Conclusion
• We have demonstrated that our DGE method can be
successfully applied to RNA-Seq samples from tumor and
matched normal tissues.
• Using three different statistical methods to infer
differential gene expression in oral squamous cell carcinoma
(OSCC), we found 215 genes common to all three packages.
• 1054 genes are common to edgeR and DESeq, 217 are
common to DESeq and baySeq, and 278 are common
to edgeR and baySeq.
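The overlap counts above can be turned into the pairwise-only regions of the Venn diagram with simple subtraction. The sketch below assumes each pairwise count includes the 215 genes shared by all three packages; the slides do not state this explicitly, so the derived numbers hold only under that assumption.

```python
def pairwise_only(all_three, pairwise_counts):
    """Subtract the three-way overlap from each inclusive pairwise overlap."""
    result = {}
    for pair, count in pairwise_counts.items():
        if count < all_three:
            raise ValueError(f"{pair}: pairwise count is below the three-way overlap")
        result[pair] = count - all_three
    return result


# Counts taken from the conclusion above.
common_to_all = 215
pairwise_counts = {
    ("edgeR", "DESeq"): 1054,
    ("DESeq", "baySeq"): 217,
    ("edgeR", "baySeq"): 278,
}

for pair, count in pairwise_only(common_to_all, pairwise_counts).items():
    print(" & ".join(pair), "only:", count)
```

Under this assumption, edgeR and DESeq share 839 genes that baySeq does not report, while the other two pairs share 2 and 63 such genes.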
The table below lists some of the differentially expressed genes in the
cancer sample that may be related to cancer.
Gene id    Description
KRT36      Keratin, type I cuticular
ADIPOQ     Adiponectin, C1Q and collagen domain containing
PLA2G2A    Phospholipase A2, group IIA (platelets, synovial fluid)
CEACAM7    Carcinoembryonic antigen-related cell adhesion molecule
SPINK7     Serine peptidase inhibitor, Kazal type 7 (putative); esophagus cancer related gene 22
ALDH1A2    Aldehyde dehydrogenase 1 family, member A2
ENDOU      Endonuclease, polyU-specific
ANGPTL1    Angiopoietins
GDF10      Growth differentiation factor 10
TUSC5      Tumor suppressor candidate 5
REFERENCES
• [1] Nature, vol. 455, p. 847, 2008. Published online 15 October 2008.
doi:10.1038/455847a
• [2] M. D. Robinson and A. Oshlack, “A scaling normalization method for
differential expression analysis of RNA-seq data.”
• [3] B. B. Tuch, R. R. Laborde, X. Xu, J. Gu, C. B. Chung, and C. K. Monighetti,
“Tumor transcriptome sequencing reveals allelic expression imbalances
associated with copy number alterations.”
• [4] B. Langmead, C. Trapnell, M. Pop, and S. L. Salzberg, “Ultrafast and
memory-efficient alignment of short DNA sequences to the human genome.”
• [5] V. Costa, A. Casamassimi, and A. Ciccodicola, “Nutritional genomics era:
opportunities toward a genome-tailored nutritional regimen,” The Journal of
Nutritional Biochemistry, vol. 21, no. 6, pp. 457–467, 2010.
• [6] E. Birney, J. A. Stamatoyannopoulos, A. Dutta, et al., “Identification and analysis
of functional elements in 1% of the human genome by the ENCODE pilot project,”
Nature, vol. 447, no. 7146, pp. 799–816, 2007.
• [7] F. S. Collins, E. S. Lander, J. Rogers, and R. H. Waterson, “Finishing the
euchromatic sequence of the human genome,” Nature, vol. 431, no. 7011, pp.
931–945, 2004.
• [8] International Human Genome Sequencing Consortium, “A haplotype map of
the human genome,” Nature, vol. 437, no. 7063, pp. 1299–1320, 2005.