This document discusses differential expression analysis in RNA-Seq. It begins with an introduction that defines key concepts like expression levels, sequencing depth, and differential expression. It then covers normalization methods to account for biases in RNA-Seq data. The main method discussed is NOISeq, a non-parametric approach that does not require replicates. NOISeq compares signal distributions between conditions to noise distributions within conditions to identify differentially expressed genes. The document concludes with exercises to run NOISeq on sample data.
RNA Sequence data analysis,Transcriptome sequencing, Sequencing steady state RNA in a sample is known as RNA-Seq. It is free of limitations such as prior knowledge about the organism is not required.
RNA-Seq is useful to unravel inaccessible complexities of transcriptomics such as finding novel transcripts and isoforms.
Data set produced is large and complex; interpretation is not straight forward.
AGRF in conjunction with EMBL Australia recently organised a workshop at Monash University Clayton. This workshop was targeted at beginners and biologists who are new to analysing Next-Gen Sequencing data. The workshop also aimed to provide users with a snapshot of bioinformatics and data analysis tips on how to begin to analyse project data. An introduction to RNA-seq data analysis was presented by AGRF Senior Bioinformatician Dr. Sonika Tyagi.
Presented: 1st August 2012
RNA Sequence data analysis,Transcriptome sequencing, Sequencing steady state RNA in a sample is known as RNA-Seq. It is free of limitations such as prior knowledge about the organism is not required.
RNA-Seq is useful to unravel inaccessible complexities of transcriptomics such as finding novel transcripts and isoforms.
Data set produced is large and complex; interpretation is not straight forward.
AGRF in conjunction with EMBL Australia recently organised a workshop at Monash University Clayton. This workshop was targeted at beginners and biologists who are new to analysing Next-Gen Sequencing data. The workshop also aimed to provide users with a snapshot of bioinformatics and data analysis tips on how to begin to analyse project data. An introduction to RNA-seq data analysis was presented by AGRF Senior Bioinformatician Dr. Sonika Tyagi.
Presented: 1st August 2012
Today it is possible to obtain genome-wide transcriptome data from single cells using high-throughput sequencing (scRNA-seq). The main advantage of scRNA-seq is that the cellular resolution and the genome wide scope makes it possible to address issues that are intractable using other methods, e.g. bulk RNA-seq or single-cell RT-qPCR. However, to analyze scRNA-seq data, novel methods are required and some of the underlying assumptions for the methods developed for bulk RNA-seq experiments are no longer valid.
A workshop is intended for those who are interested in and are in the planning stages of conducting an RNA-Seq experiment. Topics to be discussed will include:
* Experimental Design of RNA-Seq experiment
* Sample preparation, best practices
* High throughput sequencing basics and choices
* Cost estimation
* Differential Gene Expression Analysis
* Data cleanup and quality assurance
* Mapping your data
* Assigning reads to genes and counting
* Analysis of differentially expressed genes
* Downstream analysis/visualizations and tables
RNA-seq for DE analysis: detecting differential expression - part 5BITS
Part 5 of the training sesson 'RNA-seq for differential expression analysis' considers the algorithm used for detecting differential expression between conditions. See http://www.bits.vib.be
Single cell analysis has exploded recently mainly due to the development of high-throughput technologies such as NGS. Single cell analysis is being pursued by researchers in many areas including developmental science, cancer, biomarker discovery and more. This presentation covers some of the recent applications from developed by QIAGEN customers.
Part 1 of RNA-seq for DE analysis: Defining the goalJoachim Jacob
First part of the training session 'RNA-seq for Differential expression' analysis. We explain how we can detect differential expression based on RNA-seq data. Interested in following this session? Please contact http://www.jakonix.be/contact.html
Today it is possible to obtain genome-wide transcriptome data from single cells using high-throughput sequencing (scRNA-seq). The main advantage of scRNA-seq is that the cellular resolution and the genome wide scope makes it possible to address issues that are intractable using other methods, e.g. bulk RNA-seq or single-cell RT-qPCR. However, to analyze scRNA-seq data, novel methods are required and some of the underlying assumptions for the methods developed for bulk RNA-seq experiments are no longer valid.
A workshop is intended for those who are interested in and are in the planning stages of conducting an RNA-Seq experiment. Topics to be discussed will include:
* Experimental Design of RNA-Seq experiment
* Sample preparation, best practices
* High throughput sequencing basics and choices
* Cost estimation
* Differential Gene Expression Analysis
* Data cleanup and quality assurance
* Mapping your data
* Assigning reads to genes and counting
* Analysis of differentially expressed genes
* Downstream analysis/visualizations and tables
RNA-seq for DE analysis: detecting differential expression - part 5BITS
Part 5 of the training sesson 'RNA-seq for differential expression analysis' considers the algorithm used for detecting differential expression between conditions. See http://www.bits.vib.be
Single cell analysis has exploded recently mainly due to the development of high-throughput technologies such as NGS. Single cell analysis is being pursued by researchers in many areas including developmental science, cancer, biomarker discovery and more. This presentation covers some of the recent applications from developed by QIAGEN customers.
Part 1 of RNA-seq for DE analysis: Defining the goalJoachim Jacob
First part of the training session 'RNA-seq for Differential expression' analysis. We explain how we can detect differential expression based on RNA-seq data. Interested in following this session? Please contact http://www.jakonix.be/contact.html
It contains information about- DNA Sequencing; History and Era sequencing; Next Generation Sequencing- Introduction, Workflow, Illumina/Solexa sequencing, Roche/454 sequencing, Ion Torrent sequencing, ABI-SOLiD sequencing; Comparison between NGS & Sangers and NGS Platforms; Advantages and Applications of NGS; Future Applications of NGS.
CD Genomics provides a fast, one-stop bacterial RNA sequencing solution from the quality control of sample to comprehensive data analysis. Please contact us for more information and a detailed quote.
DNA Fingerprinting & its techniques by Shiv Kalia (M.Pharma in Analytical Che...Shiv Kalia
DNA fingerprinting and below mention content widely cover in this presentation
History & Introduction of DNA fingerprinting
How was the first DNA fingerprint produced?
Types of DNA Based Markers
Polymerase Chain Reaction (PCR)
PCR based Methodology of DNA fingerprinting
Electrophoresis
Utility of DNA Based Markers
Various DNA Fingerprinting Techniques Advantages & Disadvantages
Authentication of Various Ayurvedic Herbs by DNA Fingerprinting
Advantages of DNA fingerprinting in Plants
Disadvantages of DNA fingerprinting in Plants
CONCLUSION
Course: Bioinformatics for Biomedical Research (2014).
Session: 4.1- Introduction to RNA-seq and RNA-seq Data Analysis.
Statistics and Bioinformatisc Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
A DNA microarray (also commonly known as DNA chip or biochip) is a collection of microscopic DNA spots attached to a solid surface.
The core principle behind microarrays is hybridization between two DNA strands, the property of complementary nucleic acid sequences to specifically pair with each other by forming hydrogen bonds between complementary nucleotide base pairs.
The study of the complete set of RNAs (transcriptome) encoded by the genome of a specific cell or organism at a specific time or under a specific set of conditions is called Transcriptomics.
Transcriptomics aims:
I. To catalogue all species of transcripts, including mRNAs, noncoding RNAs and small RNAs.
II. To determine the transcriptional structure of genes, in terms of their start sites, 5′ and 3′ ends, splicing patterns and other post-transcriptional modifications.
III. To quantify the changing expression levels of each transcript during development and under different conditions.
Similar to Differential expression in RNA-Seq (20)
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
GridMate - End to end testing is a critical piece to ensure quality and avoid...ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 6
Differential expression in RNA-Seq
1. Introduction
Normalization
Differential Expression
Differential Expression in RNA-Seq
Sonia Tarazona
starazona@cipf.es
Data analysis workshop for massive sequencing data
Granada, June 2011
Sonia Tarazona Differential Expression in RNA-Seq
3. Introduction Some questions
Normalization Some definitions
Differential Expression RNA-seq expression data
Outline
1 Introduction
Some questions
Some definitions
RNA-seq expression data
2 Normalization
3 Differential Expression
Sonia Tarazona Differential Expression in RNA-Seq
4. Introduction Some questions
Normalization Some definitions
Differential Expression RNA-seq expression data
Some questions
How do we measure expression?
What is differential expression?
Experimental design in RNA-Seq
Can we used the same statistics as in microarrays?
Do I need any “normalization”?
Sonia Tarazona Differential Expression in RNA-Seq
5. Introduction Some questions
Normalization Some definitions
Differential Expression RNA-seq expression data
Some definitions
Expression level
RNA-Seq: The number of reads (counts) mapping to the biological feature of
interest (gene, transcript, exon, etc.) is considered to be linearly
related to the abundance of the target feature.
Microarrays: The abundance of each sequence is a function of the fluorescence
level recovered after the hybridization process.
Sonia Tarazona Differential Expression in RNA-Seq
6. Introduction Some questions
Normalization Some definitions
Differential Expression RNA-seq expression data
Some definitions
Sequencing depth: Total number of reads mapped to the genome. Library size.
Gene length: Number of bases.
Gene counts: Number of reads mapping to that gene (expression measurement).
Sonia Tarazona Differential Expression in RNA-Seq
7. Introduction Some questions
Normalization Some definitions
Differential Expression RNA-seq expression data
Some definitions
What is differential expression?
A gene is declared differentially expressed if an observed difference or change in
read counts between two experimental conditions is statistically significant, i.e.
whether it is greater than what would be expected just due to natural random
variation.
Statistical tools are needed to make such a decision by studying counts
probability distributions.
Sonia Tarazona Differential Expression in RNA-Seq
8. Introduction Some questions
Normalization Some definitions
Differential Expression RNA-seq expression data
RNA-Seq expression data
Experimental design
Pairwise comparisons: Only two experimental conditions or groups are to be
compared.
Multiple comparisons: More than two conditions or groups.
Replication
Biological replicates. To draw general conclusions: from samples to population.
Technical replicates. Conclusions are only valid for compared samples.
Sonia Tarazona Differential Expression in RNA-Seq
10. Introduction Why?
Normalization Methods
Differential Expression Examples
Why Normalization?
RNA-seq biases
Influence of sequencing depth: The higher sequencing depth, the higher counts.
Sonia Tarazona Differential Expression in RNA-Seq
11. Introduction Why?
Normalization Methods
Differential Expression Examples
Why Normalization?
RNA-seq biases
Dependence on gene length: Counts are proportional to the transcript length
times the mRNA expression level.
Sonia Tarazona Differential Expression in RNA-Seq
12. Introduction Why?
Normalization Methods
Differential Expression Examples
Why Normalization?
RNA-seq biases
Differences on the counts distribution among samples.
Sonia Tarazona Differential Expression in RNA-Seq
13. Introduction Why?
Normalization Methods
Differential Expression Examples
Why Normalization?
RNA-seq biases
Influence of sequencing depth: The higher sequencing depth, the higher counts.
Dependence on gene length: Counts are proportional to the transcript length
times the mRNA expression level.
Differences on the counts distribution among samples.
Options
1 Normalization: Counts should be previously corrected in order to minimize these
biases.
2 Statistical model should take them into account.
Sonia Tarazona Differential Expression in RNA-Seq
14. Introduction Why?
Normalization Methods
Differential Expression Examples
Some Normalization Methods
RPKM (Mortazavi et al., 2008): Counts are divided by the transcript length
(kb) times the total number of millions of mapped reads.
number of reads of the region
RPKM =
total reads × region length
1000000 1000
Upper-quartile (Bullard et al., 2010): Counts are divided by upper-quartile of
counts for transcripts with at least one read.
TMM (Robinson and Oshlack, 2010): Trimmed Mean of M values.
Quantiles, as in microarray normalization (Irizarry et al., 2003).
FPKM (Trapnell et al., 2010): Instead of counts, Cufflinks software generates
FPKM values (Fragments Per Kilobase of exon per Million fragments mapped)
to estimate gene expression, which are analogous to RPKM.
Sonia Tarazona Differential Expression in RNA-Seq
18. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
Differential Expression
Parametric approaches
Counts are modeled using known probability distributions such as Binomial, Poisson,
Negative Binomial, etc.
R packages in Bioconductor:
edgeR (Robinson et al., 2010): Exact test based on Negative Binomial
distribution.
DESeq (Anders and Huber, 2010): Exact test based on Negative Binomial
distribution.
DEGseq (Wang et al., 2010): MA-plots based methods (MATR and MARS),
assuming Normal distribution for M|A.
baySeq (Hardcastle et al., 2010): Estimation of the posterior likelihood of
differential expression (or more complex hypotheses) via empirical Bayesian
methods using Poisson or NB distributions.
Sonia Tarazona Differential Expression in RNA-Seq
19. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
Differential Expression
Non-parametric approaches
No assumptions about data distribution are made.
Fisher’s exact test (better with normalized counts).
cuffdiff (Trapnell et al., 2010): Based on entropy divergence for relative
transcript abundances. Divergence is a measurement of the ”distance”between
the relative abundances of transcripts in two difference conditions.
NOISeq (Tarazona et al., coming soon) −→ http://bioinfo.cipf.es/noiseq/
Sonia Tarazona Differential Expression in RNA-Seq
20. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
Differential Expression
Drawbacks of differential expression methods
Parametric assumptions: Are they fulfilled?
Need of replicates.
Problems to detect differential expression in genes with low counts.
Sonia Tarazona Differential Expression in RNA-Seq
21. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
Differential Expression
Drawbacks of differential expression methods
Parametric assumptions: Are they fulfilled?
Need of replicates.
Problems to detect differential expression in genes with low counts.
NOISeq
Non-parametric method.
No need of replicates.
Less influenced by sequencing depth or number of counts.
Sonia Tarazona Differential Expression in RNA-Seq
22. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq
Outline
Signal distribution. Computing changes in expression of each gene between the
two experimental conditions.
Noise distribution. Distribution of changes in expression values when comparing
replicates within the same condition.
Differential expression. Comparing signal and noise distributions to determine
differentially expressed genes.
Sonia Tarazona Differential Expression in RNA-Seq
23. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq
NOISeq-real
Replicates are available for each condition.
Compute M-D in noise by comparing each pair of replicates within the same
condition.
Sonia Tarazona Differential Expression in RNA-Seq
24. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq
NOISeq-real
Replicates are available for each condition.
Compute M-D in noise by comparing each pair of replicates within the same
condition.
NOISeq-sim
No replicates are available at all.
NOISeq simulates technical replicates for each condition. The replicates are
generated from a multinomial distribution taking the counts in the only sample
as the probabilities for the distribution.
Compute M-D in noise by comparing each pair of simulated replicates within the
same condition.
Sonia Tarazona Differential Expression in RNA-Seq
25. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq: Signal distribution
1 Calculate the expression of each gene at each experimental condition
(expressioni , for i = 1, 2).
Technical replicates: expressioni = sum(valuesi )
Biological replicates: expressioni = mean(valuesi )
No replicates: expressioni = valuei
2 Compute for each gene the statistics measuring changes in expression:
expression
M = log2 expression1
2
D = |expression1 –expression2 |
Sonia Tarazona Differential Expression in RNA-Seq
26. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq: Noise distribution
With replicates (NOISeq-real): For each condition, all the possible comparisons
among the available replicates are used to compute M and D values. All the
M-D values for all the comparisons, genes and for both conditions are pooled
together to create the noise distribution.
If the number of pairwise comparisons for a certain condition is higher than 30,
only 30 randomly selected comparisons are made.
Without replicates (NOISeq-sim): Replicates are simulated and the same
procedure is used to derive noise distribution.
Sonia Tarazona Differential Expression in RNA-Seq
27. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq: Differential expression
Probability for each gene of being differentially expressed: It is obtained by
comparing M-D values of that gene against noise distribution and computing the
number of cases in which the values in noise are lower than the values for signal.
A gene is declared as differentially expressed if this probability is higher than q.
The threshold q is set to 0.8 by default, since this value is equivalent to an odds
of 4:1 (the gene in 4 times more likely to be differentially expressed than not).
Sonia Tarazona Differential Expression in RNA-Seq
28. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq
Input
Data: datos1, datos2
Features length: long (only if length correction is to be applied)
Normalization: norm = {“rpkm”, “uqua”, “tmm”, “none”}; lc = “length
correction”
Replicates: repl = {“tech”, “bio”}
Simulation: nss = “number of replicates to be simulated”; pnr = “total counts
in each simulated replicate”; v = variability for pnr
Probability cutoff: q (≥ 0,8)
Others: k = 0,5
Sonia Tarazona Differential Expression in RNA-Seq
29. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
NOISeq
Output
Differential expression probability. For each feature, probability of being
differentially expressed.
Differentially expressed features. List of features names which are differentially
expressed according to q cutoff.
M-D values. For signal (between conditions and for each feature) and for noise
(among replicates within the same condition, pooled).
Sonia Tarazona Differential Expression in RNA-Seq
30. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
Exercises
Execute in an R terminal the code provided in exerciseDEngs.r
Reading data
simCount <- readData(file = "simCount.txt", cond1 = c(2:6),
cond2 = c(7:11), header = TRUE)
lapply(simCount, head)
depth <- as.numeric(sapply(simCount, colSums))
Differential Expression by NOISeq
NOISeq-real
res1noiseq <- noiseq(simCount[[1]], simCount[[2]], nss = 0, q = 0.8,
repl = ‘‘tech’’)
NOISeq-sim
res2noiseq <- noiseq(as.matrix(rowSums(simCount[[1]])),
as.matrix(rowSums(simCount[[2]])), q = 0.8, pnr = 0.2, nss = 5)
See the tutorial in http://bioinfo.cipf.es/noiseq/ for more information.
Sonia Tarazona Differential Expression in RNA-Seq
31. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
Some remarks
Experimental design is decisive to answer correctly your biological questions.
Differential expression methods for RNA-Seq data must be different to
microarray methods.
Normalization should be applied to raw counts, at least a library size correction.
Sonia Tarazona Differential Expression in RNA-Seq
32. Methods
Introduction
NOISeq
Normalization
Exercises
Differential Expression
Concluding
For Further Reading
Oshlack, A., Robinson, M., and Young, M. (2010) From RNA-seq reads to
differential expression results. Genome Biology , 11, 220+.
Review on RNA-Seq, including differential expression.
Auer, P.L., and Doerge R.W. (2010) Statistical Design and Analysis of RNA
Sequencing Data. Genetics, 185, 405-416.
Bullard, J. H., Purdom, E., Hansen, K. D., and Dudoit, S. (2010) Evaluation of
statistical methods for normalization and differential expression in mRNA-Seq
experiments. BMC Bioinformatics, 11, 94+.
Normalization methods (including Upper Quartile) and differential expression.
Tarazona, S., Garc´
ıa-Alcalde, F., Dopazo, J., Ferrer A., and Conesa, A.
(submitted) Differential expression in RNA-seq: a matter of depth.
Sonia Tarazona Differential Expression in RNA-Seq