Analyzing Gene Expression Data from Two Corn Strains

In this example, we analyzed a gene expression dataset (GSE16567) from two corn (Zea mays) strains: the
drought-tolerant line Han 21 and the drought-sensitive line Ye 478. In the study, each strain was subjected
to four watering regimes: moderate drought, severe drought, re-watering, and control. Transcriptome
expression for each treatment was measured using an Affymetrix Maize Genome Array, which provides
intensity scores that serve as proxies for gene expression. Below, we demonstrate how to
use Iris to explore patterns of expression across treatment groups.


Data structure: GSE16567, 24 samples (drought treatment groups) and 17,621 gene features (Affymetrix
intensity values in RMA format).


Exploring drought treatment groups in Sample Space networks.


In our first exploration of the data, we structured the data with the samples as rows and genes as
columns; in Iris terminology, this is the sample space format. In the resulting network, each node is a
collection of samples that are similar in their gene expression.
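
To make the two layouts concrete, here is a minimal sketch in plain Python. The sample names, probe names, and values are made up for illustration and are not taken from GSE16567. Sample space keeps samples as rows; transposing the matrix gives the gene space layout used later in this example.

```python
# Hypothetical toy data: rows are samples, columns are genes (sample space).
samples = ["Han21_control", "Han21_severe", "Ye478_severe"]
genes = ["probe_a", "probe_b", "probe_c", "probe_d"]
sample_space = [
    [7.1, 5.0, 9.2, 6.3],
    [7.0, 6.8, 9.1, 4.9],
    [6.9, 8.1, 9.3, 3.2],
]


def transpose(matrix):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


# Gene space: rows are genes, columns are samples.
gene_space = transpose(sample_space)

print(len(sample_space), "rows x", len(sample_space[0]), "columns")
print(len(gene_space), "rows x", len(gene_space[0]), "columns")
```

With the real data set, the sample space matrix would have 24 rows and 17,621 columns, and the gene space matrix the reverse.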




Figure 1. Network constructed using Norm Correlation metric and Principal and Secondary Metric Embedded SVD lenses. Red
depicts the Ye 478 strain (drought sensitive) and blue shades depict the Han 21 strain (drought resistant).




                                             Copyright 2013 Ayasdi Inc.
Using the Quick Analysis feature to generate the network above, we see that Iris quickly segregates
the Han 21 strain from the Ye 478 strain in an entirely unsupervised manner. Additionally, notice how
Iris readily distinguishes drought treatments within each strain.


Another view of the data is provided by changing one of the lenses used in network construction. This
new network resembles a butterfly, but the insight it provides is the same: a clear distinction
between the four watering regimes and the two strains. Changing the mathematical lens gives a
different view of the same data and reveals more detail. If we use the Find feature to search for
samples from each treatment group, we see that each lobe of the butterfly’s wings consists predominantly
of samples from one treatment type.




Figure 2. Network constructed using Norm Correlation metric, with L-infinity Centrality and Principal Metric Embedded SVD as
lenses. Each image of the network is colored by the Find function, identifying nodes populated by samples corresponding to a
specific treatment. Red nodes are those most enriched with samples from the treatment of interest.


The different treatments form distinct parts of the network; we can use the Explain feature to identify
the genes that determine those groups. The Explain function uses the two-sample Kolmogorov-Smirnov
test to identify the gene features (columns) that most distinguish each treatment group. In the table
below, we list a subset of the most distinguishing genes for each treatment.
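
As a rough illustration of how a two-sample Kolmogorov-Smirnov statistic can rank genes, the sketch below computes the statistic in plain Python and sorts genes by it. This is not Iris's implementation: the function names, the probe names, and the intensity values are hypothetical, and only the statistic (not a p-value) is computed.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical cumulative distribution functions of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    i = j = 0
    max_gap = 0.0
    for value in sorted(set(a_sorted) | set(b_sorted)):
        # Advance each pointer past all observations <= value.
        while i < len(a_sorted) and a_sorted[i] <= value:
            i += 1
        while j < len(b_sorted) and b_sorted[j] <= value:
            j += 1
        gap = abs(i / len(a_sorted) - j / len(b_sorted))
        max_gap = max(max_gap, gap)
    return max_gap


def rank_genes(group, rest):
    """Rank genes by how strongly they separate `group` from `rest`.

    Both arguments map a gene name to a list of expression values.
    """
    scores = {gene: ks_statistic(group[gene], rest[gene]) for gene in group}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log2 intensities for two probes (not from GSE16567).
severe = {"probe_a": [9.1, 9.4, 9.0], "probe_b": [6.0, 6.2, 5.9]}
others = {"probe_a": [6.1, 6.3, 5.8, 6.0], "probe_b": [6.1, 5.8, 6.3, 6.0]}

ranking = rank_genes(severe, others)
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy data, probe_a is completely separated between the two groups, so it receives the maximum statistic of 1 and ranks first; probe_b overlaps heavily and ranks last.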




Exploring differential gene expression in Gene Space networks
We can also look at how Iris builds networks in feature space. In our feature space, rows represent
different genes, and columns represent different treatment groups. Each node in this network is a
collection of genes, the expression of which is similar across treatment types.




Figure 3. This is an example of a gene space network constructed using correlation as a metric and L-infinity centrality and
Gaussian density as our lenses. It is colored by L-infinity centrality, which measures how central a row is in relation to the rest
of the data set. Here, blue colors are more central and red colors are less.
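
The caption does not define L-infinity centrality precisely. A minimal sketch, assuming the common definition (for each row, the largest distance to any other row) and assuming correlation distance as one minus the Pearson correlation, might look like this. The rows are hypothetical, and under this assumed definition smaller values correspond to more central rows.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Assumed metric: 0 for perfectly correlated rows, 2 for anti-correlated."""
    return 1.0 - pearson(x, y)


def linf_centrality(rows):
    """For each row, return its largest distance to any other row."""
    return [
        max(
            correlation_distance(row, other)
            for j, other in enumerate(rows)
            if j != i
        )
        for i, row in enumerate(rows)
    ]


# Hypothetical gene space rows: one expression profile per gene.
rows = [
    [1.0, 2.0, 3.0, 4.0],  # rises across treatments
    [2.0, 4.0, 6.0, 8.0],  # rises, perfectly correlated with row 0
    [1.0, 3.0, 2.0, 4.0],  # mixed pattern
    [4.0, 3.0, 2.0, 1.0],  # falls, anti-correlated with rows 0 and 1
]

values = linf_centrality(rows)
most_central = min(range(len(rows)), key=lambda i: values[i])
print(values, most_central)
```

Here the mixed-pattern row is never maximally far from any other row, so it gets the smallest value and is the most central of the four.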


Pathway Analysis


Iris’s pathway analysis lets you use public and user-provided annotations for genes or probe IDs
to explain interesting shapes or regions of an Iris network.


Using the “severe drought” region of our sample space network (Figure 2), we found the probes in
the gene feature space network that most distinguished the severe drought group. The picture below
shows where those probes lie in our gene space.




Figure 4. Genes previously identified in sample space now represented in gene space. Red nodes are the most enriched with
probe sets we previously identified in sample space.


We conduct pathway analysis by selecting the most enriched nodes (pure red) for the severe
drought probe IDs in our gene space network. To select the most enriched nodes, we use the
histogram feature, highlighting the upper end of the distribution.


Iris’s pathway feature returns annotation from Gene Ontology (GO) pathways, NCI cancer pathways,
Entrez Gene pathways, and many others. Iris also provides the corresponding probe IDs and a link to
the database that describes each pathway. For our analysis, we chose the GO Process pathway; other
GO annotations, such as GO function and GO component, are also available but are not shown here.


The table below is an example of the output from our pathway analysis. These pathways are
mapped from Zea mays to Arabidopsis. From our selection of nodes from the severe drought
treatment, we have identified the pathways that are most highly represented in our selection. Notice
that the severe drought group corresponds closely with several pathways that are involved in
osmoregulation.
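
The text does not say which statistic Iris uses to decide that a pathway is highly represented. One common choice, swapped in here purely for illustration, is a hypergeometric over-representation test: given a selection of probes, how surprising is the number that fall in a given pathway? The function name and all counts below are hypothetical.

```python
from math import comb


def hypergeom_tail(total, annotated, selected, overlap):
    """P(X >= overlap) when `selected` probes are drawn without replacement
    from `total` probes, of which `annotated` belong to the pathway."""
    upper = min(annotated, selected)
    numerator = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / comb(total, selected)


# Hypothetical counts: 1000 probes on the array, 40 annotated to a pathway,
# 50 probes in our node selection, 8 of which carry the annotation.
p = hypergeom_tail(total=1000, annotated=40, selected=50, overlap=8)
print(p)
```

By chance we would expect about 2 annotated probes in a selection of 50, so observing 8 gives a small tail probability, and the pathway would be reported as over-represented under this assumed test.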

Meaningful color using the contrast wand


Exploring the structure of this network identifies regions of genes with similar expression. The
contrast wand feature allows us to examine relative gene expression between groups as colors
within our network. The contrast wand is an extremely useful tool that allows researchers to identify
probes that show considerable changes in expression between treatment groups.


The figure below uses the contrast wand to depict changes in expression between corn strains, under
different drought conditions, relative to their experimental controls. Notice that the change in color
intensity is more extreme in the drought-sensitive strain (Ye 478) than in the drought-resistant
strain (Han 21).




Figure 5. In this data set, probe intensities are log2-normalized. We fixed the range of our color scheme to values between -1
and 1, highlighting expression that has changed at least two-fold. Solid red nodes represent genes that show at least a two-fold
increase in expression, and solid blue nodes represent genes that show at least a two-fold decrease in expression.
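
The link between the -1 to 1 color range and a two-fold change follows from the log2 scale: a difference of 1 in log2 intensity is a factor of 2^1 = 2, and a difference of -1 is a factor of 2^-1 = 0.5. The sketch below works through that arithmetic; the function names and intensity values are hypothetical, and the color labels simply mirror the caption.

```python
def log2_change(treated_log2, control_log2):
    """Difference of log2 intensities, which equals log2 of the fold change."""
    return treated_log2 - control_log2


def fold_change(delta_log2):
    """Convert a log2 difference back to a ratio of raw intensities."""
    return 2.0 ** delta_log2


def clamp_to_color_range(delta_log2, low=-1.0, high=1.0):
    """Clip a log2 difference to the fixed color range."""
    return max(low, min(high, delta_log2))


def describe(delta_log2):
    """Label a log2 difference the way the caption describes the colors."""
    clamped = clamp_to_color_range(delta_log2)
    if clamped >= 1.0:
        return "solid red"   # at least a two-fold increase
    if clamped <= -1.0:
        return "solid blue"  # at least a two-fold decrease
    return "intermediate"


# Hypothetical (treated, control) log2 intensities.
for treated, control in [(9.5, 8.0), (6.0, 7.0), (7.3, 7.0)]:
    delta = log2_change(treated, control)
    print(round(delta, 2), round(fold_change(delta), 2), describe(delta))
```

Any change larger than two-fold saturates at the end of the range, so a four-fold and a two-fold increase are drawn in the same solid red.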




