Step-by-step tutorial for conducting GO enrichment analysis and then creating a network from the results.
Material from the UC Davis 2014 Proteomics Workshop.
See more at: http://sourceforge.net/projects/teachingdemos/files/2014%20UC%20Davis%20Proteomics%20Workshop/
Metabolomic data analysis and visualization tools, by Dmitry Grapov
A description of data analysis and visualization tools for metabolomic and other high dimensional data sets, developed at the NIH West Coast Metabolomics Center.
https://www.youtube.com/watch?v=Y_-o-4rKxUk
Machine learning powered metabolomic network analysis
Dmitry Grapov PhD,
Director of Data Science and Bioinformatics,
CDS- Creative Data Solutions
www.createdatasol.com
Metabolomic network analysis can be used to interpret experimental results within a variety of contexts including: biochemical relationships, structural and spectral similarity and empirical correlation. Machine learning is useful for modeling relationships in the context of pattern recognition, clustering, classification and regression based predictive modeling. The combination of developed metabolomic networks and machine learning based predictive models offer a unique method to visualize empirical relationships while testing key experimental hypotheses. The following presentation focuses on data analysis, visualization, machine learning and network mapping approaches used to create richly mapped metabolomic networks. Learn more at www.createdatasol.com
Model organisms such as budding yeast provide a common platform to interrogate and understand cellular and physiological processes. Knowledge about model organisms, whether generated during the course of scientific investigation or extracted from published articles, is made available by model organism databases (MODs), such as the Saccharomyces Genome Database (SGD), for powerful, data-driven bioinformatic analyses. Integrative platforms such as InterMine offer a standard platform for MOD data exploration and data mining. Yet today's bioinformatic analyses also require access to a significantly broader set of structured biomedical data, such as what can be found in the emerging network of Linked Open Data (LOD). If MOD data could be provisioned as FAIR (Findable, Accessible, Interoperable, and Reusable), then scientists could leverage a greater amount of interoperable data in knowledge discovery.
The goal of this proposal is to increase the utility of MOD data by implementing standards-compliant data access interfaces that interoperate with Linked Data. We will focus our efforts on developing interfaces for data access, data retrieval, and query answering for SGD. Our software will publish InterMine data as LOD that are semantically annotated with ontologies and can be retrieved using standardized formats (e.g. JSON-LD, Turtle). We will facilitate the exploration of MOD data for hypothesis testing by implementing efficient query answering using Linked Data Fragments, and by developing a set of graphical user interfaces to search for data of interest, explore connections, and answer questions that leverage the wider LOD network. Finally, we will develop a locally and cloud-deployable image to enable the rapid deployment of the proposed infrastructure. Our efforts to increase interoperability and ease of deployment for biomedical data repositories will increase research productivity and reduce costs associated with data integration and warehouse maintenance.
Robust Pathway-based Multi-Omics Data Integration using Directed Random Walk ..., by Soyeon Kim
17th Annual International Conference on Critical Assessment of Massive Data Analysis (CAMDA 2018)
Cancer Data Integration Challenge (http://camda.info/)
Deep learning based multi-omics integration, a survey, by Soyeon Kim
1. Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders, Pacific Symposium on Biocomputing, 2015
2. A deep learning approach for cancer detection and relevant gene identification, Pacific Symposium on Biocomputing, 2016
3. Deep Learning based multi-omics integration robustly predicts survival in liver cancer, preprint, 2017
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata, by Michel Dumontier
Biomedical researchers will remain stymied in their ability to take full advantage of the Big Data revolution if they cannot find the datasets they need to analyze, if there is a lack of clarity about what particular datasets contain, and if data are insufficiently described.
CEDAR, an NIH BD2K Center of Excellence, aims to develop methods and tools to vastly ease the burden of authoring good experimental metadata, and to maximally use this information to zero in on datasets of interest.
Semantic web technologies offer a potential mechanism for the representation and integration of thousands of biomedical databases. Many of these databases offer cross-references to other data sources, but these are generally incomplete and prone to error. In this paper, we conduct an empirical analysis of the link structure of life science Linked Data, obtained from the Bio2RDF project. Three different link graphs for datasets, entities and terms are characterized by degree, connectivity, and clustering metrics, and their correlation is measured as well. Furthermore, we utilize the symmetry and transitivity of entity links to build a benchmark and evaluate several popular entity matching approaches. Our findings indicate that the life science data network can help find hidden links, can be used to validate links, and may offer a mechanism to integrate a wider set of resources to support biomedical knowledge discovery.
With its focus on investigating the basis for the sustained existence
of living systems, modern biology has always been a fertile, if not
challenging, domain for formal knowledge representation and automated
reasoning. With thousands of databases and hundreds of ontologies now
available, there is a salient opportunity to integrate these for
discovery. In this talk, I will discuss our efforts to build a rich
foundational network of ontology-annotated linked data, develop
methods to intelligently retrieve content of interest, uncover
significant biological associations, and pursue new avenues for drug
discovery. As the portfolio of Semantic Web technologies continues to
mature in terms of functionality, scalability, and an understanding of
how to maximize their value, researchers will be strategically poised
to pursue increasingly sophisticated KR projects aimed at improving
our overall understanding of human health and disease.
bio: Dr. Michel Dumontier is an Associate Professor of Medicine
(Biomedical Informatics) at Stanford University. His research aims to
find new treatments for rare and complex diseases. His research
interests lie in the publication, integration, and discovery of
scientific knowledge. Dr. Dumontier serves as a co-chair for the World
Wide Web Consortium Semantic Web in Health Care and Life Sciences
Interest Group (W3C HCLSIG) and is the Scientific Director for
Bio2RDF, a widely used open-source project to create and provide
linked data for life sciences.
Harnessing the Proteome with ProteoIQ Quantitative Proteomics Software, by jatwood3
Learn how successful researchers are using ProteoIQ to streamline their proteomic data analysis.
Centralize data analysis on a single software platform
Most laboratories have multiple MS platforms with different software packages. ProteoIQ simplifies data analysis as a vendor-independent software platform supporting qualitative and quantitative analysis.
Learn how to achieve robust peptide and protein quantification
ProteoIQ is the only commercial software platform supporting all popular forms of quantification. Learn how ProteoIQ performs protein and peptide quantification using isobaric tags, isotopic labels and label-free methods, including intensity-based peptide profiling.
Elucidate biological significance
Learn how to integrate biological databases with ProteoIQ. Quickly move from MS results to the discovery of novel biological insights through an integrated biological annotation pipeline.
Prote-OMIC Data Analysis and Visualization, by Dmitry Grapov
Introductory lecture to multivariate analysis of proteomic data.
Material from the UC Davis 2014 Proteomics Workshop.
See more at: http://sourceforge.net/projects/teachingdemos/files/2014%20UC%20Davis%20Proteomics%20Workshop/
Data Normalization Approaches for Large-scale Biological Studies, by Dmitry Grapov
Overview of how to estimate data quality and validate normalization approaches to remove analytical variance.
See here for animations used in the presentation:
http://imdevsoftware.wordpress.com/2014/06/04/using-repeated-measures-to-remove-artifacts-from-longitudinal-data/
Automation of (Biological) Data Analysis and Report Generation, by Dmitry Grapov
I've been experimenting with automating simple and complex data analysis and report-generation tasks for biological data, mostly using R and LaTeX. You can see some of my progress and the challenges encountered.
Metabolomic Data Analysis Workshop and Tutorials (2014), by Dmitry Grapov
Get more information:
http://imdevsoftware.wordpress.com/2014/10/11/2014-metabolomic-data-analysis-and-visualization-workshop-and-tutorials/
Recently I had the pleasure of teaching statistical and multivariate data analysis and visualization at the annual Summer Sessions in Metabolomics 2014, organized by the NIH West Coast Metabolomics Center.
Similar to last year, I've posted all the content (lectures, labs and software) for anyone to follow along with at their own pace. I also plan to release videos for all the lectures and labs.
KnetMiner provides an easy-to-use web interface to visualisation and data mining tools for the discovery and evaluation of candidate genes from large-scale integrations of public and private data sets. It addresses the needs of scientists who generally lack the time and technical expertise to review all relevant information available in the literature, from key model species and from a potentially wide range of related biological databases. We have previously developed genome-scale knowledge networks (GSKNs) for multiple crop and animal species (Hassani-Pak et al. 2016). The KnetMiner web server searches and evaluates millions of relations and concepts within the GSKNs in real time to determine if direct or indirect links between genes and trait-based keywords can be established. KnetMiner accepts as user inputs: search terms in combination with a gene list and/or genomic regions. It produces a table of ranked candidate genes and allows users to explore the output in interactive genome and network map visualisation tools that have been optimised for web use on desktop and mobile devices. The KnetMiner web server and the GSKNs provide a step forward towards systematic and evidence-based gene discovery.
EnrichNet: Graph-based statistic and web-application for gene/protein set enr..., by Enrico Glaab
EnrichNet is a web-application and web-service to identify and visualize functional associations between a user-defined list of genes/proteins and known cellular pathways. As a complement to classical overlap-based enrichment analysis methods, the EnrichNet approach integrates a novel graph-based statistic with a new interactive visualization of network sub-structures to enable a direct molecular interpretation of how a set of genes or proteins is related to a specific cellular pathway. Available at: http://www.enrichnet.org
O.M.GSEA - An in-depth introduction to gene-set enrichment analysis, by Shana White
A comprehensive overview of 'classic' gene-set enrichment analysis that was presented for a Biostatistics/Bioinformatics divisional seminar. Supplemental slides (58+) include details for running GSEA with a variety of options (GUI, R script, R package).
Short tutorials on how to use the web-based tool DAVID (Database for Annotation, Visualization and Integrated Discovery): http://david.abcc.ncifcrf.gov/
DAVID provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes.
Full course: https://creativedatasolutions.github.io/CDS.courses/courses/network_mapping_101/docs/
The course covered all of the steps required to go from `raw data` to a rich `mapped biochemical network` incorporating statistical, multivariate and machine learning results. This included [examples](https://creativedatasolutions.github.io/CDS.courses/courses/network_mapping_101/docs/#topics) and tutorials for:
* Preparing raw data for analysis
* Multivariate data exploration
* Supervised clustering
* Machine learning: classification model validation and feature selection
* Network analysis - biochemical, structural similarity and correlation networks
* Network mapping: putting it all together to create a publication-quality network
url:
https://github.com/CreativeDataSolutions/CDS.courses/blob/gh-pages/courses/network_mapping_101/materials/lectures/tutorial.pdf
Rise of Deep Learning for Genomic, Proteomic, and Metabolomic Data Integratio..., by Dmitry Grapov
Machine learning (ML) is being ubiquitously incorporated into everyday products such as Internet search, email spam filters, product recommendations, image classification, and speech recognition. New approaches for highly integrated manufacturing and automation such as the Industry 4.0 and the Internet of things are also converging with ML methodologies. Many approaches incorporate complex artificial neural network architectures and are collectively referred to as deep learning (DL) applications. These methods have been shown capable of representing and learning predictable relationships in many diverse forms of data and hold promise for transforming the future of omics research and applications in precision medicine. Omics and electronic health record data pose considerable challenges for DL. This is due to many factors such as low signal to noise, analytical variance, and complex data integration requirements. However, DL models have already been shown capable of both improving the ease of data encoding and predictive model performance over alternative approaches. It may not be surprising that concepts encountered in DL share similarities with those observed in biological message relay systems such as gene, protein, and metabolite networks. This expert review examines the challenges and opportunities for DL at a systems and biological scale for a precision medicine readership.
current: https://drive.google.com/open?id=0B51AEMfo-fh9M3FmWXVlb05pdm8
I am always looking for the next data science, machine learning and visualization challenge.
Here is a link to my up-to-date resume:
https://drive.google.com/open?id=0B51AEMfo-fh9M3FmWXVlb05pdm8
cv:
https://drive.google.com/open?id=0B51AEMfo-fh9Z05aM2p6XzFIOFE
Case Study: Overview of Metabolomic Data Normalization Strategies, by Dmitry Grapov
Five normalization methods were compared, of which the combination of qc-LOESS and cubic splines showed the best performance based on within-batch and between-batch variable relative standard deviations for QCs. This approach was used to normalize sample measurements, the results of which were analyzed using principal components analysis.
3. Data Normalization (2014 lab tutorial), by Dmitry Grapov
2. Examples
Nature Reviews Genetics 15, 107–120 (2014), doi:10.1038/nrg3643
FBA = flux-balance analysis
• Topological enrichment can give a broad overview of impacted genes, proteins and metabolites
• Changes in biochemical domains corroborated by multi-omic data sets can be used to identify robust candidates responsible for phenotypic variation between comparisons
• Gene-gene, protein-protein or gene-protein interaction networks can be used to deconvolute ambiguous metabolic pathways
4. Biochemical Domain Enrichment Analysis
• Genes/Proteins → DAVID, AmiGO, etc. → GO terms
• Genes/Proteins + Metabolites → IMPaLA: Integrated Molecular Pathway Level Analysis (http://impala.molgen.mpg.de/) → pathways
1. Classify all species into domains (e.g. biological process, pathway, etc.)
2. Calculate the probability of observing changes in species by chance
5. IMPaLA: Gene + Metabolite pathway enrichment
Challenges:
• Removal of redundant information
• Preference for specific vs. generic pathways
• Visualization of gene + metabolite + pathway relationships
6. Determining significance of the enrichment: Hypergeometric Test
How do we calculate a statistic to determine enrichment?
hit.num = 51   # number of significantly changed pathway metabolites
set.num = 1455 # number of metabolites in the pathway
full = 3358    # all possible metabolites in the organism
q.size = 72    # number of significantly changed metabolites
phyper(hit.num - 1, set.num, full - set.num, q.size, lower.tail = FALSE)
# = 1.717553e-06
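The same upper-tail probability can be cross-checked outside R. A minimal Python sketch using exact integer arithmetic, with variable names mirroring the slide's counts:

```python
from math import comb

# Exact hypergeometric upper tail, mirroring the phyper() call above:
# P(X >= 51), where 72 significant metabolites are drawn from 3358 total,
# of which 1455 belong to the pathway of interest.
hit_num, set_num, full, q_size = 51, 1455, 3358, 72

denom = comb(full, q_size)
p = sum(
    comb(set_num, k) * comb(full - set_num, q_size - k)
    for k in range(hit_num, q_size + 1)
) / denom
print(p)  # ~1.7e-06, in line with the R result
```

Because `math.comb` works on exact integers, there is no loss of precision summing the 22 tail terms before the final division.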
7. GO Enrichment analysis: Hierarchy of Redundancy (parents)
• GO is an ontology in which enrichment is often shared by children and parents.
• It is difficult to co-visualize the term hierarchy and the gene-to-term mapping.
8. Enrichment networks: Removing the Hierarchy of Redundancy
Workflow:
1. If two nodes share all genes, drop the least enriched (highest p-value)
2. Filter terms based on enrichment
3. Display term-to-gene/protein relationships as edges in a network
4. Map the direction of change in genes/proteins to network node attributes
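Steps 1-3 of this workflow can be sketched in a few lines of Python. The GO term names, gene sets, p-values and the 0.01 cutoff below are all invented for illustration:

```python
# Toy input: term -> (annotated gene set, enrichment p-value).
terms = {
    "GO:A (parent)": ({"g1", "g2", "g3"}, 1e-3),
    "GO:B (child)": ({"g1", "g2", "g3"}, 1e-5),  # same genes, more enriched
    "GO:C": ({"g4", "g5"}, 5e-2),
}

# Step 1: if two terms share all genes, keep only the most enriched
# (lowest p-value) by scanning terms in order of increasing p.
kept = {}
for name, (genes, p) in sorted(terms.items(), key=lambda kv: kv[1][1]):
    if not any(genes == g for g, _ in kept.values()):
        kept[name] = (genes, p)

# Step 2: filter the surviving terms on an enrichment cutoff.
kept = {t: v for t, v in kept.items() if v[1] < 0.01}

# Step 3: term-to-gene relationships become edges in the network.
edges = [(t, g) for t, (genes, _) in kept.items() for g in genes]
print(sorted(kept))   # ['GO:B (child)']
```

Here the redundant parent term is dropped in favor of its more-enriched child, and the weakly enriched term is filtered out, leaving one term with three gene edges.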
9. Enrichment Network: Mapping of parents through children
A GO enrichment network displays:
• gene names associated with each overrepresented term
• fold change in protein expression between two groups (can be extended to k > 2 groups)
• optionally, the enrichment p-value for each term
• optionally, metabolites incorporated as children of genes
10. Empirical Networks
• Correlation-based networks (CN): simple, but with a tendency to hairball
• GGM or partial-correlation-based networks: advanced, with a preference for direct over indirect relationships
• *Robustness increases with sample size
doi:10.1007/978-1-4614-1689-0_17