SlideShare a Scribd company logo
1 of 25
Pathway analysis of genomic data: concepts, methods, and prospects for future
development
1
By: Sakshi
Jha
Content
s
• Introduction
• What do we get out of pathway analysis
• Selecting an overall study design
• Fundamental concepts about biological pathways and networks
• Preparing data for association testing
• Analytical methods to detect pathway–phenotype relationships
• Post-analysis considerations
• Future developments in genomic data analysis
• Pathways and networks: bridging multi-omics data
• References
2
Introduction
• In the last decade, pathway analysis has become the first choice for extracting and
explaining the underlying biology for high throughput molecular measurements.
• Today, virtually every bioinformatics study looks for statistically significant pathways as
either biological interpretation or validation of computationally derived results.
• Genome-wide data sets are increasingly viewed as foundations for discovering pathways
and networks relevant to phenotypes.
3
What do we get out of pathway analysis?
• In-depth and contextualized findings to help
• understand the mechanisms of disease in question
• Identification of genes and proteins associated with the etiology of a specific disease
• Prediction of drug targets
• Understand how to intervene therapeutically in disease processes
• Conduct targeted literature searches. 4
5
Selecting an overall study design
6
Candidate Pathway Analysis is
hypothesis driven: pathways are preselected
based on prior knowledge and insight.
• Although the number of candidate pathways
may vary with study goals.
• Candidate pathway analysis most appropriate
where computational resources are limited and
where specific pathways are of a priori
interest.
GENOME WIDE PATHWAY ANALYSIS,
interrogates a complete genomic data set through
pathways representing an extensive range of
biology.
• Methods limited to GWPA have been used on
data sets with only1000 genes.
• This approach can more readily detect
unexpected relationships.
• GWPA is computationally intensive
• Requiring more stringent corrections for
multiple comparisons and making procedures,
such as imputation, more challenging.
7
Fundamental concepts about biological pathways and networks
8
Obtaining input genomic and pathway
annotation data
9
o Pathway analyses can utilize raw genotype data for individual subjects or a list of p-values relating
genes or SNPs To a phenotype.
o In parallel, a pathway analysis is only as good as the functional information underlying its pathway
definitions.
o Investigators should consider their resources and study goals when selecting the most appropriate
genomic data source.
o Prominent pathway annotation databases exhibit diverse features.
o The ideal choice of database depends on several variables and their impact on study goals
10
PREPARING DATA FOR
ASSOCIATION TESTING
11
Pathway size
• Most pathway analyses place constraints on pathway size:
• Small pathways can exhibit false positive associations because of large single-gene or single-SNP
effects, whereas large pathways are more likely to show association by chance alone.
• The most common minimum threshold for pathway size appears to be 10 genes.
• Frequently used maximum thresholds for pathway size include 100 and 200 genes.
• Often, this threshold may exclude highly specific and potentially informative functional sets, including
those involving protein complexes and DNA sequence motifs.
• Overall, investigators should consider their study goals when applying such thresholds and should
evaluate results in that context.
12
Pathway overlap
• Genes and their products typically act in multiple pathways, and each role is potentially
important to a disease or treatment mechanism. As a result, analyses can expect to have some
degree of pathway overlap.
• High pathway overlap can obscure the true source of an association signal; this problem can
exist with any pathway analysis, Gene Ontology (GO) annotations are particularly susceptible
because of the large, hierarchical structure of the database.
• Pathway overlap can also be addressed during post-analysis to prioritize related pathways for
further exploration.
• Extant strategies include hierarchical clustering in a study of breast cancer, overlap-based
network creation in the visualization tool ‘Enrichment Map’ and the listing of overlapping
pathways alongside results in the analytical software PARIS.
13
Assigning data elements to genes
• Genomic data have historically been integrated into pathways by mapping assayed elements to genes.
• SNP based genotyping arrays, this is not straightforward because many array SNPs are not located in known
coding or regulatory regions.
• Alternatively, each unmapped SNP can be assigned to its nearest gene.
• Evolving theories suggest that sequences are not associated to genes based on closest proximity, and may not
even be solely associated to one gene.
• Hence, many studies assign unmapped SNPs to all genes within a distance window, ranging from 10 kb to 500
kb.
• SNPs that map to multiple genes in the same pathway can yield spurious pathway association. This issue is
particularly important for MHC/HLA genes.
14
Calculating gene significance and
accounting for LD
• Most pathway analysis tools utilize one association signal per gene.
• Whereas expression arrays yield a single P-value for each gene, SNP arrays include multiple
signals per gene, some of which are correlated.
• some studies use the minimum SNP-level P-value within a gene as the operative signal;
• however, this approach will not detect additive effects among SNPs with moderate individual
association.
• For methods that combine SNP level signals, including those based on the truncated product
method , LD must be accounted for to prevent highly correlated SNPs from biasing gene-
level significance.
15
• Strategies to accomplish this include discarding SNPs that depart from LD at a pre-set threshold
and adapting principal component analysis to extract the most independent signals within a gene.
16
Analytical methods to detect pathway–phenotype relationships
• Attempting to better understand the molecular basis of a phenotype, studies often identify the
pathways to which the genes that may be linked to the SNPs belong.
• However, to establish the association of the phenotype with a pathway, one needs to show that
phenotype associated SNPs tend to fall within that pathway more than expected by chance.
• Following data processing, analytical methods can be applied to test for significant pathway–
phenotype relationships. Prominent examples of pathway-based analytical tools and their salient
features are shown: 17
18
Post-analysis considerations
• Following pathway analysis, appropriate data reporting and interpretation are imperative.
• Currently, bias introduced by gene size is less commonly addressed than is bias from pathway size.
• In particular, large genes containing many SNPs are more likely to contain significant SNPs by chance
alone; for analyses, this can favor pathways containing large genes.
• Analytical tools that use permutations naturally control for gene size by comparing the actual
association data with the distribution of association statistics generated from randomly permuted data
sets.
• Other approaches allow users to restrict analysis to a subset of the most significant SNPs in each gene:
for large genes, this may eliminate some spuriously associated SNPs and thus limit their impact on the
pathway analysis.
19
• Other sources of bias that should be addressed include the capacity for strongly associated markers to
drive pathway association and the possible effects of SNPs being assigned to multiple genes.
• Fundamentally, these approaches to bias are best complemented by replication of pathway analysis
findings in independent data sets.
• key choices in pathway analyses will limit major contributors of variance across studies and will guide
investigators in selecting approaches that fit their study goals.
20
Future developments in genomic data analysis
• Development of methods and tools related to pathway analysis is ongoing and dynamic.
• In particular, because pathways are of broad interest, targeted adaptations to their associated databases
would expand their utility for investigators from a variety of backgrounds.
• These adaptations might include simpler search and download mechanisms, consistency in pathway
names and classifications, and methods for describing pathway overlap.
• In addition, a universal format for annotation files might encourage interoperability among analytical
tools, allowing investigators flexibility to match precisely their databases and statistical methods of
choice.
• Two recent trends among databases are also promising Specialized disease databases, such as
AlzGene and the UCSC Cancer Genomics Browser.
21
• Meanwhile, complementary methods can extend the biological reach of pathway-
based results.
• pathway analysis platforms, computational efficiency will be highly valued, given
the impressive granularity of next-generation sequencing data.
22
PATHWAYS AND NETWORKS: BRIDGING MULTI-OMICS DATA
• In the coming years, we anticipate that pathways and network will assume a farther-reaching role in
view of the need to integrate multi-omics data through systems biology approaches.
• A variety of large-scale strategies are being used to study complex diseases, including genomic,
transcriptomic, proteomic and metabolomic approaches, and data from all of these sources can be
analyzed through pathways and networks representing coordinated functions and relationships.
• As such, the role of pathways and networks as the hub for this integration will be vital in the years
to come.
23
Reference
• (Ramanan, Shen et al. 2012)
Ramanan,V. K., L. Shen, J. H. Moore and A. J. Saykin (2012). "Pathway analysis of genomic data: concepts,
methods, and prospects for future development."TRENDS in Genetics 28(7): 323-332.
https://doi.org/10.1016/j.tig.2012.03.004
24
25
Thank
You

More Related Content

What's hot

Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditDigital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditSC CTSI at USC and CHLA
 
Advances in computer aided drug design
Advances in computer aided drug designAdvances in computer aided drug design
Advances in computer aided drug designVikas Soni
 
Leveraging Medical Health Record Data for Identifying Research Study Particip...
Leveraging Medical Health Record Data for Identifying Research Study Particip...Leveraging Medical Health Record Data for Identifying Research Study Particip...
Leveraging Medical Health Record Data for Identifying Research Study Particip...SC CTSI at USC and CHLA
 
Sbm open science committee report to the board
Sbm open science committee report to the boardSbm open science committee report to the board
Sbm open science committee report to the boardBradford Hesse
 
Peter Embi's 2011 AMIA CRI Year-in-Review
Peter Embi's 2011 AMIA CRI Year-in-ReviewPeter Embi's 2011 AMIA CRI Year-in-Review
Peter Embi's 2011 AMIA CRI Year-in-ReviewPeter Embi
 
Searching with point of care tools sept 2010
Searching with point of care tools sept 2010Searching with point of care tools sept 2010
Searching with point of care tools sept 2010Robin Featherstone
 
Embi cri review-2012-final
Embi cri review-2012-finalEmbi cri review-2012-final
Embi cri review-2012-finalPeter Embi
 

What's hot (8)

Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditDigital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
 
Advances in computer aided drug design
Advances in computer aided drug designAdvances in computer aided drug design
Advances in computer aided drug design
 
Effectiveness of New, Informationist-led Curriculum Changes at the College of...
Effectiveness of New, Informationist-led Curriculum Changes at the College of...Effectiveness of New, Informationist-led Curriculum Changes at the College of...
Effectiveness of New, Informationist-led Curriculum Changes at the College of...
 
Leveraging Medical Health Record Data for Identifying Research Study Particip...
Leveraging Medical Health Record Data for Identifying Research Study Particip...Leveraging Medical Health Record Data for Identifying Research Study Particip...
Leveraging Medical Health Record Data for Identifying Research Study Particip...
 
Sbm open science committee report to the board
Sbm open science committee report to the boardSbm open science committee report to the board
Sbm open science committee report to the board
 
Peter Embi's 2011 AMIA CRI Year-in-Review
Peter Embi's 2011 AMIA CRI Year-in-ReviewPeter Embi's 2011 AMIA CRI Year-in-Review
Peter Embi's 2011 AMIA CRI Year-in-Review
 
Searching with point of care tools sept 2010
Searching with point of care tools sept 2010Searching with point of care tools sept 2010
Searching with point of care tools sept 2010
 
Embi cri review-2012-final
Embi cri review-2012-finalEmbi cri review-2012-final
Embi cri review-2012-final
 

Similar to Pathway analysis for genomics data

Comparative genomics
Comparative genomicsComparative genomics
Comparative genomicshemantbreeder
 
Systems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traitsSystems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traitsSOYEON KIM
 
Amia tb-review-13
Amia tb-review-13Amia tb-review-13
Amia tb-review-13Russ Altman
 
Amia tb-review-12
Amia tb-review-12Amia tb-review-12
Amia tb-review-12Russ Altman
 
Role of genomics and proteomics
Role of genomics and proteomicsRole of genomics and proteomics
Role of genomics and proteomicsPavana K A
 
Gene hunting strategies
Gene hunting strategiesGene hunting strategies
Gene hunting strategiesAshfaq Ahmad
 
Amia tb-review-10
Amia tb-review-10Amia tb-review-10
Amia tb-review-10Russ Altman
 
Pizza club - March 2017 - Gaia
Pizza club - March 2017 - GaiaPizza club - March 2017 - Gaia
Pizza club - March 2017 - GaiaRSG Luxembourg
 
Amia tbi-14-final
Amia tbi-14-finalAmia tbi-14-final
Amia tbi-14-finalRuss Altman
 
Cncp 2010
Cncp 2010Cncp 2010
Cncp 2010ygc
 
Research Statement Chien-Wei Lin
Research Statement Chien-Wei LinResearch Statement Chien-Wei Lin
Research Statement Chien-Wei LinChien-Wei Lin
 
Genome wide Association studies.pptx
Genome wide Association studies.pptxGenome wide Association studies.pptx
Genome wide Association studies.pptxAkshitaAwasthi3
 
Comparative and functional genomics
Comparative and functional genomicsComparative and functional genomics
Comparative and functional genomicsJalormi Parekh
 
Mapping metabolites against pathway databases
Mapping metabolites against pathway databases Mapping metabolites against pathway databases
Mapping metabolites against pathway databases Dinesh Barupal
 
3_Gibson_Janedsadddadsadsadsadsadasdsdasda
3_Gibson_Janedsadddadsadsadsadsadasdsdasda3_Gibson_Janedsadddadsadsadsadsadasdsdasda
3_Gibson_JanedsadddadsadsadsadsadasdsdasdaAdiM27
 
Next Generation Sequencing (NGS) in the Clinic
Next Generation Sequencing (NGS) in the ClinicNext Generation Sequencing (NGS) in the Clinic
Next Generation Sequencing (NGS) in the ClinicEdizonJambormias2
 

Similar to Pathway analysis for genomics data (20)

Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Systems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traitsSystems genetics approaches to understand complex traits
Systems genetics approaches to understand complex traits
 
Data mining ppt
Data mining pptData mining ppt
Data mining ppt
 
Amia tb-review-13
Amia tb-review-13Amia tb-review-13
Amia tb-review-13
 
Amia tb-review-12
Amia tb-review-12Amia tb-review-12
Amia tb-review-12
 
Role of genomics and proteomics
Role of genomics and proteomicsRole of genomics and proteomics
Role of genomics and proteomics
 
Whole Exome Sequencing .pptx
Whole Exome Sequencing .pptxWhole Exome Sequencing .pptx
Whole Exome Sequencing .pptx
 
Gene hunting strategies
Gene hunting strategiesGene hunting strategies
Gene hunting strategies
 
MSKCC Publish JSW
MSKCC Publish JSWMSKCC Publish JSW
MSKCC Publish JSW
 
Amia tb-review-10
Amia tb-review-10Amia tb-review-10
Amia tb-review-10
 
Pizza club - March 2017 - Gaia
Pizza club - March 2017 - GaiaPizza club - March 2017 - Gaia
Pizza club - March 2017 - Gaia
 
Amia tbi-14-final
Amia tbi-14-finalAmia tbi-14-final
Amia tbi-14-final
 
Cncp 2010
Cncp 2010Cncp 2010
Cncp 2010
 
Research Statement Chien-Wei Lin
Research Statement Chien-Wei LinResearch Statement Chien-Wei Lin
Research Statement Chien-Wei Lin
 
Genome wide Association studies.pptx
Genome wide Association studies.pptxGenome wide Association studies.pptx
Genome wide Association studies.pptx
 
Comparative and functional genomics
Comparative and functional genomicsComparative and functional genomics
Comparative and functional genomics
 
KnetMiner - EBI Workshop 2017
KnetMiner - EBI Workshop 2017KnetMiner - EBI Workshop 2017
KnetMiner - EBI Workshop 2017
 
Mapping metabolites against pathway databases
Mapping metabolites against pathway databases Mapping metabolites against pathway databases
Mapping metabolites against pathway databases
 
3_Gibson_Janedsadddadsadsadsadsadasdsdasda
3_Gibson_Janedsadddadsadsadsadsadasdsdasda3_Gibson_Janedsadddadsadsadsadsadasdsdasda
3_Gibson_Janedsadddadsadsadsadsadasdsdasda
 
Next Generation Sequencing (NGS) in the Clinic
Next Generation Sequencing (NGS) in the ClinicNext Generation Sequencing (NGS) in the Clinic
Next Generation Sequencing (NGS) in the Clinic
 

Recently uploaded

FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |aasikanpl
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfWildaNurAmalia2
 
‏‏VIRUS - 123455555555555555555555555555555555555555
‏‏VIRUS -  123455555555555555555555555555555555555555‏‏VIRUS -  123455555555555555555555555555555555555555
‏‏VIRUS - 123455555555555555555555555555555555555555kikilily0909
 
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptxBREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptxPABOLU TEJASREE
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫qfactory1
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 

Recently uploaded (20)

FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
 
‏‏VIRUS - 123455555555555555555555555555555555555555
‏‏VIRUS -  123455555555555555555555555555555555555555‏‏VIRUS -  123455555555555555555555555555555555555555
‏‏VIRUS - 123455555555555555555555555555555555555555
 
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptxBREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 

Pathway analysis for genomics data

  • 1. Pathway analysis of genomic data: concepts, methods, and prospects for future development 1 By: Sakshi Jha
  • 2. Content s • Introduction • What do we get out of pathway analysis • Selecting an overall study design • Fundamental concepts about biological pathways and networks • Preparing data for association testing • Analytical methods to detect pathway–phenotype relationships • Post-analysis considerations • Future developments in genomic data analysis • Pathways and networks: bridging multi-omics data • References 2
  • 3. Introduction • In the last decade, pathway analysis has become the first choice for extracting and explaining the underlying biology for high throughput molecular measurements. • Today, virtually every bioinformatics study looks for statistically significant pathways as either biological interpretation or validation of computationally derived results. • Genome-wide data sets are increasingly viewed as foundations for discovering pathways and networks relevant to phenotypes. 3
  • 4. What do we get out of pathway analysis? • In-depth and contextualized findings to help • understand the mechanisms of disease in question • Identification of genes and proteins associated with the etiology of a specific disease • Prediction of drug targets • Understand how to intervene therapeutically in disease processes • Conduct targeted literature searches. 4
  • 5. 5
  • 6. Selecting an overall study design 6 Candidate Pathway Analysis is hypothesis driven: pathways are preselected based on prior knowledge and insight. • Although the number of candidate pathways may vary with study goals. • Candidate pathway analysis most appropriate where computational resources are limited and where specific pathways are of a priori interest. GENOME WIDE PATHWAY ANALYSIS, interrogates a complete genomic data set through pathways representing an extensive range of biology. • Methods limited to GWPA have been used on data sets with only1000 genes. • This approach can more readily detect unexpected relationships. • GWPA is computationally intensive • Requiring more stringent corrections for multiple comparisons and making procedures, such as imputation, more challenging.
  • 7. 7
  • 8. Fundamental concepts about biological pathways and networks 8
  • 9. Obtaining input genomic and pathway annotation data 9 o Pathway analyses can utilize raw genotype data for individual subjects or a list of p-values relating genes or SNPs To a phenotype. o In parallel, a pathway analysis is only as good as the functional information underlying its pathway definitions. o Investigators should consider their resources and study goals when selecting the most appropriate genomic data source. o Prominent pathway annotation databases exhibit diverse features. o The ideal choice of database depends on several variables and their impact on study goals
  • 10. 10
  • 12. Pathway size • Most pathway analyses place constraints on pathway size: • Small pathways can exhibit false positive associations because of large single-gene or single-SNP effects, whereas large pathways are more likely to show association by chance alone. • The most common minimum threshold for pathway size appears to be 10 genes. • Frequently used maximum thresholds for pathway size include 100 and 200 genes. • Often, this threshold may exclude highly specific and potentially informative functional sets, including those involving protein complexes and DNA sequence motifs. • Overall, investigators should consider their study goals when applying such thresholds and should evaluate results in that context. 12
  • 13. Pathway overlap • Genes and their products typically act in multiple pathways, and each role is potentially important to a disease or treatment mechanism. As a result, analyses can expect to have some degree of pathway overlap. • High pathway overlap can obscure the true source of an association signal; this problem can exist with any pathway analysis, Gene Ontology (GO) annotations are particularly susceptible because of the large, hierarchical structure of the database. • Pathway overlap can also be addressed during post-analysis to prioritize related pathways for further exploration. • Extant strategies include hierarchical clustering in a study of breast cancer, overlap-based network creation in the visualization tool ‘Enrichment Map’ and the listing of overlapping pathways alongside results in the analytical software PARIS. 13
  • 14. Assigning data elements to genes • Genomic data have historically been integrated into pathways by mapping assayed elements to genes. • SNP based genotyping arrays, this is not straightforward because many array SNPs are not located in known coding or regulatory regions. • Alternatively, each unmapped SNP can be assigned to its nearest gene. • Evolving theories suggest that sequences are not associated to genes based on closest proximity, and may not even be solely associated to one gene. • Hence, many studies assign unmapped SNPs to all genes within a distance window, ranging from 10 kb to 500 kb. • SNPs that map to multiple genes in the same pathway can yield spurious pathway association. This issue is particularly important for MHC/HLA genes. 14
  • 15. Calculating gene significance and accounting for LD • Most pathway analysis tools utilize one association signal per gene. • Whereas expression arrays yield a single P-value for each gene, SNP arrays include multiple signals per gene, some of which are correlated. • some studies use the minimum SNP-level P-value within a gene as the operative signal; • however, this approach will not detect additive effects among SNPs with moderate individual association. • For methods that combine SNP level signals, including those based on the truncated product method , LD must be accounted for to prevent highly correlated SNPs from biasing gene- level significance. 15
  • 16. • Strategies to accomplish this include discarding SNPs that depart from LD at a pre-set threshold and adapting principal component analysis to extract the most independent signals within a gene. 16
  • 17. Analytical methods to detect pathway–phenotype relationships • Attempting to better understand the molecular basis of a phenotype, studies often identify the pathways to which the genes that may be linked to the SNPs belong. • However, to establish the association of the phenotype with a pathway, one needs to show that phenotype associated SNPs tend to fall within that pathway more than expected by chance. • Following data processing, analytical methods can be applied to test for significant pathway– phenotype relationships. Prominent examples of pathway-based analytical tools and their salient features are shown: 17
  • 18. 18
  • 19. Post-analysis considerations • Following pathway analysis, appropriate data reporting and interpretation are imperative. • Currently, bias introduced by gene size is less commonly addressed than is bias from pathway size. • In particular, large genes containing many SNPs are more likely to contain significant SNPs by chance alone; for analyses, this can favor pathways containing large genes. • Analytical tools that use permutations naturally control for gene size by comparing the actual association data with the distribution of association statistics generated from randomly permuted data sets. • Other approaches allow users to restrict analysis to a subset of the most significant SNPs in each gene: for large genes, this may eliminate some spuriously associated SNPs and thus limit their impact on the pathway analysis. 19
  • 20. • Other sources of bias that should be addressed include the capacity for strongly associated markers to drive pathway association and the possible effects of SNPs being assigned to multiple genes. • Fundamentally, these approaches to bias are best complemented by replication of pathway analysis findings in independent data sets. • key choices in pathway analyses will limit major contributors of variance across studies and will guide investigators in selecting approaches that fit their study goals. 20
  • 21. Future developments in genomic data analysis • Development of methods and tools related to pathway analysis is ongoing and dynamic. • In particular, because pathways are of broad interest, targeted adaptations to their associated databases would expand their utility for investigators from a variety of backgrounds. • These adaptations might include simpler search and download mechanisms, consistency in pathway names and classifications, and methods for describing pathway overlap. • In addition, a universal format for annotation files might encourage interoperability among analytical tools, allowing investigators flexibility to match precisely their databases and statistical methods of choice. • Two recent trends among databases are also promising Specialized disease databases, such as AlzGene and the UCSC Cancer Genomics Browser. 21
  • 22. • Meanwhile, complementary methods can extend the biological reach of pathway- based results. • pathway analysis platforms, computational efficiency will be highly valued, given the impressive granularity of next-generation sequencing data. 22
  • 23. PATHWAYS AND NETWORKS: BRIDGING MULTI-OMICS DATA • In the coming years, we anticipate that pathways and network will assume a farther-reaching role in view of the need to integrate multi-omics data through systems biology approaches. • A variety of large-scale strategies are being used to study complex diseases, including genomic, transcriptomic, proteomic and metabolomic approaches, and data from all of these sources can be analyzed through pathways and networks representing coordinated functions and relationships. • As such, the role of pathways and networks as the hub for this integration will be vital in the years to come. 23
  • 24. Reference • (Ramanan, Shen et al. 2012) Ramanan,V. K., L. Shen, J. H. Moore and A. J. Saykin (2012). "Pathway analysis of genomic data: concepts, methods, and prospects for future development."TRENDS in Genetics 28(7): 323-332. https://doi.org/10.1016/j.tig.2012.03.004 24