What's In a Genotype?: An Ontological Characterization for the Integration of Genetic Variation Data

ICBO 2013 Presentation

  1. What’s in a Genotype? An Ontological Characterization for the Integration of Genetic Variation Data. Matthew H. Brush, Chris Mungall, Nicole Washington, and Melissa Haendel. Oregon Health and Science University; Lawrence Berkeley Labs. International Conference on Biomedical Ontology, July 8, 2013.
  2. Genotype-to-Phenotype Research. G2P research seeks a mechanistic understanding of how genetic variation is linked to organismal biology and disease. Example genotype/phenotype pairs: B6.Cg-Alms1foz/fox/J: increased weight, adipose tissue volume, glucose homeostasis altered. ALSM1(NM_015120.4) [c.10775delC] + [-]: obesity, diabetes mellitus, insulin resistance. kcnj11c14/c14; insrt143/+(AB): increased food intake, hyperglycemia, insulin resistance.
  3. Integrating G2P Data
  4. Integrating G2P Data: The Monarch Initiative. The Monarch Initiative aims to bring G2P and related data together under a common semantic framework to support integrated exploration and analysis.
  5. Integration Challenges. Technical challenges: terminological, syntactic, and organizational variation in data is common. Knowledge-based challenges: these reflect inherent complexity in the way G2P data is generated and what it represents. The challenges addressed: (I) reconciling G2P data annotated to different ‘levels’ of a genotype; (II) integrating ‘non-genomic’ forms of variation; (III) creating semantic links to biological data.
  6. Decomposition of a Genotype. Genotype: an information entity that specifies an entire genome sequence in terms of its variation from some reference genome. genotype = genomic variation complement + genomic background. Example: genotype apchu745/+; fgf8ati282/ti282(AB) has_part genomic variation complement apchu745/+; fgf8ati282/ti282 and genomic background (AB). The genomic variation complement has_part variant single locus complement (apchu745/+), which has_part variant locus (allele) (apchu745), which has_part sequence alteration (hu745). [Figure: aligned reference and variant sequences with the altered positions marked.]
  7. I. Reconciling Levels of G2P Association. Phenotypes annotated to a genotype, apchu745/+; fgf8ati282/ti282(AB): increased cell proliferation, disrupted digestive tract development, gut deformation. Phenotypes annotated to an allele, APC (NM_000038.5) c.937_938delGA: intestinal polyps, abnormal retinal pigmentation, sebaceous cysts. [Figure: sequence diagrams of the variant genome and the variant allele.]
  8. I. Reconciling Levels of G2P Association (Phenotype Propagation). Phenotypes annotated to the genotype apchu745/+; fgf8ati282/ti282(AB) (increased cell proliferation, disrupted digestive tract development, gut deformation) are inferred for allele: apchu745 and gene: apc, fgf8a. Phenotypes annotated to the allele APC (NM_000038.5) c.937_938delGA (intestinal polyps, abnormal retinal pigmentation, sebaceous cysts) are associated with allele: c.937_938delGA and gene: apc. [Figure: sequence diagrams of the variant genome and the variant allele.]
  9. Implementation of Phenotype Propagation. Property chains exploit the transitive genotype partonomy to infer phenotype associations. Atomic relations: [variant] is_variant_part_of genotype; genotype has_phenotype phenotype. Composed relation: is_variant_part_of o has_phenotype --> is_variant_with_phenotype.
  10. Example of Phenotype Propagation. (1) Monarch ingests phenotypes annotated to a genotype: genotype apchu745/+;fgf8ati282/ti282(AB) has_phenotype cell proliferation, digestive tract development, gut deformation.
  11. Example of Phenotype Propagation. (1) Monarch ingests phenotypes annotated to a genotype. (2) Genotype is parsed to create instances down the partonomy, linked by has_variant_part: genotype apchu745/+;fgf8ati282/ti282(AB); GVC apchu745/+;fgf8ati282/ti282; VSLCs apchu745/+, fgf8ati282/ti282; alleles apchu745, fgf8ati282; sequence alterations hu745, ti282; genes apc, fgf8a. The genotype has_phenotype cell proliferation, digestive tract development, gut deformation.
  12. Example of Phenotype Propagation. (1) Monarch ingests phenotypes annotated to a genotype. (2) Genotype is parsed to create instances down the partonomy. (3) Phenotype propagation infers associations between phenotypes and each level in the partonomy. The levels, linked by has_variant_part: genotype apchu745/+;fgf8ati282/ti282(AB); GVC apchu745/+;fgf8ati282/ti282; VSLCs apchu745/+, fgf8ati282/ti282; alleles apchu745, fgf8ati282; sequence alterations hu745, ti282; genes apc, fgf8a. The genotype has_phenotype cell proliferation, digestive tract development, gut deformation; each lower level is inferred to be is_variant_with_phenotype for the same phenotypes.
  13. II. Integrating Non-Genomic Variation. ‘Extrinsic genotypes’ describe sequences subject to transient variations in expression at the time of an experiment (example: morpholino-mediated gene knockdown). Representing extrinsic variation data in terms of the targeted genes facilitates integration with ‘intrinsic’ G2P data.
  14. III. Semantic Links to Related Data
  15. GENO in the OBO Foundry. GENO is modeled according to OBO Foundry principles, under the conceptual frameworks of the BFO, IAO, and SO. Collaborators in SO refactoring to enhance genetic variation representation and ensure integration of Monarch data with SO-annotated genomes.
  16. Summary and Future Directions. GENO in the Monarch Data Integration Pipeline: (1) raw data is ingested into the Monarch RDB; (2) views are generated that contain “GENO-enhanced” data (standardized syntax, unpacked genotypes, links to external data); (3) D2RQ maps relational data to GENO and generates RDF; (4) GENO-supported reasoning adds inferred G2P associations (e.g. phenotype propagation). Future Directions: (1) modeling of transgenes, human variation, and related data types; (2) develop property chains and algorithms to improve specificity and weighting of inferred G2P associations; (3) separate application features to provide a community model for public release and integration with SO.
  17. Acknowledgements. Monarch Initiative / NIF: OHSU (Melissa Haendel, Carlo Torniai, Shahim Essaid, Nicole Vasilevsky, Scott Hoffman); LBNL (Chris Mungall, Suzi Lewis, Nicole Washington); UCSD/NIF (Maryann Martone, Anita Bandrowski, Jeff Grethe, Amarnath Gupta, Trish Whetzel); University of Pittsburgh (Harry Hochheiser, Chuck Borromeo). Sequence Ontology: University of Utah (Karen Eilbeck); University of Colorado (Mike Bada). Funding: NIH # 1R24OD011883-01. OHSU Ontology Development Group: www.ohsu.edu/library/ontology. GENO ontology: purl.obolibrary.org/obo/geno.owl
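
The phenotype propagation on slides 9 to 12 can be illustrated with a short Python sketch. This is not the Monarch implementation, which the slides describe in terms of property chains and D2RQ-generated RDF; it only mimics the effect of the composed relation is_variant_part_of o has_phenotype --> is_variant_with_phenotype over a hand-built partonomy. The dictionary layout and the names `variant_parts`, `propagate`, `HAS_VARIANT_PART`, `HAS_PHENOTYPE`, and `INFERRED` are assumptions made for this example. Genes are left out because the slides do not name the relation that links an allele to its gene.

```python
from collections import defaultdict

# Example genotype string, copied as it appears in the slide transcript.
GENOTYPE = "apchu745/+; fgf8ati282/ti282(AB)"

# has_variant_part edges, hand-copied from the example on slides 10-12.
# Levels: genotype -> GVC -> VSLCs -> alleles -> sequence alterations.
HAS_VARIANT_PART = {
    GENOTYPE: ["apchu745/+; fgf8ati282/ti282"],
    "apchu745/+; fgf8ati282/ti282": ["apchu745/+", "fgf8ati282/ti282"],
    "apchu745/+": ["apchu745"],
    "fgf8ati282/ti282": ["fgf8ati282"],
    "apchu745": ["hu745"],
    "fgf8ati282": ["ti282"],
}

# Asserted annotations: phenotypes recorded at the genotype level only.
HAS_PHENOTYPE = {
    GENOTYPE: {
        "increased cell proliferation",
        "disrupted digestive tract development",
        "gut deformation",
    },
}


def variant_parts(entity, has_variant_part):
    """Return every direct or indirect variant part of `entity`."""
    found = set()
    stack = list(has_variant_part.get(entity, []))
    while stack:
        part = stack.pop()
        if part not in found:
            found.add(part)
            stack.extend(has_variant_part.get(part, []))
    return found


def propagate(has_phenotype, has_variant_part):
    """Infer is_variant_with_phenotype for each part of an annotated genotype."""
    inferred = defaultdict(set)
    for genotype, phenotypes in has_phenotype.items():
        for part in variant_parts(genotype, has_variant_part):
            inferred[part].update(phenotypes)
    return dict(inferred)


INFERRED = propagate(HAS_PHENOTYPE, HAS_VARIANT_PART)

if __name__ == "__main__":
    for part in sorted(INFERRED):
        print(part, "is_variant_with_phenotype", sorted(INFERRED[part]))
```

Because the partonomy is walked transitively, an annotation made once at the genotype level reaches the genomic variation complement, both variant single locus complements, both alleles, and both sequence alterations. The sketch gives every inferred association equal weight; slide 16 lists improving the specificity and weighting of inferred associations as future work.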
