Phenotype rcn so-geno_workshop(shared)

Published in: Education, Technology

  1. SO-Monarch Workshop: Toward the Integration of Genetic Variation Modeling
     Sept 16-18, 2013, Oregon Health and Science University, Portland, OR
     Funded by the Phenotype RCN
     Monarch Initiative
     Matthew Brush, Melissa Haendel, Chris Mungall, Mike Bada, Karen Eilbeck, Bret Heale
     A rough transcript of audio from this presentation can be found here: https://docs.google.com/document/d/13oifUZeWxK5hXPlMW6pl3BXoxr6xTossUT4_Fl2cgg/edit
  2. The Sequence Ontology
     • An OBO Library ontology developed to standardize the vocabulary and semantics of biological sequence annotation
     • Use has expanded from genome annotation into new applications: variation description, text mining, experimental data annotation
     • Undergoing significant refactoring to meet new needs:
       - align with BFO
       - enhance variation representation
       - develop parallel representation of physical sequence entities: SO vs MSO (molecular sequence ontology)
       - improve explicit representations across central dogma
  3. The Monarch Initiative
     • The Monarch Initiative aims to bring genotype-to-phenotype (G2P) and related data together under a common semantic framework, and to develop tools and services for user-guided exploration and analysis
     • Data integration and application functionality are driven by a suite of ontologies that includes many community resources as well as new ontologies (GENO)
  4. GENO: A Genotype Ontology
     • The core use case of GENO for Monarch is to support aggregation and semantic integration of genotype data and its link to phenotypes across these diverse sources.
     • GENO is an ontology of 'genotype' sequence information that describes types and scales of genetic variations associated with phenotypes, and places these variations in a broader biological context.
     • Genotype information in GENO is viewed broadly to include any variation in gene expression that is tied to an observed phenotypic effect. We distinguish two types of variation:
     [Figure: SEQUENCE-VARIATION. Sequence AGACTACTACGTAGGTCCTCC, translation Arg-Leu-Leu-Arg-Stop, environment, PHENOTYPE ‘short fin’]
  5. GENO: A Genotype Ontology
     We distinguish two types of variation:
     [Figure: EXPRESSION-VARIATION. Morpholinos block (X) expression of sequence AGACTACTACGTACGTCCTCC, translation Arg-Leu-Leu-Arg-Thr-Ser-Ser, environment, PHENOTYPE ‘short fin’]
  6. GENO: A Genotype Ontology
     We distinguish two types of variation:
     • SEQUENCE-VARIATION: ‘Intrinsic’ Genotype, e.g. apchu745/+; fgf8ti282/ti282(AB)
     • EXPRESSION-VARIATION: ‘Extrinsic’ Genotype, e.g. shhbMO1-shhb(2ng); ihhbMO2-ihhb(1ng)
     Together these artifacts can capture the complement of all genetic variation in an organism in terms of the loci that are altered in their sequence or their expression level
  7. Decomposition of an ‘Intrinsic’ Genotype
     Intrinsic Genotype: specifies a sequence variation across an entire genome in terms of its differences from some reference genome
     [Figure: genotype = genomic background + genomic variation complement; apchu745/+; fgf8ti282/ti282(AB) = (AB) + apchu745/+; fgf8ati282/ti282. Sequence panels: AACGTAGCGACGCTCGCTACGGGCGTATC, AACGTACCGACGCTCGCTACGGGCGTATC (CGTAGC vs CGTACC), GCGAAGTGCCAACTTCTACACACACAAAG (ACAC)]
     genomic variation complement has_part variant single locus complement has_part variant locus (aka allele) has_part sequence alteration
     Example: apchu745/+; fgf8ati282/ti282 has_part apchu745/+ has_part apchu745 has_part hu745
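The has_part chain on slide 7 can be illustrated with a small Python sketch. This is not GENO tooling: the `Node` and `IntrinsicGenotype` classes, the `decompose` function, and the `known_alterations` argument are hypothetical names introduced here. The sketch assumes the flattened label format seen on the slides (loci separated by `;`, alleles by `/`, background in trailing parentheses) and assumes the caller supplies the sequence alteration names, since the gene/alteration boundary cannot be recovered from the flattened label alone.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One level of the has_part hierarchy (hypothetical helper type)."""
    kind: str
    label: str
    parts: List["Node"] = field(default_factory=list)


@dataclass
class IntrinsicGenotype:
    label: str
    background: str   # genomic background, e.g. "AB"
    complement: Node  # genomic variation complement


def decompose(genotype: str, known_alterations: List[str]) -> IntrinsicGenotype:
    # Assumption: the background is the trailing "(...)" of the label.
    match = re.fullmatch(r"\s*(.*?)\s*\((\w+)\)\s*", genotype)
    if match is None:
        raise ValueError(f"no '(background)' suffix found in {genotype!r}")
    complement_label, background = match.groups()
    complement = Node("genomic variation complement", complement_label)
    for locus_label in (s.strip() for s in complement_label.split(";")):
        if not locus_label:
            continue
        # Assumption: the first allele of each locus carries the full name.
        allele_label = locus_label.split("/")[0]
        allele = Node("variant locus (allele)", allele_label)
        for alteration in known_alterations:
            if allele_label.endswith(alteration):
                allele.parts.append(Node("sequence alteration", alteration))
        locus = Node("variant single locus complement", locus_label, [allele])
        complement.parts.append(locus)
    return IntrinsicGenotype(genotype, background, complement)


def render(node: Node, depth: int = 0) -> List[str]:
    """Indent each has_part level by two spaces."""
    lines = [f"{'  ' * depth}{node.kind}: {node.label}"]
    for part in node.parts:
        lines.extend(render(part, depth + 1))
    return lines


genotype = decompose("apchu745/+; fgf8ti282/ti282(AB)", ["hu745", "ti282"])
print("genomic background:", genotype.background)
print("\n".join(render(genotype.complement)))
```

The input string is the slide 6 example; slide 7 writes the second locus as fgf8ati282/ti282, which the sketch would handle the same way.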
  8. Decomposition of an ‘Extrinsic’ Genotype
     • ‘Extrinsic genotypes’ describe sequences subject to transient variation in expression at the time of an experiment
     • Example: morpholino-mediated gene knockdown
     • An extrinsic genotype comprises all genes in the organism whose expression is variant as a result of some experimental intervention
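As a rough illustration, an extrinsic genotype can be held as a collection of expression-variant genes, each tied to the reagent that perturbs it. The sketch below is not part of GENO: `ExpressionVariant`, `PATTERN`, and `parse_extrinsic` are hypothetical names, and the parsing rests on an assumed reading of the flattened slide 6 label, namely that `shhbMO1-shhb(2ng)` means gene shhb knocked down by morpholino MO1-shhb at a dose of 2 ng.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExpressionVariant:
    """A gene whose expression is altered by a reagent (hypothetical type)."""
    gene: str
    reagent: str
    dose: float
    unit: str


# Assumed label shape: <gene><MO-number>-<gene>(<dose><unit>)
PATTERN = re.compile(
    r"(?P<gene>\w+?)(?P<reagent>MO\d+-(?P=gene))"
    r"\((?P<dose>\d+(?:\.\d+)?)(?P<unit>[A-Za-z]+)\)"
)


def parse_extrinsic(label: str) -> List[ExpressionVariant]:
    variants = []
    for token in (t.strip() for t in label.split(";")):
        if not token:
            continue
        match = PATTERN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized extrinsic component: {token!r}")
        variants.append(
            ExpressionVariant(
                gene=match["gene"],
                reagent=match["reagent"],
                dose=float(match["dose"]),
                unit=match["unit"],
            )
        )
    return variants


variants = parse_extrinsic("shhbMO1-shhb(2ng); ihhbMO2-ihhb(1ng)")
for variant in variants:
    print(variant.gene, "knocked down by", variant.reagent, variant.dose, variant.unit)
```

Keeping the reagent and dose next to the gene reflects the slide's point that the variation is transient and tied to the experimental intervention rather than to the genome sequence.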
  9. ‘Effective’ Genotypes
  10. Workshop Motivation
     • GENO and SO were each developed to cover different perspectives on the domain of abstract biological sequence information.
     • GENO is new and developing. SO is mature but undergoing major refactoring.
     • The primary workshop goal was to ensure the models are interoperable, to allow integration of data described using SO and GENO.
     • Our work fell into three categories:
       - Ontology
       - Community
       - Logistics/Planning
  11. Workshop Goals and Outcomes: Ontology
     1. Ontological debate: align high-level ontological modeling of sequence features and of intrinsic and extrinsic variation
     2. Core terminological standardization: establish clear definitions and usage of core domain terms (gene, allele, variant, etc.) (in progress)
     3. SO vs MSO: developed strategy for parallel SO and MSO development and maintenance (in progress)
     4. Conceptual integration of SO and GENO: intrinsic genotype modeling is in scope of SO, but extrinsic modeling is not and will live exclusively in GENO
  12. Workshop Goals and Outcomes: Community
     1. Gene representation: strategy to provide an ontological representation and identifiers for genes and their variants as a community resource for diverse applications
     2. Modeling the central dogma: build from gene representation to describe relations to sequences at RNA and protein levels, and properties that emerge here
     3. Phenotype annotation practices: develop a standard for use of phenotype ontologies for GVF file annotation
        • http://www.sequenceontology.org/so_wiki/index.php/Using_Phenotype_Ontologies_in_GVF
     4. GENO as a community ontology: plan for separating Monarch-specific features from a more generally useful community model
  13. Workshop Goals and Outcomes: Planning and Logistics
     • Technical Integration Plan: decide how GENO and SO will interact at the technical level (namespaces, imports, mappings, etc.)
     • Collaborative Development Plan: establish a framework of tools and practices for parallel development of SO and Monarch ontologies
     • Data and Use Case Plan: collect use cases and build test data sets from the community to inform and test our modeling
     • Continued Working Group: weekly Tuesday afternoon calls, open to the community
  14. Thank You
     Thanks to the Phenotype RCN for their support!
     Details about workshop outcomes can be found here: https://docs.google.com/document/d/1AUEVX0Sx_iy9mTI6F59Yo7ZCXu4zv5uSk28AHid5zhc/edit#
     Questions?
