Cross Product Extensions to the Gene Ontology

Chris Mungall
Gene Ontology Consortium
http://www.geneontology.org

Outline

•  What the Gene Ontology is used for
   –  GO structure
   –  Limitations of text definitions
•  Cross-product extensions to the GO
   –  Logical computable definitions
•  Results and Examples
   –  Chemical entities, proteins, cells
   –  Anatomy and development
   –  Relations
   –  Reasoning
•  Release Plan
•  Conclusions

A brief introduction to the GO

•  Nearing 11th birthday
•  3 ontologies, 28k classes
    –  Molecular Function (MF)
    –  Biological Process (BP)
    –  Cellular Component (CC)
•  Annotations
    –  42m statements assigning function or localization to genes across 187k species
•  Standard uses of GO annotation:
    –  Navigating and querying functional annotations for genes
    –  Discovery; term enrichment; semantic similarity
    –  >50 tools for performing high-throughput analysis using GO
•  Most uses require a simple, lightly axiomatized graph
    –  is_a
    –  part_of
    –  Definitions are textual

Problems and limitations

•  Maintenance and errors
   –  Combinatorial terms
   –  Tangled polyhierarchies
•  Denormalized
   –  Redundancy
   –  Lack of reuse

Solution: normalization + reasoning

•  Prior work
   –  Rector et al
   –  Hill et al
•  Retrospective normalization
   –  GO preceded OBO
•  How?
   –  GONG, Wroe et al
   –  Ogren et al
   –  Obol

[Figure: {metabolism, biosynthesis} x {sulfur amino acid, cysteine} = {sulfur amino acid metabolism, cysteine metabolism, sulfur amino acid biosynthesis, cysteine biosynthesis}]

Assigning logical definitions to GO classes

•  Logical definition structure
    –  An X is a G that D
        •  X : defined term
        •  G : genus (parent) term
        •  D : differentia(e), the discriminating relationships
    –  Necessary and sufficient conditions
    –  Computable definition should mirror text definition
•  Simple formalism, limited expressivity
    –  Equivalence axioms between named classes and positive conjunctions of a named class and one or more existential restrictions
        •  OBO principle of Positivity
    –  General template:
        •  EquivalentClasses(NamedClass intersectionOf(NamedGenus [someValuesFrom(NamedObjectProperty NamedDifferentiaClass)]+))
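
As a rough illustration of the general template, the Python sketch below renders a genus-differentia definition as an OWL 2 functional-syntax axiom. The `LogicalDefinition` class and `to_functional_syntax` function are hypothetical names introduced here, not part of any GO tooling, and the `:label` identifiers are placeholders rather than real IRIs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LogicalDefinition:
    """An 'X is a G that D' definition (hypothetical helper type)."""
    defined_class: str                          # X: the class being defined
    genus: str                                  # G: the named parent class
    differentiae: Tuple[Tuple[str, str], ...]   # D: (relation, filler class) pairs


def to_functional_syntax(d: LogicalDefinition) -> str:
    """Render the definition as an EquivalentClasses axiom."""
    if not d.differentiae:
        # The template requires one or more existential restrictions.
        raise ValueError("a logical definition needs at least one differentia")
    restrictions = " ".join(
        f"ObjectSomeValuesFrom({rel} {filler})" for rel, filler in d.differentiae
    )
    return (
        f"EquivalentClasses({d.defined_class} "
        f"ObjectIntersectionOf({d.genus} {restrictions}))"
    )


# Placeholder identifiers; a real axiom would use full IRIs or prefixed names.
example = LogicalDefinition(
    defined_class=":cysteine_biosynthesis",
    genus=":biosynthesis",
    differentiae=((":has_output", ":cysteine"),),
)
axiom = to_functional_syntax(example)
print(axiom)
```

Only named classes and existential restrictions appear in the output, which matches the limited expressivity described on this slide.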

Example: mitochondrial translation

•  ‘mitochondrial translation’ =def ‘translation’ that occurs_in ‘mitochondrion’
    –  (current relationships in GO are necessary conditions only)

OBO
          id: GO:0032543
          name: mitochondrial translation
          intersection_of: GO:0006412 ! translation
          intersection_of: occurs_in GO:0005739 ! mitochondrion

FOL
          X instance_of ‘mitochondrial translation’ <->
              X instance_of translation &
              exists C,t [ C instance_of mitochondrion at t & X occurs_in C at t ]

OWL (Manchester syntax)
          Class: ‘mitochondrial translation’
          EquivalentTo: translation AND occurs_in SOME mitochondrion
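
To show how the OBO and Manchester forms relate, here is a minimal Python sketch that reads the `intersection_of` tags from the stanza on this slide and prints a Manchester-style class expression. The function names `parse_logical_definition` and `to_manchester` are assumptions made for this example; a production parser would handle many more tags and edge cases.

```python
from typing import List, Optional, Tuple

# Stanza text copied from the slide.
STANZA = """\
id: GO:0032543
name: mitochondrial translation
intersection_of: GO:0006412 ! translation
intersection_of: occurs_in GO:0005739 ! mitochondrion
"""


def parse_logical_definition(stanza: str) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Return (genus_id, [(relation, filler_id), ...]) from intersection_of tags."""
    genus: Optional[str] = None
    differentiae: List[Tuple[str, str]] = []
    for line in stanza.splitlines():
        tag, _, value = line.partition(":")
        if tag.strip() != "intersection_of":
            continue
        # Drop the trailing "! label" comment, then split on whitespace.
        parts = value.split("!", 1)[0].split()
        if len(parts) == 1:
            genus = parts[0]                            # bare class ID: the genus
        elif len(parts) == 2:
            differentiae.append((parts[0], parts[1]))   # relation + filler
    return genus, differentiae


def to_manchester(genus: str, differentiae: List[Tuple[str, str]]) -> str:
    """Render the parsed definition as a Manchester-style class expression."""
    return " AND ".join([genus] + [f"{rel} SOME {filler}" for rel, filler in differentiae])


genus, diffs = parse_logical_definition(STANZA)
print(to_manchester(genus, diffs))
```

The sketch works on IDs rather than labels, so the printed expression uses GO:0006412 and GO:0005739 where the slide shows "translation" and "mitochondrion".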

Cross Product (XP) Sets

•  GO has ~28k classes
   –  Retrospective assignment of logical definitions is a lot of work
   –  Divide work according to ontologies directly used
•  Cross Product partitions
   –  X ∈ <O1 x O2 x .. x On>
       •  typically n=2
       •  Genus taken from O1
       •  Differentiae taken from O2..n
   –  Example: BP:cysteine_biosynthesis ∈ <BP x CHEBI>
       •  BP:biosynthesis that has_output CHEBI:cysteine
   –  Each XP set has one or more templates
       •  Obol grammars
   –  http://wiki.geneontology.org/index.php/Category:Cross_Products
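
The partitioning rule above can be sketched in a few lines of Python. This is an illustration only: it assumes each class is written with an ontology prefix in the style of the slide (`BP:`, `CC:`, `CHEBI:`, `CL:`), which is not how real GO IDs encode their sub-ontology, and the names `ontology_of`, `xp_set`, and `DEFINITIONS` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Differentia = Tuple[str, str]  # (relation, filler class)


def ontology_of(class_id: str) -> str:
    """Take the prefix before the first colon as the source ontology."""
    return class_id.split(":", 1)[0]


def xp_set(genus: str, differentiae: List[Differentia]) -> str:
    """Name the XP set: genus ontology first, then each distinct differentia ontology."""
    differentia_ontologies: List[str] = []
    for _relation, filler in differentiae:
        ont = ontology_of(filler)
        if ont not in differentia_ontologies:
            differentia_ontologies.append(ont)
    return " x ".join([ontology_of(genus)] + differentia_ontologies)


# Example definitions taken from the slides, written with slide-style prefixes.
DEFINITIONS: Dict[str, Tuple[str, List[Differentia]]] = {
    "BP:cysteine_biosynthesis": ("BP:biosynthesis", [("has_output", "CHEBI:cysteine")]),
    "BP:mitochondrial_translation": ("BP:translation", [("occurs_in", "CC:mitochondrion")]),
    "CC:neuron_projection": ("CC:cell_projection", [("part_of", "CL:neuron")]),
}

# Group the defined classes by the XP set they fall into.
partitions: Dict[str, List[str]] = defaultdict(list)
for defined_class, (genus, differentiae) in DEFINITIONS.items():
    partitions[xp_set(genus, differentiae)].append(defined_class)

for name, members in sorted(partitions.items()):
    print(name, members)
```

Grouping by XP set in this way is what allows the definition work, and later the reasoning, to be divided by referenced ontology.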

Results: Logical definitions per XP set

13k classes have provisional logical definitions (46% of classes)

                      Genus
                MF      BP      CC
    MF         103     241     148
    BP                4046      27
    CC                 634     289
    cell               541      25
    anatomy            692
    chemical  7278    3072
    protein             37
    quality              0
    sequence            66
    RNA                  0

GO Class                                  Logical Definition                                Genus ontology  Differentia ontology(s)

S phase of mitotic cell cycle             S phase and part_of mitosis                       BP              BP
mitochondrial translation                 translation and occurs_in mitochondrion           BP              CC
Oocyte differentiation                    cell differentiation and                          BP              CL
                                          results_in_acquisition_of_features_of oocyte
Neural plate formation                    anatomical structure formation and                BP              anatomy
                                          results_in_formation_of neural plate
Interleukin-1 biosynthesis                biosynthetic process and has_output interleukin-1 BP              PRO
L-cysteine catabolic process to taurine   catabolic process and has_input L-cysteine        BP              CHEBI
                                          and has_output taurine
group I intron catabolic process          catabolic process and has_input group I intron    BP              SO/RNAO

GO Class                                  Logical Definition                                Genus ontology  Differentia ontology(s)

histone deacetylase complex               protein complex and has_function                  CC              MF
                                          histone deacetylase activity
acrosomal membrane                        membrane and surrounds acrosome                   CC              CC
neuron projection                         cell projection and part_of neuron                CC              CL
virion transport vesicle                  transport vesicle and realizes vesicle transport  CC              BP
snoRNP binding                            binding and results_in_binding_of snoRNP          MF              CC
methionine synthase activity              catalytic activity and                            MF              CHEBI
                                          has_input 5-methyltetrahydrofolate and
                                          has_input L-homocysteine and
                                          has_output tetrahydrofolate and
                                          has_output L-methionine

Nested logical definitions

•  Multiple differentiae and nested descriptions allowed
    –  Only named classes used
    –  Spans XP sets

GO Class                                      Logical Definition                                            Genus ontology  Differentia ontology(s)

negative regulation of RNA metabolic process  biological process and has_participant RNA metabolic process  BP              BP
RNA metabolic process                         metabolic process and has_participant RNA                     BP              CHEBI

Development and anatomy

•  Neural plate formation = anatomical structure formation and results_in_formation_of neural plate
   –  GO annotations to Xenopus, zebrafish, mouse
•  Where is neural plate declared?
   –  Developmental structures not in scope of FMA
   –  Other choices:
      •  EHDAA - mouse (TS1-26)
      •  ZFA - zebrafish
      •  TAO - teleost
      •  XAO - Xenopus
   –  Gross anatomical ontologies are species-or-taxon-centric


Uberon: a multi-species anatomy ontology

•  GO contains an implicit anatomy ontology spanning multiple species
    –  GO:0007423 ! sensory organ development
        •  GO:0001654 ! eye development
             –  GO:0043010 ! camera-type eye development
             –  GO:0048749 ! compound eye development
•  Normalized to form Uberon
    –  Alignments with species-centric AOs
    –  3000 classes
    –  See Poster
•  Current XP partitioning:
    –  Uberon [most metazoa]
    –  PO [plants]
    –  Others
        •  Fungal anatomy ontology
        •  Dictyostelium anatomy ontology

[Figure: hierarchy of sensory organ development, eye development, compound eye development, and camera-type eye development]

Additional relations are required for full XP set

•  Core RO
   –  part_of, has_participant
•  Spatial relations (CC x {CC,CL})
   –  membranes, pores
   –  adjacent_to, surrounds, perforates
•  Participation relation subtypes
   –  has_input, has_output
   –  ‘macro’ defined relations
   –  E.g. results_in_transport_{of,to,from}

Reasoning

•  Reasoning used as part of ontology development cycle
    –  batch mode
    –  interactive in OBO-Edit2
    –  pre-reasoned: inferred relationships are asserted
•  Scalability
    –  GO + XPs + Referenced ontologies = 130k classes
    –  In-memory reasoners do not scale
    –  http://wiki.geneontology.org/index.php/OBO-Edit:Reasoner_Benchmarks
    –  Solutions:
        •  Segmentation by XP set
        •  CHEBI slim
        •  RDBMS based reasoning

Reasoner results

•  1000s of links fixed over a number of years
•  Inconsistencies internal to GO fixed immediately
   –  Fix hierarchy of defined class
   –  Fix hierarchy of referenced class
      •  abductive reasoning (Bada et al OWLED 2008)
   –  Fix logical definition
•  Inconsistencies external to GO take longer to be resolved
   –  CL
   –  CHEBI

BP x CHEBI example

carbohydrate transport =def transport and results_in_movement_of carbohydrate
nucleotide transport =def transport and results_in_movement_of nucleotide

[Figure: two is_a hierarchies. GO side: transport; nucleotide, nucleobase or nucleoside transport; carbohydrate transport; nucleotide transport. CHEBI side: carbohydrate; carbohydrate phosphates; nucleoside phosphates; nucleotides.]
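
The kind of inference this slide appears to illustrate can be sketched as follows: given the two logical definitions and an is_a chain in the referenced ontology, a reasoner places nucleotide transport under carbohydrate transport. The Python below is a simplified stand-in for a real reasoner. The CHEBI chain is read off the figure labels and is an assumption, and the rule requires identical genus and relation instead of full subsumption on those parts.

```python
from typing import Dict, List, Set, Tuple

# Assumed is_a chain in the referenced ontology, read off the figure labels.
CHEBI_IS_A: Dict[str, List[str]] = {
    "nucleotide": ["nucleoside phosphate"],
    "nucleoside phosphate": ["carbohydrate phosphate"],
    "carbohydrate phosphate": ["carbohydrate"],
}

# Logical definitions from the slide: class -> (genus, relation, filler).
DEFINITIONS: Dict[str, Tuple[str, str, str]] = {
    "carbohydrate transport": ("transport", "results_in_movement_of", "carbohydrate"),
    "nucleotide transport": ("transport", "results_in_movement_of", "nucleotide"),
}


def ancestors(term: str, is_a: Dict[str, List[str]]) -> Set[str]:
    """Reflexive-transitive closure of is_a starting from term."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_is_a(
    defs: Dict[str, Tuple[str, str, str]], is_a: Dict[str, List[str]]
) -> Set[Tuple[str, str]]:
    """Infer (child, parent) links when genus and relation match and the
    child's filler is subsumed by the parent's filler."""
    links: Set[Tuple[str, str]] = set()
    for child, (c_genus, c_rel, c_filler) in defs.items():
        for parent, (p_genus, p_rel, p_filler) in defs.items():
            if child == parent:
                continue
            if c_genus == p_genus and c_rel == p_rel and p_filler in ancestors(c_filler, is_a):
                links.add((child, parent))
    return links


links = inferred_is_a(DEFINITIONS, CHEBI_IS_A)
print(links)
```

Whether such an inferred link is wanted depends on the referenced ontology's hierarchy, which is why the fixes listed under the reasoner results include correcting the referenced class as well as the GO class.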

Release plan: basic and extended releases

•  GO is currently available in two versions
   –  gene_ontology: “standard”
      •  is_a, part_of, intra-ontology regulates
      •  intended for basic tools
   –  gene_ontology_ext: “extended”
      •  http://www.geneontology.org/GO.ontology-ext.relations.shtml
      •  standard + other relations and axioms
          –  disjoint_from
          –  has_part (Aug 1 2009)
•  XP sets currently available as separate bridge files
   –  http://wiki.geneontology.org/index.php/Category:Cross_Products
   –  will gradually migrate into gene_ontology_ext


Pre vs post composition

•  Compose class descriptions
    –  During ontology development cycle?
    –  At the time of annotation?
•  Logically equivalent…
    –  Given computable definitions, reasoners can determine equivalency
•  ...But very different from practical point of view
•  GO guidelines
    –  pre-compose classes for any type for which scientific generalizations can be made
        •  Yes: mitochondrial translation
        •  Yes: oocyte nucleus
        •  No: nucleus of epithelium of left ear
    –  Use post-composition to extend at annotation time
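
A minimal Python sketch of the equivalence point: if pre-composed classes carry computable definitions, a post-composed expression built at annotation time can be matched to an existing class. This example only checks for an identical genus and set of differentiae, whereas a real reasoner also handles subsumption. The names `expr`, `PRECOMPOSED`, and `find_equivalent` are hypothetical, and labels are used in place of IDs.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# (genus, {(relation, filler), ...})
Expression = Tuple[str, FrozenSet[Tuple[str, str]]]


def expr(genus: str, *differentiae: Tuple[str, str]) -> Expression:
    """Normalize an expression so that differentia order does not matter."""
    return (genus, frozenset(differentiae))


# Pre-composed classes indexed by their logical definition (examples from the slides).
PRECOMPOSED: Dict[Expression, str] = {
    expr("translation", ("occurs_in", "mitochondrion")): "mitochondrial translation",
    expr("catabolic process", ("has_input", "L-cysteine"), ("has_output", "taurine")):
        "L-cysteine catabolic process to taurine",
}


def find_equivalent(post_composed: Expression) -> Optional[str]:
    """Return the pre-composed class with an identical definition, if any."""
    return PRECOMPOSED.get(post_composed)


# A post-composed expression that matches an existing class.
hit = find_equivalent(
    expr("catabolic process", ("has_output", "taurine"), ("has_input", "L-cysteine"))
)
# A post-composed expression with no pre-composed counterpart.
miss = find_equivalent(expr("nucleus", ("part_of", "epithelium of left ear")))
print(hit, miss)
```

The second lookup returns nothing, which corresponds to the "No" case in the guidelines: the description stays post-composed and no new class is created.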



Related work: weaving the fabric of the OBO Foundry

•  Ontology for Biomedical Investigations (OBI)
•  Phenotype Ontologies
   –  Mammalian Phenotype
   –  Human Phenotype
   –  Worm Phenotype
   –  Plant trait
•  Environment ontology
•  FMA
•  Fly anatomy ontology
   –  Neuronal subtype and sense organ logical definitions using CHEBI and GO

Future applications of cross-product sets

•  Demonstrated utility as part of ontology development cycle
   –  How do we evaluate?
   –  but what about actual applications?
•  How can logical definitions (and additional axiomatisation in general) help:
   –  Search and discovery
   –  Visualization and presentation to users
   –  Curation
   –  Improve function prediction
   –  Database integration
       •  E.g. pathway databases
   –  Term enrichment
   –  Semantic similarity
•  Need to educate tool developers

Conclusions

•  Normalizing retrospectively is hard
   –  Prospective approach recommended
   –  But redundancy in effort from alternative perspective can yield valuable information
•  Many of the challenges are sociotechnological
   –  What if the referenced ontology
       •  does not yet exist?
       •  exists but is unfunded?
       •  is constructed according to different principles?
       •  is incomplete?
       •  ..or there is a choice of two competing ontologies?
   –  The OBO Foundry process is crucial
•  Grant challenge: more applications needed

Acknowledgments

•  GO Ontology Developers
    –  Midori Harris
    –  Jane Lomax
    –  Jen Deegan
    –  Amelia Ireland
    –  Tanya Berardini
    –  David Hill
•  Also
    –  Mike Bada
    –  Colin Batchelor
•  OBO
    –  Alan Ruttenberg
    –  Barry Smith
    –  Richard Scheuermann
•  OBO Ontology developers
    –  Alex Diehl (GO, Cell)
    –  Janna Hastings (CHEBI)
    –  Paula de Matos (CHEBI)
    –  David Osumi-Sutherland (Fly)
    –  Melissa Haendel (Zebrafish)
    –  Darren Natale (PRO)
    –  Karen Eilbeck (SO)
•  OBO-Edit
    –  Amina Abdulla
    –  Nomi Harris
    –  John Day-Richter
•  GO PIs
    –  Suzanna Lewis
    –  Mike Cherry
    –  Michael Ashburner
    –  Judith Blake


Cross Product Extensions to the Gene Ontology

  • 1. Cross
Product
Extensions
to
the
 Gene
Ontology
 Chris
Mungall
 Gene
Ontology
Consor8um
 h:p://www.geneontology.org

  • 2. Outline
 •  What
the
Gene
Ontology
is
used
for
 –  GO
structure
 –  Limita8ons
of
text
defini8ons
 •  Cross‐product
extensions
to
the
GO
 –  Logical
computable
defini8ons
 •  Results
and
Examples
 –  Chemical
en88es,
proteins,
cells
 –  Anatomy
and
development
 –  Rela8ons
 –  Reasoning
 •  Release
Plan
 •  Conclusions

  • 3. A
brief
introduc8on
to
the
GO
 •  Nearing
11th
birthday
 •  3
ontologies,
28k
classes
 –  Molecular
Func8on
(MF)
 –  Biological
Process
(BP)
 –  Cellular
Component
(CC)
 •  Annota8ons
 –  42m
statements
assigning
func8on
or
localiza8on
to
genes
across
187k
species

 •  Standard
uses
of
GO
annota8on:
 –  Naviga8ng
and
querying
func8onal
annota8ons
for
genes
 –  Discovery;
term
enrichment;
seman8c
similarity
 –  >50
tools
for
performing
hi‐throughput
analysis
using
GO
 •  Most
uses
require
a
simple,
lightly
axioma8zed
graph
 –  is_a
 –  part_of
 –  Defini8ons
are
textual

  • 4. Problems
and
limita8ons

 •  Maintenance
and
 errors
 – Combinatorial
terms
 – Tangled
 polyhierarchies
 •  Denormalized
 – Redundancy

 – lack
of
reuse

  • 5. Solu8on:
normaliza8on
+
reasoning
 •  Prior
work
 metabolism sulfur amino acid –  Rector
et
al
 x
 –  Hill
et
al
 biosynthesis cysteine •  Retrospec.ve
 sulfur amino normaliza8on
 acid metabolism –  GO
preceded
OBO
 •  How?
 cysteine sulfur amino acid –  GONG,
Wroe
et
al
 =
 metabolism biosynthesis –  Ogren
et
al
 –  Obol
 cysteine biosynthesis
  • 6. Assigning
logical
defini8ons
to
GO
 classes
 •  Logical
defini8on
structure
 –  An
X
is
a
G
that
D
 •  X
:
defined
term
 •  G
:
genus
(parent)
term
 •  D
:
differen8a(e)
–
discrimina8ng
rela8onships
 –  Necessary
and
sufficient
condi8ons
 –  Computable
defini6on
should
mirror
text
defini6on
 •  Simple
formalism,
limited
expressivity
 –  Equivalence
axioms
between
named
classes
and
posi8ve
conjunc8ons
 of
named
class
and
one
or
more
existen8al
restric8ons
 •  OBO
priniciple
of
Posi.vity
 –  General
template:
 •  EquivalentClasses(NamedClass
intersec8onOf(NamedGenus
 [someValuesFrom(NamedObjectProperty
NamedDifferen.aClass)]+))

  • 7. Example:
mitochondrial
transla8on
 •  ‘mitochondrial
transla8on’
=def
‘transla8on’
that
 occurs_in
‘mitochondrion’
 – (current
rela8onships
in
GO
are
necessary
condi8ons
 only)

 OBO
 id: GO:0032543 name: mitochondrial translation intersection_of: GO:0006412 ! translation intersection_of: occurs_in GO:0005739 ! mitochondrion FOL
 X
instance_of
‘mitochondrial
transla8on’
<‐>
 

X
instance_of
transla8on
&
 


exists
C,t
[
C
instance_of
mitochondrion
at
t
&
X
occurs_in
C
at
t
]
 OWL
 Class:
‘mitochondrial
transla8on’
 manchester
 EquivalentTo:
transla8on
AND
occurs_in
SOME
mitochondrion
 syntax

  • 8. Cross
Product
(XP)
Sets
 •  GO
has
~28k
classes
 –  Retrospec8ve
assignment
of
logical
defini8ons
is
a
lot
of
work
 –  Divide
work
according
to
ontologies
directly
used
 •  Cross
Product
par88ons
 –  X
 
<O1
x
O2
x
..
x
On
>

 •  typically
n=2
 •  Genus
taken
from
O1
 •  Differen8ae
taken
from
O2..n



 –  Example:
BP:cysteine_biosynthesis
 
<BP
x
CHEBI>
 •  BP:biosynthesis
that
has_output
CHEBI:cysteine

 –  Each
XP
set
has
one
or
more
templates
 •  Obol
grammars
 –  h:p://wiki.geneontology.org/index.php/Category:Cross_Products

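  • 9. Results: Logical definitions per XP set
 •  13k classes have provisional logical definitions (46% of classes)
 •  Number of logical definitions by differentia ontology (rows) and genus ontology (columns MF, BP, CC); counts are listed in slide order and empty cells are omitted
    –  MF: 103, 241, 148
    –  BP: 4046, 27
    –  CC: 634, 289
    –  cell: 541, 25
    –  anatomy: 692
    –  chemical: 7278, 3072
    –  protein: 37
    –  quality: 0
    –  sequence: 66
    –  RNA: 0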

  • 10. Examples of logical definitions (BP genus)
 GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
 S phase of mitotic cell cycle | S phase and part_of mitosis | BP | BP
 mitochondrial translation | translation and occurs_in mitochondrion | BP | CC
 Oocyte differentiation | cell differentiation and results_in_acquisition_of_features_of oocyte | BP | CL
 Neural plate formation | anatomical structure formation and results_in_formation_of neural plate | BP | anatomy
 Interleukin-1 biosynthesis | biosynthetic process and has_output interleukin-1 | BP | PRO
 L-cysteine catabolic process to taurine | catabolic process and has_input L-cysteine and has_output taurine | BP | CHEBI
 group I intron catabolic process | catabolic process and has_input group I intron | BP | SO/RNAO


  • 11. Examples of logical definitions (CC and MF genus)
 GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
 histone deacetylase complex | protein complex and has_function histone deacetylase activity | CC | MF
 acrosomal membrane | membrane and surrounds acrosome | CC | CC
 neuron projection | cell projection and part_of neuron | CC | CL
 virion transport vesicle | transport vesicle and realizes vesicle transport | CC | BP
 snoRNP binding | binding and results_in_binding_of snoRNP | MF | CC
 methionine synthase activity | catalytic activity and has_input 5-methyltetrahydrofolate and has_input L-homocysteine and has_output tetrahydrofolate and has_output L-methionine | MF | CHEBI

  • 12. Nested logical definitions
 •  Multiple differentiae and nested descriptions allowed
    –  Only named classes used
    –  Spans XP sets
 GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
 negative regulation of RNA metabolic process | biological process and has_participant RNA metabolic process | BP | BP
 RNA metabolic process | metabolic process and has_participant RNA | BP | CHEBI
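
Because only named classes are used, a nested description can be recovered by unfolding each named class that itself has a logical definition. The sketch below does this for the two rows on this slide. The `expand` function and the output notation are illustrative assumptions, and the sketch assumes the definitions contain no cycles.

```python
from typing import Dict, List, Tuple

Definition = Tuple[str, List[Tuple[str, str]]]  # (genus, [(relation, filler)])

# The two definitions from this slide, keyed by class name.
DEFINITIONS: Dict[str, Definition] = {
    "negative regulation of RNA metabolic process": (
        "biological process", [("has_participant", "RNA metabolic process")]
    ),
    "RNA metabolic process": ("metabolic process", [("has_participant", "RNA")]),
}


def expand(name: str, defs: Dict[str, Definition]) -> str:
    """Recursively unfold a named class into a nested class expression."""
    if name not in defs:
        return name                      # no logical definition: keep the name
    genus, differentiae = defs[name]
    parts = [expand(genus, defs)]
    parts += [f"{rel} some {expand(filler, defs)}" for rel, filler in differentiae]
    return "(" + " and ".join(parts) + ")"


expanded = expand("negative regulation of RNA metabolic process", DEFINITIONS)
print(expanded)
```

The unfolded expression crosses from the BP x BP set into the BP x CHEBI set, which is what "spans XP sets" refers to.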

  • 13. Development and anatomy
 •  Neural plate formation = anatomical structure formation and results_in_formation_of neural plate
    –  GO annotations to Xenopus, zebrafish, mouse
 •  Where is neural plate declared?
    –  Developmental structures not in scope of FMA
    –  Other choices:
       •  EHDAA - mouse (TS1-26)
       •  ZFA - zebrafish
       •  TAO - teleost
       •  XAO - Xenopus
    –  Gross anatomical ontologies are species- or taxon-centric


  • 14. Uberon: a multi-species anatomy ontology
 •  GO contains an implicit anatomy ontology spanning multiple species
    –  GO:0007423 ! sensory organ development
       •  GO:0001654 ! eye development
          –  GO:0043010 ! camera-type eye development
          –  GO:0048749 ! compound eye development
 •  Normalized to form Uberon
    –  Alignments with species-centric AOs
    –  3000 classes
    –  See Poster
 •  Current XP partitioning:
    –  Uberon [most metazoa]
    –  PO [plants]
    –  Others
       •  Fungal anatomy ontology
       •  Dictyostelium anatomy ontology
 [Figure: hierarchy of sensory organ development, eye development, compound eye development and camera-type eye development]

  • 15. Additional relations are required for full XP set
 •  Core RO
    –  part_of, has_participant
 •  Spatial relations (CC x {CC,CL})
    –  membranes, pores
    –  adjacent_to, surrounds, perforates
 •  Participation relation subtypes
    –  has_input, has_output
    –  ‘macro’ defined relations
    –  E.g. results_in_transport_{of,to,from}

  • 16. Reasoning
 •  Reasoning used as part of ontology development cycle
    –  batch mode
    –  interactive in OBO-Edit2
    –  pre-reasoned: inferred relationships are asserted
 •  Scalability
    –  GO + XPs + Referenced ontologies = 130k classes
    –  In-memory reasoners do not scale
    –  http://wiki.geneontology.org/index.php/OBO-Edit:Reasoner_Benchmarks
    –  Solutions:
       •  Segmentation by XP set
       •  CHEBI slim
       •  RDBMS-based reasoning
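
One way to picture segmentation by XP set is to group logical definitions by the ontologies they reference, so that each group can be reasoned over with only those ontologies loaded. The sketch below is an assumption about how such grouping might look, not the GO pipeline. It uses the namespace-prefixed names from the XP example (BP:, CHEBI:) plus hypothetical `BP:translation`, `BP:mitochondrial_translation` and `CC:mitochondrion` entries.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (defined class, genus, [(relation, filler)]) with namespace prefixes; toy data.
DEFS: List[Tuple[str, str, List[Tuple[str, str]]]] = [
    ("BP:cysteine_biosynthesis", "BP:biosynthesis", [("has_output", "CHEBI:cysteine")]),
    ("BP:mitochondrial_translation", "BP:translation", [("occurs_in", "CC:mitochondrion")]),
]


def prefix(class_id: str) -> str:
    """Return the ontology prefix of a prefixed class name."""
    return class_id.split(":", 1)[0]


def xp_set(genus: str, differentiae: List[Tuple[str, str]]) -> Tuple[str, ...]:
    """Genus ontology first, then the sorted differentia ontologies."""
    others = sorted({prefix(filler) for _, filler in differentiae})
    return (prefix(genus), *others)


def segment(defs: List[Tuple[str, str, List[Tuple[str, str]]]]) -> Dict[Tuple[str, ...], List[str]]:
    """Group defined classes by the XP set their definition belongs to."""
    segments: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for defined, genus, differentiae in defs:
        segments[xp_set(genus, differentiae)].append(defined)
    return dict(segments)


segments = segment(DEFS)
print(segments)
```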

  • 17. Reasoner results
 •  1000s of links fixed over a number of years
 •  inconsistencies internal to GO fixed immediately
    –  Fix hierarchy of defined class
    –  Fix hierarchy of referenced class
       •  abductive reasoning (Bada et al OWLED 2008)
    –  Fix logical definition
 •  inconsistencies external to GO take longer to be resolved
    –  CL
    –  CHEBI

  • 18. BP x CHEBI example
 •  carbohydrate transport =def transport and results_in_movement_of carbohydrate
 •  nucleotide transport =def transport and results_in_movement_of nucleotide
 [Figure: is_a graph over the GO classes transport, carbohydrate transport and nucleotide transport and the CHEBI classes carbohydrate, carbohydrate phosphates, nucleoside phosphates and nucleotides]
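
The kind of inference this example points at can be sketched as follows: if two defined classes share a genus and a relation, and the filler of one is a subclass of the filler of the other, an is_a link between the defined classes follows. The `IS_A` chain below is a toy assumption loosely based on the figure labels, not a copy of CHEBI, and the function names are hypothetical. The check covers only single-differentia definitions and ignores relation hierarchies.

```python
from typing import Dict, Set, Tuple

# Toy is_a links among referenced classes (child -> parents); assumed content.
IS_A: Dict[str, Set[str]] = {
    "nucleotide": {"nucleoside phosphate"},
    "nucleoside phosphate": {"carbohydrate phosphate"},
    "carbohydrate phosphate": {"carbohydrate"},
}

# Logical definitions from this slide: name -> (genus, relation, filler).
DEFS: Dict[str, Tuple[str, str, str]] = {
    "carbohydrate transport": ("transport", "results_in_movement_of", "carbohydrate"),
    "nucleotide transport": ("transport", "results_in_movement_of", "nucleotide"),
}


def ancestors(cls: str, is_a: Dict[str, Set[str]]) -> Set[str]:
    """Reflexive-transitive closure of is_a starting from cls."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in is_a.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_is_a(defs: Dict[str, Tuple[str, str, str]],
                  is_a: Dict[str, Set[str]]) -> Set[Tuple[str, str]]:
    """Return (child, parent) pairs entailed by the definitions and is_a links."""
    links: Set[Tuple[str, str]] = set()
    for child, (c_genus, c_rel, c_filler) in defs.items():
        for parent, (p_genus, p_rel, p_filler) in defs.items():
            if child == parent:
                continue
            if (p_genus in ancestors(c_genus, is_a)
                    and c_rel == p_rel
                    and p_filler in ancestors(c_filler, is_a)):
                links.add((child, parent))
    return links


links = inferred_is_a(DEFS, IS_A)
print(links)
```

Whether an inferred link like this is accepted or instead exposes a problem in the referenced hierarchy is a curation decision. The sketch only surfaces the candidate.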

  • 19. Release plan: basic and extended releases
 •  GO is currently available in two versions
    –  gene_ontology: “standard”
       •  is_a, part_of, intra-ontology regulates
       •  intended for basic tools
    –  gene_ontology_ext: “extended”
       •  http://www.geneontology.org/GO.ontology-ext.relations.shtml
       •  standard + other relations and axioms
          –  disjoint_from
          –  has_part (Aug 1 2009)
 •  XP sets currently available as separate bridge files
    –  http://wiki.geneontology.org/index.php/Category:Cross_Products
    –  will gradually migrate into gene_ontology_ext


  • 20. Pre vs post composition
 •  Compose class descriptions
    –  During ontology development cycle?
    –  At the time of annotation?
 •  Logically equivalent…
    –  Given computable definitions, reasoners can determine equivalency
 •  …but very different from a practical point of view
 •  GO guidelines
    –  pre-compose classes for any type for which scientific generalizations can be made
       •  Yes: mitochondrial translation
       •  Yes: oocyte nucleus
       •  No: nucleus of epithelium of left ear
    –  Use post-composition to extend at annotation time
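
The equivalence between pre- and post-composed descriptions can be pictured with a lookup from a normalized expression to a pre-composed class. This is only a syntactic stand-in for what a reasoner would determine. The `expr` and `find_precomposed` helpers are hypothetical, and the definition given for oocyte nucleus (nucleus and part_of oocyte) is an assumption that does not come from the slides.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Expression = Tuple[str, FrozenSet[Tuple[str, str]]]  # (genus, {(relation, filler)})


def expr(genus: str, *differentiae: Tuple[str, str]) -> Expression:
    """Build an order-independent genus-differentia expression."""
    return (genus, frozenset(differentiae))


# Pre-composed classes indexed by their logical definition.
# The oocyte nucleus entry is an assumed definition for illustration.
PRECOMPOSED: Dict[Expression, str] = {
    expr("translation", ("occurs_in", "mitochondrion")): "mitochondrial translation",
    expr("nucleus", ("part_of", "oocyte")): "oocyte nucleus",
}


def find_precomposed(expression: Expression) -> Optional[str]:
    """Return the pre-composed class with the same definition, if any."""
    return PRECOMPOSED.get(expression)


# A curator post-composes two descriptions at annotation time.
print(find_precomposed(expr("translation", ("occurs_in", "mitochondrion"))))
print(find_precomposed(expr("nucleus", ("part_of", "epithelium of left ear"))))
```

The first expression resolves to an existing class, so the annotation can use the pre-composed term. The second has no pre-composed counterpart and stays as a post-composed description.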



  • 21. Related work: weaving the fabric of the OBO Foundry
 •  Ontology for Biomedical Investigations (OBI)
 •  Phenotype Ontologies
    –  Mammalian Phenotype
    –  Human Phenotype
    –  Worm Phenotype
    –  Plant trait
 •  Environment ontology
 •  FMA
 •  Fly anatomy ontology
    –  Neuronal subtype and sense organ logical definitions using CHEBI and GO

  • 22. Future applications of cross-product sets
 •  Demonstrated utility as part of ontology development cycle
    –  How do we evaluate?
    –  but what about actual applications?
 •  How can logical definitions (and additional axiomatisation in general) help:
    –  Search and discovery
    –  Visualization and presentation to users
    –  Curation
    –  Improve function prediction
    –  Database integration
       •  E.g. pathway databases
    –  Term enrichment
    –  Semantic similarity
 •  Need to educate tool developers

  • 23. Conclusions
 •  Normalizing retrospectively is hard
    –  Prospective approach recommended
    –  But redundancy in effort from alternative perspective can yield valuable information
 •  Many of the challenges are sociotechnological
    –  What if the referenced ontology
       •  does not yet exist?
       •  exists but is unfunded?
       •  is constructed according to different principles?
       •  is incomplete?
       •  ..or there is a choice of two competing ontologies?
    –  The OBO Foundry process is crucial
 •  Grand challenge: more applications needed

  • 24. Acknowledgments
 •  GO Ontology Developers
    –  Midori Harris
    –  Jane Lomax
    –  Jen Deegan
    –  Amelia Ireland
    –  Tanya Berardini
    –  David Hill
 •  OBO Ontology developers
    –  Alex Diehl (GO, Cell)
    –  Janna Hastings (CHEBI)
    –  Paula de Matos (CHEBI)
    –  David Osumi-Sutherland (Fly)
    –  Melissa Haendel (Zebrafish)
    –  Darren Natale (PRO)
    –  Karen Eilbeck (SO)
 •  Also
    –  Mike Bada
    –  Colin Batchelor
 •  OBO-Edit
    –  Amina Abdulla
    –  Nomi Harris
    –  John Day-Richter
 •  OBO
    –  Alan Ruttenberg
    –  Barry Smith
    –  Richard Scheuermann
 •  GO PIs
    –  Suzanna Lewis
    –  Mike Cherry
    –  Michael Ashburner
    –  Judith Blake