Cross Product Extensions to the Gene Ontology


Cross Product Extensions to the Gene Ontology - Presentation Transcript

  1. Cross Product Extensions to the Gene Ontology
 Chris Mungall
 Gene Ontology Consortium
 http://www.geneontology.org

  2. Outline
 •  What the Gene Ontology is used for
    –  GO structure
    –  Limitations of text definitions
 •  Cross-product extensions to the GO
    –  Logical computable definitions
 •  Results and Examples
    –  Chemical entities, proteins, cells
    –  Anatomy and development
    –  Relations
    –  Reasoning
 •  Release Plan
 •  Conclusions

  3. A brief introduction to the GO
 •  Nearing 11th birthday
 •  3 ontologies, 28k classes
    –  Molecular Function (MF)
    –  Biological Process (BP)
    –  Cellular Component (CC)
 •  Annotations
    –  42m statements assigning function or localization to genes across 187k species
 •  Standard uses of GO annotation:
    –  Navigating and querying functional annotations for genes
    –  Discovery; term enrichment; semantic similarity
    –  >50 tools for performing high-throughput analysis using GO
 •  Most uses require a simple, lightly axiomatized graph
    –  is_a
    –  part_of
    –  Definitions are textual

  4. Problems and limitations
 •  Maintenance and errors
    –  Combinatorial terms
    –  Tangled polyhierarchies
 •  Denormalized
    –  Redundancy
    –  Lack of reuse

  5. Solution: normalization + reasoning
 •  Prior work
    –  Rector et al
    –  Hill et al
 •  Retrospective normalization
    –  GO preceded OBO
 •  How?
    –  GONG, Wroe et al
    –  Ogren et al
    –  Obol
 [Figure: a process hierarchy (metabolism, biosynthesis) x a chemical hierarchy (sulfur amino acid, cysteine) = composed classes (sulfur amino acid metabolism, cysteine biosynthesis)]

  6. Assigning logical definitions to GO classes
 •  Logical definition structure
    –  An X is a G that D
       •  X : defined term
       •  G : genus (parent) term
       •  D : differentia(e), the discriminating relationships
    –  Necessary and sufficient conditions
    –  Computable definition should mirror text definition
 •  Simple formalism, limited expressivity
    –  Equivalence axioms between named classes and positive conjunctions of a named class and one or more existential restrictions
       •  OBO principle of Positivity
    –  General template:
       •  EquivalentClasses(NamedClass intersectionOf(NamedGenus [someValuesFrom(NamedObjectProperty NamedDifferentiaClass)]+))
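
The genus-differentia structure above can be sketched as a small data type. This is a minimal illustration rather than any GO tool's actual data model: the class name `LogicalDefinition` and its methods are hypothetical, and the identifiers are the informal ones used elsewhere in this talk (BP:biosynthesis, CHEBI:cysteine).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LogicalDefinition:
    """An X is a G that D: one genus plus one or more differentiae."""
    defined: str                               # X: the defined term
    genus: str                                 # G: the genus (parent) term
    differentiae: Tuple[Tuple[str, str], ...]  # D: (relation, filler class) pairs

    def __post_init__(self):
        # The template ends in "]+", i.e. at least one existential restriction.
        if not self.differentiae:
            raise ValueError("the template requires at least one differentia")

    def to_text(self) -> str:
        """Render in the 'X =def G that r1 C1 and r2 C2' reading."""
        parts = " and ".join(f"{rel} {cls}" for rel, cls in self.differentiae)
        return f"{self.defined} =def {self.genus} that {parts}"

    def to_template(self) -> str:
        """Render following the general template from the slide."""
        restrictions = " ".join(
            f"someValuesFrom({rel} {cls})" for rel, cls in self.differentiae
        )
        return (f"EquivalentClasses({self.defined} "
                f"intersectionOf({self.genus} {restrictions}))")


example = LogicalDefinition(
    defined="BP:cysteine_biosynthesis",
    genus="BP:biosynthesis",
    differentiae=(("has_output", "CHEBI:cysteine"),),
)
print(example.to_text())
print(example.to_template())
```

Because only named classes and existential restrictions are allowed, a flat tuple of (relation, class) pairs is enough; negation or disjunction would need a richer structure.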

  7. Example: mitochondrial translation
 •  ‘mitochondrial translation’ =def ‘translation’ that occurs_in ‘mitochondrion’
    –  (current relationships in GO are necessary conditions only)

 OBO
    id: GO:0032543
    name: mitochondrial translation
    intersection_of: GO:0006412 ! translation
    intersection_of: occurs_in GO:0005739 ! mitochondrion

 FOL
    X instance_of ‘mitochondrial translation’ <->
       X instance_of translation &
       exists C,t [ C instance_of mitochondrion at t & X occurs_in C at t ]

 OWL (Manchester syntax)
    Class: ‘mitochondrial translation’
       EquivalentTo: translation AND occurs_in SOME mitochondrion
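
As a rough illustration of how the OBO form of this definition can be read programmatically, the sketch below pulls the genus and differentiae out of the stanza shown on this slide. The function name `parse_intersection_of` is hypothetical, and the parsing is deliberately naive: it assumes one tag per line and ignores escaping, qualifiers and other features of the real OBO format.

```python
STANZA = """\
id: GO:0032543
name: mitochondrial translation
intersection_of: GO:0006412 ! translation
intersection_of: occurs_in GO:0005739 ! mitochondrion
"""


def parse_intersection_of(stanza: str):
    """Split an OBO stanza into (id, genus, [(relation, filler), ...])."""
    term_id, genus, differentiae = None, None, []
    for raw in stanza.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop the trailing '! label' comment
        if not line or ":" not in line:
            continue
        tag, value = (part.strip() for part in line.split(":", 1))
        if tag == "id":
            term_id = value
        elif tag == "intersection_of":
            fields = value.split()
            if len(fields) == 1:      # a bare class ID is the genus
                genus = fields[0]
            elif len(fields) == 2:    # 'relation class' is a differentia
                differentiae.append((fields[0], fields[1]))
    return term_id, genus, differentiae


parsed = parse_intersection_of(STANZA)
print(parsed)
```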

  8. Cross Product (XP) Sets
 •  GO has ~28k classes
    –  Retrospective assignment of logical definitions is a lot of work
    –  Divide work according to ontologies directly used
 •  Cross Product partitions
    –  X ∈ <O1 x O2 x .. x On>
       •  typically n=2
       •  Genus taken from O1
       •  Differentiae taken from O2..n
    –  Example: BP:cysteine_biosynthesis ∈ <BP x CHEBI>
       •  BP:biosynthesis that has_output CHEBI:cysteine
    –  Each XP set has one or more templates
       •  Obol grammars
    –  http://wiki.geneontology.org/index.php/Category:Cross_Products
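
The partitioning rule (genus from O1, differentiae from O2..n) can be sketched as a function that assigns a logical definition to its XP set. Everything named here is an assumption made for illustration: `GO_NAMESPACE` is a hypothetical lookup table, and `ontology_of` and `xp_set` are not part of any GO software.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical namespace lookup: real GO IDs all share the "GO" prefix, so the
# sub-ontology (BP, MF or CC) has to come from each term's namespace.
GO_NAMESPACE: Dict[str, str] = {
    "GO:0006412": "BP",  # translation
    "GO:0005739": "CC",  # mitochondrion
}


def ontology_of(class_id: str, go_namespace: Dict[str, str]) -> str:
    """Return the ontology a class belongs to, based on its ID."""
    prefix = class_id.split(":", 1)[0]
    return go_namespace[class_id] if prefix == "GO" else prefix


def xp_set(genus: str, differentia_classes: Iterable[str],
           go_namespace: Dict[str, str]) -> Tuple[str, ...]:
    """O1 is the genus ontology; O2..n are the differentia ontologies."""
    genus_ontology = ontology_of(genus, go_namespace)
    others = sorted({ontology_of(c, go_namespace) for c in differentia_classes})
    return (genus_ontology, *others)


# mitochondrial translation: translation that occurs_in mitochondrion
print(xp_set("GO:0006412", ["GO:0005739"], GO_NAMESPACE))
# cysteine biosynthesis, using the informal IDs from the slide
print(xp_set("BP:biosynthesis", ["CHEBI:cysteine"], GO_NAMESPACE))
```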

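  9. Results: Logical definitions per XP set
 •  13k classes have provisional logical definitions (46% of classes)

 Table: number of logical definitions by differentia ontology (rows) and genus ontology (columns MF, BP, CC). Column positions are not recoverable for rows with fewer than three counts, so counts are listed in their original order.

 Differentia ontology | Counts
 MF | 103, 241, 148
 BP | 4046, 27
 CC | 634, 289
 cell | 541, 25
 anatomy | 692
 chemical | 7278, 3072
 protein | 37
 quality | 0
 sequence | 66
 RNA | 0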

  10. Example logical definitions (BP genus)

 GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
 S phase of mitotic cell cycle | S phase and part_of mitosis | BP | BP
 mitochondrial translation | translation and occurs_in mitochondrion | BP | CC
 Oocyte differentiation | cell differentiation and results_in_acquisition_of_features_of oocyte | BP | CL
 Neural plate formation | anatomical structure formation and results_in_formation_of neural plate | BP | anatomy
 Interleukin-1 biosynthesis | biosynthetic process and has_output interleukin-1 | BP | PRO
 L-cysteine catabolic process to taurine | catabolic process and has_input L-cysteine and has_output taurine | BP | CHEBI
 group I intron catabolic process | catabolic process and has_input group I intron | BP | SO/RNAO

  11. Example logical definitions (CC and MF genus)

 GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
 histone deacetylase complex | protein complex and has_function histone deacetylase activity | CC | MF
 acrosomal membrane | membrane and surrounds acrosome | CC | CC
 neuron projection | cell projection and part_of neuron | CC | CL
 virion transport vesicle | transport vesicle and realizes vesicle transport | CC | BP
 snoRNP binding | binding and results_in_binding_of snoRNP | MF | CC
 methionine synthase activity | catalytic activity and has_input 5-methyltetrahydrofolate and has_input L-homocysteine and has_output tetrahydrofolate and has_output L-methionine | MF | CHEBI

  12. Nested logical definitions
 •  Multiple differentiae and nested descriptions allowed
    –  Only named classes used
    –  Spans XP sets

 GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
 negative regulation of RNA metabolic process | biological process and has_participant RNA metabolic process | BP | BP
 RNA metabolic process | metabolic process and has_participant RNA | BP | CHEBI

  13. Development and anatomy
 •  Neural plate formation = anatomical structure formation and results_in_formation_of neural plate
    –  GO annotations to Xenopus, zebrafish, mouse
 •  Where is neural plate declared?
    –  Developmental structures not in scope of FMA
    –  Other choices:
       •  EHDAA – mouse (TS1-26)
       •  ZFA - zebrafish
       •  TAO - teleost
       •  XAO - Xenopus
    –  Gross anatomical ontologies are species-or-taxon-centric


  14. Uberon: a multi-species anatomy ontology
 •  GO contains an implicit anatomy ontology spanning multiple species
    –  GO:0007423 ! sensory organ development
       •  GO:0001654 ! eye development
          –  GO:0043010 ! camera-type eye development
          –  GO:0048749 ! compound eye development
 •  Normalized to form Uberon
    –  Alignments with species-centric AOs
    –  3000 classes
    –  See Poster
 •  Current XP partitioning:
    –  Uberon [most metazoa]
    –  PO [plants]
    –  Others
       •  Fungal anatomy ontology
       •  Dictyostelium anatomy ontology
 [Figure: hierarchy of sensory organ development, eye development, compound eye development and camera-type eye development]

  15. Additional relations are required for full XP set
 •  Core RO
    –  part_of, has_participant
 •  Spatial relations (CC x {CC,CL})
    –  membranes, pores
    –  adjacent_to, surrounds, perforates
 •  Participation relation subtypes
    –  has_input, has_output
    –  ‘macro’ defined relations
    –  E.g. results_in_transport_{of,to,from}

  16. Reasoning
 •  Reasoning used as part of ontology development cycle
    –  batch mode
    –  interactive in OBO-Edit2
    –  pre-reasoned: inferred relationships are asserted
 •  Scalability
    –  GO + XPs + Referenced ontologies = 130k classes
    –  In-memory reasoners do not scale
    –  http://wiki.geneontology.org/index.php/OBO-Edit:Reasoner_Benchmarks
    –  Solutions:
       •  Segmentation by XP set
       •  CHEBI slim
       •  RDBMS based reasoning
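
To make the role of the reasoner concrete, here is a toy structural-subsumption check over genus-differentia definitions. It is a sketch under stated assumptions, not how OBO-Edit or any OWL reasoner is implemented: the function names are hypothetical, relation hierarchies are ignored, the definition of "sulfur amino acid biosynthesis" is invented for illustration, and the single is_a link stands in for a referenced chemical ontology.

```python
def ancestors(cls, is_a):
    """Reflexive-transitive closure of the asserted is_a parents of cls."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumed_by(child_def, parent_def, is_a):
    """True if every condition of parent_def is met by child_def."""
    child_genus, child_diffs = child_def
    parent_genus, parent_diffs = parent_def
    if parent_genus not in ancestors(child_genus, is_a):
        return False
    # Each parent differentia needs a child differentia with the same
    # relation and an equal or more specific filler class.
    return all(
        any(rel == p_rel and p_cls in ancestors(cls, is_a)
            for rel, cls in child_diffs)
        for p_rel, p_cls in parent_diffs
    )


def infer_is_a(definitions, is_a):
    """Return the set of (sub, super) links implied between defined classes."""
    return {
        (a, b)
        for a, a_def in definitions.items()
        for b, b_def in definitions.items()
        if a != b and subsumed_by(a_def, b_def, is_a)
    }


# Asserted hierarchy in the referenced ontology (illustrative only).
IS_A = {"cysteine": ["sulfur amino acid"]}

DEFINITIONS = {
    "cysteine biosynthesis": ("biosynthesis", [("has_output", "cysteine")]),
    "sulfur amino acid biosynthesis":
        ("biosynthesis", [("has_output", "sulfur amino acid")]),
}

print(infer_is_a(DEFINITIONS, IS_A))
```

The pairwise comparison is quadratic in the number of defined classes, which hints at why segmenting by XP set helps at the 130k-class scale mentioned above.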

  17. Reasoner results
 •  1000s of links fixed over a number of years
 •  Inconsistencies internal to GO fixed immediately
    –  Fix hierarchy of defined class
    –  Fix hierarchy of referenced class
       •  abductive reasoning (Bada et al OWLED 2008)
    –  Fix logical definition
 •  Inconsistencies external to GO take longer to be resolved
    –  CL
    –  CHEBI

  18. BP x CHEBI example
 •  carbohydrate transport =def transport and results_in_movement_of carbohydrate
 •  nucleotide transport =def transport and results_in_movement_of nucleotide
 [Figure: is_a graphs relating the transport classes (transport, carbohydrate transport, nucleotide transport) to the chemical classes (carbohydrate, carbohydrate phosphates, nucleoside phosphates, nucleotides)]

  19. Release plan: basic and extended releases
 •  GO is currently available in two versions
    –  gene_ontology: “standard”
       •  is_a, part_of, intra-ontology regulates
       •  intended for basic tools
    –  gene_ontology_ext: “extended”
       •  http://www.geneontology.org/GO.ontology-ext.relations.shtml
       •  standard + other relations and axioms
          –  disjoint_from
          –  has_part (Aug 1 2009)
 •  XP sets currently available as separate bridge files
    –  http://wiki.geneontology.org/index.php/Category:Cross_Products
    –  will gradually migrate into gene_ontology_ext


  20. Pre vs post composition
 •  Compose class descriptions
    –  During ontology development cycle?
    –  At the time of annotation?
 •  Logically equivalent…
    –  Given computable definitions, reasoners can determine equivalency
 •  .. But very different from practical point of view
 •  GO guidelines
    –  pre-compose classes for any type for which scientific generalizations can be made
       •  Yes: mitochondrial translation
       •  Yes: oocyte nucleus
       •  No: nucleus of epithelium of left ear
    –  Use post-composition to extend at annotation time
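
The claim that pre- and post-composed descriptions are logically equivalent can be illustrated with a lookup that matches a post-composed expression against pre-composed logical definitions. This sketch only tests syntactic identity of genus and differentiae, which is far weaker than what a reasoner does; the function names are hypothetical, and the part_of definition for oocyte nucleus is an assumption.

```python
def normalize(genus, differentiae):
    """Order-independent key for a genus plus (relation, class) differentiae."""
    return (genus, frozenset(differentiae))


def find_precomposed(expression, precomposed):
    """Return the named class whose logical definition matches, else None."""
    key = normalize(*expression)
    for name, definition in precomposed.items():
        if normalize(*definition) == key:
            return name
    return None


PRECOMPOSED = {
    "mitochondrial translation":
        ("translation", [("occurs_in", "mitochondrion")]),
    "oocyte nucleus":  # assumed definition, for illustration only
        ("nucleus", [("part_of", "oocyte")]),
}

# An annotation composed at annotation time that matches an existing class.
print(find_precomposed(("translation", [("occurs_in", "mitochondrion")]),
                       PRECOMPOSED))
# One that has no pre-composed counterpart and stays post-composed.
print(find_precomposed(("nucleus", [("part_of", "epithelium of left ear")]),
                       PRECOMPOSED))
```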



  21. Related work: weaving the fabric of the OBO Foundry
 •  Ontology for Biomedical Investigations (OBI)
 •  Phenotype Ontologies
    –  Mammalian Phenotype
    –  Human Phenotype
    –  Worm Phenotype
    –  Plant trait
 •  Environment ontology
 •  FMA
 •  Fly anatomy ontology
    –  Neuronal subtype and sense organ logical definitions using CHEBI and GO

  22. Future applications of cross-product sets
 •  Demonstrated utility as part of ontology development cycle
    –  How do we evaluate?
    –  but what about actual applications?
 •  How can logical definitions (and additional axiomatisation in general) help:
    –  Search and discovery
    –  Visualization and presentation to users
    –  Curation
    –  Improve function prediction
    –  Database integration
       •  E.g. pathway databases
    –  Term enrichment
    –  Semantic similarity
 •  Need to educate tool developers

  23. Conclusions
 •  Normalizing retrospectively is hard
    –  Prospective approach recommended
    –  But redundancy in effort from alternative perspective can yield valuable information
 •  Many of the challenges are sociotechnological
    –  What if the referenced ontology
       •  does not yet exist?
       •  exists but is unfunded?
       •  is constructed according to different principles?
       •  is incomplete?
       •  ..or there is a choice of two competing ontologies?
    –  The OBO Foundry process is crucial
 •  Grand challenge: more applications needed

  24. Acknowledgments
 •  GO Ontology Developers
    –  Midori Harris
    –  Jane Lomax
    –  Jen Deegan
    –  Amelia Ireland
    –  Tanya Berardini
    –  David Hill
 •  OBO Ontology developers
    –  Alex Diehl (GO, Cell)
    –  Janna Hastings (CHEBI)
    –  Paula de Matos (CHEBI)
    –  David Osumi-Sutherland (Fly)
    –  Melissa Haendel (Zebrafish)
    –  Darren Natale (PRO)
    –  Karen Eilbeck (SO)
 •  Also
    –  Mike Bada
    –  Colin Batchelor
 •  OBO-Edit
    –  Amina Abdulla
    –  Nomi Harris
    –  John Day-Richter
 •  OBO
    –  Alan Ruttenberg
    –  Barry Smith
    –  Richard Scheuermann
 •  GO PIs
    –  Suzanna Lewis
    –  Mike Cherry
    –  Michael Ashburner
    –  Judith Blake
