Authors: J.T. Fernandez-Breis, L. Iannone, I. Palmisano, A. Rector, R. Stevens.
Presented at 17th International Conference on Knowledge Engineering and Knowledge Management, EKAW2010
Enriching the Gene Ontology via the Dissection of Labels using the Ontology Pre-Processor Language
1. Enriching the Gene Ontology via the Dissection of Labels using the Ontology Pre-Processor Language
Jesualdo Tomás Fernández-Breis, Luigi Iannone, Ignazio Palmisano, Alan L. Rector, and Robert Stevens
October 12th 2010, Lisbon, Portugal
2. Motivation
• Biomedical Ontologies
– The OBO Foundry
• More than 200 biomedical ontologies
• Some properties
– Delineated content
– Reuse of existing ontologies
– Textual definitions
– Systematic naming convention
• Limited explicit semantics
5. Enrichment of GO Molecular Function
[Workflow diagram: Original GO MF → Dissection of the Ontology → Analysis of Labels → Identification of Linguistic Patterns → Design of Knowledge Patterns → Execution of the Knowledge Patterns → Enriched GO MF]
6. Dissection of the ontology into its semantic axes
• Normalization
• Analysis of the labels
– Biochemical substances
– Biological processes
– Cellular component
• Reuse and combination of existing ontologies
8. Design of linguistic patterns from labels
• Manual analysis of the structure of the labels by taxonomies
• Some linguistic patterns
– “X binding”
– “X codon amino acid adaptor activity”
– “base pairing with X”
– “translation X factor activity”
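The linguistic patterns listed above can be read as label templates with a slot X. The following is a minimal Python sketch of that reading, assuming labels are plain strings and each pattern is encoded as a regular expression with one capture group. The list name, the function name, and the sample labels are illustrative choices, not part of the original slides.

```python
import re

# Hypothetical encoding of the four linguistic patterns as regular expressions.
# The capture group plays the role of the slot X.
# More specific patterns come first so that the first match wins.
LINGUISTIC_PATTERNS = [
    ("X codon amino acid adaptor activity",
     re.compile(r"^(.+) codon amino acid adaptor activity$")),
    ("translation X factor activity",
     re.compile(r"^translation (.+) factor activity$")),
    ("base pairing with X",
     re.compile(r"^base pairing with (.+)$")),
    ("X binding",
     re.compile(r"^(.+) binding$")),
]


def match_label(label):
    """Return (pattern_name, X) for the first pattern matching the label, else None."""
    for name, regex in LINGUISTIC_PATTERNS:
        found = regex.match(label)
        if found:
            return name, found.group(1)
    return None


if __name__ == "__main__":
    # Illustrative labels only.
    for label in ["metal ion binding",
                  "base pairing with mRNA",
                  "translation initiation factor activity"]:
        print(label, "->", match_label(label))
```

Labels that match no pattern return None, which mirrors the fact that only part of GO MF falls within the scope of the patterns.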
9. Design of knowledge patterns
• Some knowledge patterns
binding = molecular_function and enables some (binds some chemical_substance or binds some cellular_component)
triplet_codon_amino_acid_adaptor_activity = molecular_function and enables some (adapts some (amino_acid and recognizes some triplet))
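One way to picture a knowledge pattern is as a Manchester OWL Syntax template whose slot is filled with the term extracted from a label. The sketch below assumes that reading. The codon adaptor template follows the slide; the "X binding" template, which narrows the generic binding definition to a single filler, is an assumption made for illustration, as are the dictionary and function names.

```python
from string import Template

# Knowledge patterns as Manchester OWL Syntax templates; $x is the slot
# filled from the label. The "X binding" entry is an assumed specialisation
# of the generic binding pattern, not a definition taken from the slides.
KNOWLEDGE_PATTERNS = {
    "X binding": Template(
        "molecular_function and enables some (binds some $x)"
    ),
    "X codon amino acid adaptor activity": Template(
        "molecular_function and enables some "
        "(adapts some (amino_acid and recognizes some $x))"
    ),
}


def instantiate(pattern_name, filler):
    """Fill the slot of a knowledge pattern; spaces in the filler become underscores."""
    class_name = filler.strip().replace(" ", "_")
    return KNOWLEDGE_PATTERNS[pattern_name].substitute(x=class_name)


if __name__ == "__main__":
    print(instantiate("X codon amino acid adaptor activity", "triplet"))
    print(instantiate("X binding", "metal ion"))
```

Keeping the templates in one table reflects the modularity argument made for patterns: changing a definition means editing one entry rather than every affected class.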
10. Execution of the knowledge patterns
• OPPL Version 2
– http://oppl2.sourceforge.net/
• Bulk manipulation of OWL ontologies
– Enrichment, Verification, Patterns
– Manchester OWL Syntax
• Declarative
– OWL Axioms, variables, regular expressions
11. OPPL Use case
[Diagram labels: Values, OPPL Script, Lean, Rich, OWL axioms]
Egaña et al. OWLED 2008 & EKAW 2008, Iannone ESWC 2009
12. A pattern as an OPPL script
?y:CLASS=Match("((\w+))_codon_amino_acid_adaptor_activity"),
?x:CLASS=create(?y.GROUPS(1))
SELECT ?y subClassOf Thing
WHERE ?y Match("((\w+))_codon_amino_acid_adaptor_activity")
BEGIN
ADD ?y subClassOf molecular_function,
ADD ?y subClassOf enables some (adapts some (amino_acid and recognizes some ?x))
END;
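To make the effect of the script above concrete, here is a rough Python emulation. It is not OPPL and does not use any OWL library: classes are plain strings, axioms are emitted as text, and the regular expression is assumed to match the whole class name with a single capture group. The function name and the sample class names are hypothetical.

```python
import re

# Same idea as the Match(...) expression in the script, written with one
# capture group. Assumption: the whole class name must match.
PATTERN = re.compile(r"(\w+)_codon_amino_acid_adaptor_activity")


def apply_pattern(class_names):
    """For each matching class ?y, emit the two axioms the script ADDs."""
    axioms = []
    for y in class_names:
        found = PATTERN.fullmatch(y)
        if found is None:
            continue  # class is outside the scope of this pattern
        x = found.group(1)  # stands in for create(?y.GROUPS(1))
        axioms.append(f"{y} subClassOf molecular_function")
        axioms.append(
            f"{y} subClassOf enables some "
            f"(adapts some (amino_acid and recognizes some {x}))"
        )
    return axioms


if __name__ == "__main__":
    names = ["triplet_codon_amino_acid_adaptor_activity", "metal_ion_binding"]
    for axiom in apply_pattern(names):
        print(axiom)
```

In the real setting the matched group becomes a new OWL class and the axioms are added to the ontology, which is what makes the enrichment queryable by a reasoner.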
13. Results - Scope
• The “source” Gene Ontology
– Version 1550
– 8548 classes, 5 object properties (OP), 5 data properties (DP) and 9954 subclass axioms
– Classification time: < 1 sec (Fact++)
• Scope of this study (approx 18% GO MF)
– binding
– structural molecule activity
– chaperone activity
– proteasome regulator activity
– electron carrier activity
– enzyme regulator activity
– translation regulator activity
• Complete results: http://miuras.inf.um.es/~mfoppl/
16. Results - Enrichment (II)
• The enriched GO MF
– 58624 classes, 254 OP, 16 DP, 107631 subclass axioms, 264 equivalent class axioms and 488 disjoint class axioms
– Classification time: approx 2 minutes (Fact++)
– Due to the patterns
• 584 new classes
– Suboptimal auxiliary ontologies: D1 Dopamine
– Use of abbreviated forms in GO MF: MAPK, IgX
• 13 new OP
• 3608 new subclass axioms
17. Results - Querying (III)
• We can make queries that were not possible with the original ontology:
– Example: Molecular functions that bind substances that play a chemical role
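The example query relies on relations that only exist after enrichment: which substance a function binds, and which role that substance plays. The toy Python sketch below illustrates the shape of such a query over two lookup tables. All class names, role names, and table contents are placeholders invented for the illustration; they are not data from the enriched ontology, and a real query would be posed to a reasoner rather than to dictionaries.

```python
# Toy stand-in for the enriched ontology; every name below is a placeholder.
BINDS = {
    "A_binding": {"substance_A"},
    "B_binding": {"substance_B"},
    "C_binding": {"substance_C"},
}
PLAYS_ROLE = {
    "substance_A": {"chemical_role"},
    "substance_B": {"other_role"},
    # substance_C has no recorded role
}


def functions_binding_role(role):
    """Return the functions that bind at least one substance playing the given role."""
    return sorted(
        function
        for function, substances in BINDS.items()
        if any(role in PLAYS_ROLE.get(s, set()) for s in substances)
    )


if __name__ == "__main__":
    print(functions_binding_role("chemical_role"))
```

Without the "binds" relation, which the source ontology encodes only in its labels, the first table could not be built and the query could not be expressed.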
18. Results - Findings (II)
• We can make queries that were not possible with the original ontology:
– Example: Molecular functions that bind substances that play a chemical role
20. Conclusions
• Patterns and OPPL are useful for supporting ontology enrichment processes
• The structure of the labels in biomedical ontologies embeds knowledge that can be extracted
• Benefits of encoding knowledge into patterns: modularity, maintenance and evolution
• Critical factor: the auxiliary ontologies
21. Further work
• Bio-evaluation of the patterns
• Identification of linguistic patterns using text mining techniques
• Application to the rest of GO MF and the other GO ontologies
• Alignment with efforts of the GO Consortium
22. Acknowledgements
Thanks for your attention!
Jesualdo Tomás Fernández Breis
jfernand@um.es
http://webs.um.es/jfernand