Reasoning with the RNA Ontology
                    Chris Mungall


        Lawrence Berkeley National Laboratory




Published in: Technology, Education
Transcript of "Reasoning with RNA"

  1. Reasoning with the RNA Ontology
     Chris Mungall
     Lawrence Berkeley National Laboratory
     Reasoning with the RNA Ontology – p.1/28
  2. What is a reasoner?
     A reasoner implements a generalized decision procedure that takes a collection of logical axioms, finds the entailments of those axioms, and determines whether the axioms are satisfiable.
     An ontology can be considered a collection of axioms (in contrast to a terminology):
       1. Relationships: is a (SubClass), partOf, ...
       2. Definitions
       3. Constraints
     We can also treat data as collections of axioms.
  3. Examples of Ontology Axioms
     GNRATetraloop is a Tetraloop
     Tetraloop is a RNAStructure
     A translation to first-order predicate logic:
       GNRATetraloop(x) → Tetraloop(x)
       Tetraloop(x) → RNAStructure(x)
     Set theoretic:
       GNRATetraloop ⊆ Tetraloop ⊆ RNAStructure
     Entailment:
       GNRATetraloop is a RNAStructure
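The entailment on this slide can be reproduced with a very small sketch in Python. It is not how an OWL or FOL reasoner works internally; it only follows asserted "is a" links transitively. The function name `entailed_superclasses` and the pair-based encoding of the axioms are assumptions made for illustration.

```python
from collections import defaultdict

# Asserted "is a" (SubClassOf) axioms from the slide, as (subclass, superclass) pairs.
ASSERTED = [
    ("GNRATetraloop", "Tetraloop"),
    ("Tetraloop", "RNAStructure"),
]


def entailed_superclasses(axioms):
    """Return {class: set of all superclasses} by following is-a links transitively."""
    parents = defaultdict(set)
    for sub, sup in axioms:
        parents[sub].add(sup)

    closure = {}
    for cls in list(parents):
        seen, stack = set(), list(parents[cls])
        while stack:
            sup = stack.pop()
            if sup not in seen:
                seen.add(sup)
                # .get avoids creating new keys while we iterate
                stack.extend(parents.get(sup, ()))
        closure[cls] = seen
    return closure


closure = entailed_superclasses(ASSERTED)
print(sorted(closure["GNRATetraloop"]))
```

The unstated relationship "GNRATetraloop is a RNAStructure" appears in the closure even though it was never asserted.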
  7. The reasoning square
                  Classifying                            Validation
     Ontology     Inference of unstated relationships    Finding inconsistent axioms
                  in the ontology                        in the ontology
     Data         Inference of unstated facts in data    Determining if a dataset is valid
  8. The reasoning square (examples)
     [Figure: the reasoning square filled in with RNA examples. Labels recoverable from the slide: Tetraloop, GNRA Tetraloop, disjoint Purine / Pyrimidine, and a "T therm 23SRNA region" as data.]
  9. Ontology Languages
     First Order Logic (Common Logic ISO standard)
       Highly expressive
       Undecidable: no decision procedure exists for the full logic
     OWL and Description Logics
       Restricted subset of FOL with highly convenient constructs for describing classes
       Reasoners are heavily tested on existing ontologies
     OBO
       Initially an ad-hoc format for the Gene Ontology
       Now an alternate syntax for Common Logic
       Reasoners based on rule application
  10. Common Logic
      Common Logic is an ISO specification for First Order Logic (FOL)
      Syntaxes
        CLIF - Lisp-like (derived from KIF)
        XCL - XML
        CGIF - Conceptual Graphs
      A CL text consists of CL sentences (axioms)
      Sentences can be atomic, boolean or logically quantified
        Atomic sentence: a predicate followed by zero or more arguments
        Boolean sentence: and, or, not (¬), if (→), iff (↔)
        Quantified sentence: forall (∀), exists (∃)
  11. Common Logic Examples
      Textbook syntax: ∀x : GNRATetraloop(x) → Tetraloop(x)
      CLIF:            (forall (x) (if (GNRATetraloop x) (Tetraloop x)))
      Textbook syntax: ∀x : Purine(x) → ¬Pyrimidine(x)
      CLIF:            (forall (x) (if (Purine x) (not (Pyrimidine x))))
      Textbook syntax: ∀x : Intron(x) → ∃y Exon(y) ∧ adjacent_to(x, y)
      CLIF:            (forall (x) (if (Intron x) (exists (y) (and (Exon y) (adjacent_to x y)))))
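To make the meaning of the quantified sentences concrete, the following Python sketch checks two of them against a small finite interpretation. This is model checking over a toy domain, not theorem proving; the individuals (`a1`, `c1`, `i1`, `e1`) and the function names are hypothetical.

```python
def holds_purine_axiom(domain, purine, pyrimidine):
    """forall x: Purine(x) -> not Pyrimidine(x), checked over a finite domain."""
    return all(x not in purine or x not in pyrimidine for x in domain)


def holds_intron_axiom(domain, intron, exon, adjacent_to):
    """forall x: Intron(x) -> exists y: Exon(y) and adjacent_to(x, y)."""
    return all(
        x not in intron
        or any(y in exon and (x, y) in adjacent_to for y in domain)
        for x in domain
    )


# Hypothetical toy interpretation: one purine, one pyrimidine, one intron, one exon.
DOMAIN = {"a1", "c1", "i1", "e1"}

ok_bases = holds_purine_axiom(DOMAIN, purine={"a1"}, pyrimidine={"c1"})
ok_introns = holds_intron_axiom(
    DOMAIN, intron={"i1"}, exon={"e1"}, adjacent_to={("i1", "e1")}
)
# Same intron, but no exon in the interpretation: the sentence is violated.
missing_exon = holds_intron_axiom(DOMAIN, intron={"i1"}, exon=set(), adjacent_to=set())
```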
  12. Reasoning with FOL
      Undecidable: FOL theorem provers are not guaranteed to terminate
      The Horn logic subset has desirable computational properties
        Head ← Body
        Logic Programming
        SWRL
        Datalog
        Relational Model, Relational Algebra
        non-monotonic and probabilistic extensions
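The Head ← Body form is easy to evaluate mechanically. The sketch below is a naive forward-chaining loop over ground (variable-free) Horn rules, written in Python for illustration; real Datalog or logic programming engines handle variables and are far more efficient. The individual `m1` and the function name `forward_chain` are assumptions.

```python
def forward_chain(facts, rules):
    """Naive forward chaining: apply Head <- Body rules until nothing new is derived.

    facts: iterable of ground atoms (plain strings here).
    rules: list of (head, body) pairs, where body is a list of atoms.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived


# Ground versions of the subclass axioms for one hypothetical individual m1.
RULES = [
    ("Tetraloop(m1)", ["GNRATetraloop(m1)"]),
    ("RNAStructure(m1)", ["Tetraloop(m1)"]),
]

result = forward_chain({"GNRATetraloop(m1)"}, RULES)
```

Because each pass can only add atoms and the set of possible heads is finite, the loop always terminates, which is one of the desirable computational properties of the Horn subset.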
  13. OWL-DL
      OWL belongs to a family of logics known as Description Logics, circumscribed subsets of FOL that are guaranteed to be decidable
      Variety of notations (syntaxes):
        RDF-XML - Default, but it’s a mess
        OWL-XML - Easier to manipulate computationally
        Manchester Syntax - Easy on the eye
      Constructs
        Property (relation) unary predicates: Functional, Transitive, Symmetric, ...
        Class Axioms: SubClass, EquivalentClass, DisjointWith, ...
        Descriptions
      OWL2 has lots of tools and reasoners to choose from
  14. Descriptions in OWL
      A Description is a (possibly recursive) tree structure that formally identifies membership criteria for a class.
      Can be combined using logical connectives: AND, OR, NOT
        AND : intersectionOf
        OR : unionOf
        NOT : complementOf
      Restrictions
        Restrict class membership based on some property
        ONLY : example (paired with CWW ONLY Guanine)
        SOME :
        Quantified cardinality restrictions
      Example: CWWAGBasePair = hasPart only (A and pair-
  15. OWL Reasoners
      Decision procedure based on tableau calculus
        Refutation-based, repeated applications of De Morgan’s law
      Widely used and tested on ontologies
        Many reasoners can now classify the larger biological ontologies in acceptable time
      Less widely used on data
        RDF triplestores are commonly used but these lack key OWL constructs. OWLGRES is a promising technology here.
  18. No Unique Name Assumption
      Classes and instances are potentially equivalent unless declared otherwise.
      Given ontology axiom:
        Functional(fivePrimeTo)
      And instance axioms:
        A(b1) A(b2) A(b3)
        b1 fivePrimeTo b2
        b1 fivePrimeTo b3
      A reasoner will not say this is inconsistent. It will infer that b2=b3.
      To get a reasoner to detect the inconsistency we must explicitly declare all base instances to be distinct:
        b1 differentFrom b2
        b1 differentFrom b3
        b2 differentFrom b3
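The behaviour described on this slide can be imitated with a short Python sketch. It is a single-pass simplification, not a tableau reasoner: it merges the objects of a functional property when one subject has two of them, and only reports a clash if the merged individuals were declared differentFrom each other. The function name `check_functional` and its return shape are assumptions.

```python
def check_functional(assertions, different_from=()):
    """Merge the objects of a functional property and test differentFrom declarations.

    assertions: (subject, object) pairs for ONE functional property.
    different_from: pairs of individuals explicitly declared distinct.
    Returns (consistent, inferred_equalities).
    """
    parent = {}

    def find(x):
        # Representative of x's equivalence class (no path compression, for brevity).
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    target = {}        # subject representative -> first object seen
    equalities = []
    for subj, obj in assertions:
        s = find(subj)
        if s in target and find(target[s]) != find(obj):
            # Functional property with two objects: infer they are the same individual.
            equalities.append((target[s], obj))
            parent[find(obj)] = find(target[s])
        else:
            target.setdefault(s, obj)

    consistent = all(find(a) != find(b) for a, b in different_from)
    return consistent, equalities


FIVE_PRIME_TO = [("b1", "b2"), ("b1", "b3")]

# Without distinctness declarations: no inconsistency, b2 = b3 is inferred.
no_una = check_functional(FIVE_PRIME_TO)

# With all bases declared distinct: the inferred equality clashes.
with_distinct = check_functional(
    FIVE_PRIME_TO,
    different_from=[("b1", "b2"), ("b1", "b3"), ("b2", "b3")],
)
```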
  19. The Open World Assumption
      Unstated facts are not assumed to be false.
      Given ontology axioms:
        A SubClassOf Base
        UnpairedBase equivalentTo some (Base that pairedWith 0 Base)
      And instance axioms:
        A(b1) A(b2) A(b3)
        b1 fivePrimeTo b2
        b2 fivePrimeTo b3
      A reasoner will not infer b1, b2 or b3 to be UnpairedBases. We need to explicitly declare this:
        UnpairedBase(b1)
        UnpairedBase(b2)
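One way to picture the open world assumption is a query with three possible answers instead of two. The Python sketch below is an illustration only; the `closed_bases` parameter is a hypothetical stand-in for whatever explicit declaration tells the reasoner that a base's pairing facts are complete.

```python
def is_unpaired(base, paired_with, closed_bases=frozenset()):
    """Answer 'is this base unpaired?' as True, False, or None (unknown).

    paired_with: set of (base, partner) facts that have been stated.
    closed_bases: bases for which we have explicitly declared that no
                  unstated pairings exist (an assumption of this sketch).
    """
    if any(base in pair for pair in paired_with):
        return False          # a stated pairing rules out UnpairedBase
    if base in closed_bases:
        return True           # explicitly declared complete, so absence means unpaired
    return None               # open world: absence of a fact proves nothing


open_world_answer = is_unpaired("b1", paired_with=set())
declared_answer = is_unpaired("b1", paired_with=set(), closed_bases={"b1"})
paired_answer = is_unpaired("b1", paired_with={("b1", "b5")})
```

A closed-world system such as a relational database would return True in the first case; an open-world reasoner returns neither True nor False.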
  20. OBO
      Initially an ad-hoc format for the Gene Ontology
        Graph-centric
        Terminological features
      Formal Semantics
        Initially lacked formal semantics.
        Formal definition written in natural language in Relations Ontology.
        Translation to OWL-DL (Horrocks et al)
        With OBO 1.3, every OBO document is a Common Logic Text
          OBO-Core: consists only of atomic sentences
          OBO-CL: allows arbitrary logical formulae
          OBO-H: OBO-Core plus Horn rules
  21. Reasoning over OBO ontologies
      Strategies
        Convert to OWL and use an OWL reasoner
        Convert to CL and use a FOL theorem prover
        Use a rule-based reasoner
          Java implementation: OBO-Edit
          Prolog implementation: easy to extend
          SQL implementation: slow but scales over massive ontologies and datasets
          Limitations: limited support for negation
  22. Are Description Logics enough?
      Some things that cannot be done in OWL-2:
        Define relations using arithmetic
        Define relations using intersection, union and negation
        Declare relations with > 2 arguments
          Makes reasoning about change harder
        Model cyclic structures
          Any structure with a cyclic path through some combination of relations (carbon rings, RNA molecules)
  23. Arithmetic in relations
      We cannot express this in OWL:
        upstreamOf(x, y) ← end(x) < start(y)
      In OWL we must:
        explicitly name all the bases, and declare a 5’ to 3’ connection relation between them
        declare < as the transitive version of the 5’ to 3’ relation
      This is feasible with RNA, but not DNA
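Outside OWL, the rule is a one-line comparison. The Python sketch below shows how little machinery it needs once coordinates are available; the `Feature` class and the example coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int  # coordinate of the first base
    end: int    # coordinate of the last base


def upstream_of(x, y):
    """upstreamOf(x, y) <- end(x) < start(y)"""
    return x.end < y.start


# Hypothetical features on the same strand and coordinate system.
exon1 = Feature("exon1", 10, 50)
exon2 = Feature("exon2", 80, 120)
```

No base needs to be named individually, which is why this approach scales to DNA-sized sequences where the OWL encoding does not.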
  24. Relation Boolean Constructs
      We cannot express this in OWL:
        overlaps = ends.after.startOf ∩ starts.before.endOf
        disconnected = ¬overlaps
      This severely limits OWL when applied to instance data involving intervals
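The two relation definitions translate directly into boolean expressions over interval endpoints. A minimal Python sketch, assuming intervals are given as integer start and end coordinates (the `Interval` type is hypothetical):

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int
    end: int


def overlaps(x, y):
    """Intersection of two relations: x ends after y starts AND x starts before y ends."""
    return x.end > y.start and x.start < y.end


def disconnected(x, y):
    """Negation of a relation: disconnected = not overlaps."""
    return not overlaps(x, y)


a = Interval(10, 50)
b = Interval(40, 90)
c = Interval(60, 70)
```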
  25. N-ary relations and time
      In OWL, all relations must be binary. N-ary relations are useful for reasoning about change.
      As the RNA molecule folds, unpaired bases become paired:
        ¬paired with CWW(b1, b5, t0)
        paired with CWW(b1, b5, t1)
        instance of (b1, UnpairedBase, t0)
        instance of (b1, PairedBase, t1)
      There are a variety of (awkward) techniques for translating N-ary relations to binary:
        ¬paired with CWW(b1@t0, b5@t0)
        paired with CWW(b1@t1, b5@t1)
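In a general-purpose language a ternary relation is just a set of triples. The Python sketch below stores the time-indexed pairing facts from the slide and derives the time-indexed class of a base. The dictionary encoding and the function `class_at` are assumptions, and the classification is deliberately naive: it treats the single negated fact as covering every possible partner.

```python
# Ternary facts keyed by (base1, base2, time). The value records whether the
# pairing is asserted (True) or explicitly negated (False), as on the slide.
PAIRED_WITH_CWW = {
    ("b1", "b5", "t0"): False,
    ("b1", "b5", "t1"): True,
}


def class_at(base, time, facts=PAIRED_WITH_CWW):
    """Classify a base at a time point; None means no fact mentions it at that time."""
    relevant = [
        value
        for (x, y, t), value in facts.items()
        if t == time and base in (x, y)
    ]
    if not relevant:
        return None
    return "PairedBase" if any(relevant) else "UnpairedBase"


before_folding = class_at("b1", "t0")
after_folding = class_at("b1", "t1")
```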
  26. Cyclic descriptions
      OWL Descriptions are tree-like. Cyclic descriptions are required for RNA Structures.
      Proposed def of GNRA Tetraloop:
        GNRA TetraloopMotif = hasPart some (
          Nucleobase and fivePrimeTo some
          (G and fivePrimeTo some
          (Nucleobase and fivePrimeTo some
          (Purine and fivePrimeTo some
          (A and fivePrimeTo some
          (Nucleobase and pairsWithCWW som
          and pairsWithTHS some G)))
          and pairsWithTSH some A)
          and pairsWithCWW some Nucleobase)
  27. Tree-like classification structure
      [Figure: the GNRA TetraloopMotif definition from the previous slide (clipped in the transcript) shown beside a diagram of the motif with base labels N, G, N, R, A, N. Later builds of the slide add example bases (C G A G A A, then C G A A A G G A A A G C).]
  30. Labeled sub-descriptions
      We would like to do something like this, if it were possible in OWL:
        GNRATetraloopMotif = hasPart some (
          Nucleobase[1] and fivePrimeTo some
          (G[2] and fivePrimeTo some
          (Nucleobase[3] and fivePrimeTo some
          (Nucleobase[4] and fivePrimeTo some
          (A[5] and fivePrimeTo some
          (Nucleobase[6] and pairsWithCW
          and pairsWithTHS some G[2])))
          and pairsWithTSH some A[5])
          and pairsWithCWW some Nucleobase[6])
  31. Rules
      SWRL (Semantic Web Rule Language) extends OWL with rules. We can add this to the ontology:
        nucleotide(?b0), g(?b1), nucleotide(?b2), purine(?b3), a(?b4), nucleotide(?b5),
        followedBy(?b0, ?b1), followedBy(?b1, ?b2), followedBy(?b2, ?b3),
        followedBy(?b3, ?b4), followedBy(?b4, ?b5),
        pairedWithTHS(?b4, ?b1), pairedWithCWW(?b5, ?b0)
        --> partOfGNRATetraloop(?b0)
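The body of this rule can be matched against instance data with a simple window scan, which also illustrates the later point that special purpose algorithms may beat general purpose reasoners. The Python sketch below is not SWRL and makes several assumptions: the sequence is a string, followedBy is taken to mean adjacency of positions, every character counts as a nucleotide, and base pairs are given as sets of position pairs. The function name `gnra_tetraloop_starts` is hypothetical.

```python
PURINES = {"A", "G"}


def gnra_tetraloop_starts(seq, pairs_ths, pairs_cww):
    """Return each position b0 for which the rule body matches the window b0..b5.

    seq: nucleotide string; followedBy(bi, bi+1) is implied by adjacency.
    pairs_ths: set of (i, j) position pairs standing for pairedWithTHS(i, j).
    pairs_cww: set of (i, j) position pairs standing for pairedWithCWW(i, j).
    """
    hits = []
    for b0 in range(len(seq) - 5):
        b1, b2, b3, b4, b5 = (b0 + k for k in range(1, 6))
        if (
            seq[b1] == "G"                 # g(?b1)
            and seq[b3] in PURINES         # purine(?b3)
            and seq[b4] == "A"             # a(?b4)
            and (b4, b1) in pairs_ths      # pairedWithTHS(?b4, ?b1)
            and (b5, b0) in pairs_cww      # pairedWithCWW(?b5, ?b0)
        ):
            hits.append(b0)                # partOfGNRATetraloop(?b0)
    return hits


# Hypothetical instance data: a six-base hairpin closed by a C-G pair.
matches = gnra_tetraloop_starts("CGAAAG", pairs_ths={(4, 1)}, pairs_cww={(5, 0)})
```

As with the SWRL rule, this only classifies existing bases; it does not create a new tetraloop motif instance.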
  32. Is SWRL the answer?
      Bonus: can be extended with arithmetic operators (to define upstreamOf)
      Negative: only binary relations
      Negative: only instance classification
        We cannot use the previous definition for ontology classification
      Negative: we cannot infer the existence of undeclared entities
        We can tell a base is part of a tetraloop motif, but we can’t infer the tetraloop motif instance
  33. Description Graphs
      An extension of OWL to allow representation of cyclic structures[?].
        Possibly part of OWL3?
        Implemented in HermiT reasoner
        Largely new and untested
  34. OBO Graphs
      Cyclic structures can be described in OBO; the graph is translated to simple rules. These rules can be executed using LP or even SQL.
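As a rough illustration of executing such a rule in SQL, the Python sketch below uses the standard-library sqlite3 module. It is not the OBO translation itself: the table names (`base`, `followed_by`, `paired_cww`), the toy data, and the pattern are all assumptions. The query looks for a cycle: five followed_by steps from a base, closed by a CWW pair back to the starting base, with a G in the second position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE base(id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE followed_by(a INTEGER, b INTEGER);
    CREATE TABLE paired_cww(a INTEGER, b INTEGER);
    """
)

# Hypothetical instance data: the six-base sequence CGAAAG, positions 0..5.
conn.executemany("INSERT INTO base VALUES (?, ?)", list(enumerate("CGAAAG")))
conn.executemany(
    "INSERT INTO followed_by VALUES (?, ?)", [(i, i + 1) for i in range(5)]
)
conn.execute("INSERT INTO paired_cww VALUES (5, 0)")

# Each edge of the cyclic graph pattern becomes one join.
CYCLE_QUERY = """
SELECT f1.a
FROM followed_by f1
JOIN followed_by f2 ON f2.a = f1.b
JOIN followed_by f3 ON f3.a = f2.b
JOIN followed_by f4 ON f4.a = f3.b
JOIN followed_by f5 ON f5.a = f4.b
JOIN paired_cww p   ON p.a = f5.b AND p.b = f1.a
JOIN base g         ON g.id = f1.b AND g.type = 'G'
"""

rows = conn.execute(CYCLE_QUERY).fetchall()
```

The join condition `p.b = f1.a` is what closes the cycle, something a tree-shaped OWL description cannot state.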
  36. Conclusions
      There is no one single ideal subset of FOL for reasoning
        All subsets have limitations.
        DLs cannot express a lot of what we need for primary and secondary sequence structures
      The RNA Ontology should employ as expressive a logic as it needs
        An incorrect formally specified definition is worse than a correct informally specified definition
        Hybrid reasoning approaches are feasible
        The basic instance classification problem is just not that hard (compared to RNA bioinformatics as a whole)
        Special purpose algorithms will probably beat general purpose reasoners
      But first the RNAO must exist
        Perhaps it’s too early to worry too much about reasoning
        Priority: simple term lists, basic isa hierarchy, with definitions written for humans, plus motif definitions in some compact notation