2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse

Gives a brief background on the Semantic Web and shows how Prolog is intended to be used from inside the Bioclipse research software for handling RDF data.

  1. 2nd status report of degree project: Integrating Blipkit/BioProlog for semantic reasoning in Bioclipse. Samuel Lampa, 2010-01-25. Project blog: http://saml.rilspace.com
  2. Some background...
  3. What is “Semantic Web”?
  4. What is Semantic Web? “Enabling more powerful use of information”
     Main goals:
     ● Data availability (on the web)
     ● Machine-readability of data
     ● Knowledge integration
     ● Automatic “conclusion drawing”
     ● “Reasoning”, using reasoners
  5. This project compares two reasoners: Pellet and Blipkit
  6. Research question
  7. Research question: How do biochemical questions formulated as Prolog queries compare to other solutions available in Bioclipse in terms of speed and expressiveness?
  8. Semantic Reasoners
     ● Pellet/Jena
       ● Uses W3C languages
         – OWL (class definitions)
         – RDF (facts)
         – SPARQL (querying)
     ● Blipkit/BioProlog
       ● Uses Prolog, with W3C languages “on top”
         – Class definitions, facts and queries either in W3C languages (“on top” of Prolog) or in pure Prolog!
  9. What is Prolog?
  10. What is Prolog?
      ● State facts and rules
      ● Execute by running queries over these facts and rules
      ● Unique features:
        ● Backtracking
        ● “Closed-world assumption”
  11. Prolog code example
  12. Prolog code example

      % === SOME FACTS ===
      hasHBondDonorsCount( substanceX, 3 ).   % “substance X has 3 H-bond donors”
      % etc …

      % === A RULE ("RULE OF FIVE" À LA PROLOG) ===
      isDrugLike( Substance ) :-
          hasHBondDonorsCount( Substance, HBDonors ),
          HBDonors =< 5,
          hasHBondAcceptorsCount( Substance, HBAcceptors ),
          HBAcceptors =< 10,
          hasMolecularWeight( Substance, MW ),
          MW < 500.

      % === QUERYING THE RULE ===
      ?- isDrugLike(substanceX).
      true.

      ?- isDrugLike(X).
      X = substanceX ;
      X = substanceY.

  13. Prolog code example (the rule from slide 12, annotated)
      ● Head: isDrugLike( Substance ), the part before “:-”
      ● “:-” is implication (“If [body] then [head]”)
      ● Body: the goals after “:-”, up to the final period
      ● Comma means conjunction (“and”)
      ● Capitalized terms are always variables
  14. Prolog code example (the queries from slide 12, annotated)
      ● ?- isDrugLike(substanceX). tests a specific atom (“substanceX”)
      ● ?- isDrugLike(X). submits a variable (“X”), which is bound in turn to every instance that satisfies the “isDrugLike” rule
  15. Where are we now?
  16. Project plan
  17. What is done so far?
  18. What is done so far?
      ● Integration of Blipkit in Bioclipse
        ● Done: General purpose methods
        ● Done: Found usage strategy for combined use of Bioclipse JS scripting and Prolog
      ● Comparing Prolog and Pellet
        ● Done: Simple performance testing
        ● Now: Stuck on NMR spectrum similarity search
          – (No backtracking on arithmetic operators in SPARQL)
  19. What is left?
  20. What remains to be done?
      ● Integration of Prolog / Blipkit
        ● Refinements?
      ● Comparing Prolog and Pellet
        ● NMR spectrum similarity search
          – Investigate use of OWL in querying
          – Other options? SWRL?
        ● ChEMBL data
        ● Toxicity data (opentox.org)
  21. Example Bioclipse / Prolog script
  22. Example Bioclipse/Prolog script

      blipkit.init();
      blipkit.loadRDFToProlog("nmrshiftdata.100.rdf.xml");

      // Define a “convenience prolog method”
      blipkit.loadPrologCode(
          "hasPeak( Subject, Predicate ) :- " +
          "rdf_db:rdf( Subject, 'http://www.nmrshiftdb.org/onto#hasPeak', Predicate )."
      );

      // Call the convenience method (which in turn executes its
      // “body”) and return all matching results as an array
      var resultList = blipkit.queryProlog(["hasPeak","10","Subject","Predicate"]);

  23. Example Bioclipse/Prolog script (the script from slide 22, annotated)
      ● The string passed to blipkit.loadPrologCode is the Prolog rule to load into the Prolog engine
      ● In the array passed to blipkit.queryProlog:
        – "hasPeak" is the Prolog method to call
        – "10" limits the number of results
        – "Subject" and "Predicate" are Prolog variables
  24. Current status of research question
  25. Current status of research question
      ● Performance
        ● Prolog won so far. Exceptions?
      ● Usability
        ● Prolog very convenient for iterative wrapping of complex logic. Can RDF/OWL/SPARQL replicate this?
        ● Where do RDF/OWL/SPARQL excel?
  26. Project plan
  27. Thank you! Project blog: http://saml.rilspace.com
  28. Project plan – Current version
  29. Project plan – Proposed version
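
Slide 23 annotates the array passed to blipkit.queryProlog as a Prolog method name, a result limit, and a list of Prolog variables. As a rough illustration of that layout only, the JavaScript sketch below decodes such an array into the Prolog goal it appears to describe. The helper describeQuery is hypothetical and is not part of Bioclipse or Blipkit; how Blipkit actually builds its query internally is not shown in the slides.

```javascript
// Hypothetical helper (an assumption, not a Bioclipse/Blipkit API): it only
// decodes the argument layout annotated on slide 23.
function describeQuery(args) {
  if (args.length < 2) {
    throw new Error("expected at least a predicate name and a result limit");
  }
  var predicate = args[0];            // Prolog method (predicate) to call
  var limit = parseInt(args[1], 10);  // maximum number of results
  var variables = args.slice(2);      // Prolog variables (capitalized names)
  if (isNaN(limit)) {
    throw new Error("result limit must be a number, got: " + args[1]);
  }
  return {
    predicate: predicate,
    limit: limit,
    variables: variables,
    // Textual form of the goal, e.g. "hasPeak(Subject, Predicate)"
    goal: predicate + "(" + variables.join(", ") + ")"
  };
}

var query = describeQuery(["hasPeak", "10", "Subject", "Predicate"]);
console.log(query.goal + ", at most " + query.limit + " results");
// prints "hasPeak(Subject, Predicate), at most 10 results"
```

Read this way, the array on slide 22 asks for up to 10 bindings of Subject and Predicate that satisfy the hasPeak rule loaded just before it.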
