Ontologies, OWL and Protégé course from the semantic technologies day at the XML Summer School in Oxford 2009.

Published in: Education, Technology


  1. 1. Ontologies, OWL and Protégé Duncan Hull The University of Manchester, UK http://www.manchester.ac.uk Semantic Technologies Tuesday 22nd September 2009
  2. 2. Learning Objectives <ul><li>Understand some of the Web Ontology Language (OWL) and its explicit semantics </li></ul><ul><li>Learn some of the principles of modelling using Description Logic ontologies and reasoning </li></ul><ul><li>Gain hands-on introductory experience with ontology development using Protégé-OWL tools </li></ul><ul><li>Learn how to take advantage of inferencing capabilities to build robust, reusable models </li></ul><ul><li>Review where OWL fits in with related technology and why you might want to use it. This tutorial normally takes 1-2 days (10 exercises), so we can only scratch the surface of OWL and Protégé in a 90 minute session (~2 exercises) http://bit.ly/owl-tutorial 11th/12th November, MAN </li></ul>Short summary of the class
  3. 3. Course Contents <ul><li>Why would you want to use OWL? </li></ul><ul><ul><li>- The Web Ontology Language </li></ul></ul><ul><li>What is an OWL ontology? </li></ul><ul><li>Where does OWL fit with related technology? </li></ul><ul><ul><li>- W3C standards: XML, RDF and SPARQL </li></ul></ul><ul><ul><li>- Relational Databases </li></ul></ul><ul><ul><li>- Linked Data </li></ul></ul><ul><li>Who is responsible for OWL? </li></ul><ul><li>When was OWL created? </li></ul><ul><li>How can you use OWL? </li></ul><ul><li>- Tutorial Session </li></ul><ul><li>Minor modifications to slides: http://www.slideshare.net/dullhunk </li></ul>
  4. 4. Where I’m coming from… <ul><li>Trained as a Biologist </li></ul><ul><li>Scientific, technical publishing and software engineering </li></ul><ul><li>PhD Computer Science (OWL+Web Services) 2007 </li></ul><ul><li>Now integrating and mining public biochemical data on the web </li></ul>Tamiflu ChEBI:7799 http://www.sbml.org It’s all John’s fault!
  5. 5. Why would you want to use OWL?
  6. 6. Why? <ul><li>Your data is important and you’re prepared to invest resources to precisely define the meaning in a way that computers can “understand” and infer additional information </li></ul>Image via http://www.flickr.com/photos/dullhunk/639163558/
  7. 7. Why would you bother doing that? <ul><li>1. Semantic Integration of Big Data </li></ul><ul><li>“ The Web is Agreement” </li></ul><ul><li>2. Better Search and Querying </li></ul><ul><li>“ Google is great but…” </li></ul><ul><li>3. Artificial Intelligence (A.I.) </li></ul><ul><li>“ A more knowledgeable web…” </li></ul><ul><li>4. Some examples… </li></ul><ul><li>5. Standardisation </li></ul>
  8. 8. Why? No. 1 Semantic Integration <ul><li>Big Data: Lots of scenarios involve integrating data from multiple different sources: </li></ul><ul><li>In some cases, data integration is easier and quicker where semantics are agreed in advance (rather than cleaning it up afterwards) </li></ul><ul><li>CC-image via http://en.wikipedia.org/wiki/File:Datawarehouse.png </li></ul>stuff
  9. 9. Why? The Web is Agreement <ul><li>http://thewebisagreement.com/ </li></ul><ul><li>OWL ontologies can be used to express agreement about the meaning of data on the Web </li></ul><ul><li>Between: Human-human, human-machine, and machine-machine </li></ul>Paul Downey
  10. 10. Why? No. 2 Better search <ul><li>Google is great but search engines could be much better, for example: </li></ul><ul><li>Complex queries involving background knowledge: </li></ul><ul><li>“Find information about animals that use sonar but are neither bats nor dolphins” (answer: barn owl) </li></ul><ul><li>Finding and using web services: </li></ul><ul><li>“Book me a holiday next weekend somewhere warm, not too far away and where they speak French or English” </li></ul>Usually impossible to do using syntactic web search; this needs semantic search Image via http://en.wikipedia.org/wiki/File:Tyto_alba_close_up.jpg
  11. 11. Why? No. 3 A.I. <ul><li>Artificial Intelligence was/is a key </li></ul><ul><li>motivation behind the semantic web </li></ul><ul><li>E.g. “A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities” - Tim Berners-Lee et al , Scientific American, 2001 </li></ul>
  12. 12. Why? no. 3: A.I. <ul><li>2001: A Semantic Odyssey? </li></ul><ul><li>Realising the complete vision is (probably) too hard for now but we can make a start… </li></ul><ul><li>People have already started to build semantic webs (plural) rather than a monolithic semantic web, let’s have a look at some of them… </li></ul>HAL 9000 Semantic Web? I'm sorry, Dave. I'm afraid I can't do that.
  13. 13. Why? no. 4: Some examples: <ul><li>Large biomedical terminologies using OWL </li></ul><ul><li>National Cancer Institute Thesaurus http://cancer.gov The NCI Thesaurus provides definitions, synonyms, and other information on nearly 10,000 cancers and related diseases </li></ul><ul><li>Contains 50,000 concepts managed by up to 20 people, provides terminology for applications like the cancer image database. </li></ul>
  14. 14. Why? no. 4. Some examples <ul><li>“ SNOMED CT® is a clinical terminology - the Systematised Nomenclature of Medicine Clinical Terms. It is a common computerised language that will be used by all computers in the NHS to facilitate communications between healthcare professionals in clear and unambiguous terms.” </li></ul><ul><li>373,731 classes and over 1 million terms </li></ul><ul><li>NHS version extended to 542,380 classes </li></ul><ul><li>Large ontology classified in < 4 hours </li></ul><ul><li>Reasoner finds inconsistencies: e.g. 180 missing subclasses </li></ul><ul><li>Periocular_dermatitis subClassOf Disease_of_face </li></ul>
  15. 15. Why? 4. More examples <ul><li>Pharmaceutical, biotechnology, drug discovery etc… </li></ul><ul><li>ChEBI: Chemical Entities of Biological Interest </li></ul><ul><li>http://www.ebi.ac.uk/chebi </li></ul><ul><li>“freely available dictionary of ‘small’ chemical compounds” (e.g. many drugs) using OWL </li></ul><ul><li>Currently contains ~500,000 small molecules, OWL is used to automate curation of the database and check quality </li></ul>
  16. 16. Why no.5 : Standardisation <ul><li>There are plenty of different ontology languages: </li></ul><ul><li>… OWL is the only one that is a W3C standard… </li></ul><ul><li>Large and active community of developers and users around the world </li></ul><ul><li>Choice of tools to handle OWL </li></ul><ul><li>Interoperability </li></ul><ul><li>etc </li></ul>
  17. 17. Why? Summary <ul><li>Biomedical applications </li></ul><ul><ul><li>Healthcare and Life Sciences </li></ul></ul><ul><ul><li>Lots of terminology </li></ul></ul><ul><li>Big data </li></ul><ul><ul><li>Gigabytes / Terabytes of data </li></ul></ul><ul><ul><li>Manual curation not possible </li></ul></ul><ul><li>Scientific applications (W3C HCLSIG) </li></ul><ul><ul><li>Precision and Accuracy are important </li></ul></ul><ul><ul><li>http://www.w3.org/2001/sw/hcls/ </li></ul></ul>“ Biology is just naming things”
  18. 18. What is an OWL ontology?
  19. 19. What is an OWL ontology? <ul><li>“An ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain.” </li></ul><ul><li>http://en.wikipedia.org/wiki/Ontology_(information_science) </li></ul>
  20. 20. What? Pizza ontology <ul><li>We’re going to use Pizzas in this tutorial… </li></ul><ul><li>We could use more realistic examples but they require a specialist knowledge of: </li></ul><ul><li>Biochemistry </li></ul><ul><li>Cancer </li></ul><ul><li>Medicine </li></ul><ul><li>etc </li></ul><ul><li>… Whereas we are all “experts” on Pizza, Pizzas are the “Hello world” of ontologies </li></ul><ul><li>http://www.co-ode.org/ontologies/pizza/ </li></ul>Pizza from http://www.flickr.com/photos/roadsidepictures/1544645159/
  21. 21. Pizza Margherita Pizza Vegetarian Pizza Spicy Beef Pizza What? A simple pizza ontology hasTopping (Object property) hasBase (Object property) subclassOf Pizza (A class in asserted hierarchy) Pizza Topping Vegetable topping Tomato topping Mozzarella topping Cheese topping Pizza_base Deep dish base Regular base
  22. 22. What? Object properties <ul><li>Things you can say about properties: some , only , min , max and exactly </li></ul><ul><li>some means at least one of the toppings is a CheeseTopping </li></ul><ul><li>only means all of the toppings are CheeseTopping </li></ul><ul><li>min , max and exactly are self-explanatory </li></ul>
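As a rough illustration of how some and only differ, the Python sketch below treats one pizza's toppings as a plain list of class names. The toy hierarchy and the helper names (`is_a`, `has_some`, `has_only`, `has_min`) are assumptions made up for this example, not part of OWL or Protégé, and the list is treated as complete, which a real reasoner working under open-world semantics would not assume.

```python
# Toy subclass table (hypothetical, for illustration only): child -> parent.
SUBCLASS_OF = {
    "MozzarellaTopping": "CheeseTopping",
    "ParmesanTopping": "CheeseTopping",
    "TomatoTopping": "VegetableTopping",
    "CheeseTopping": "PizzaTopping",
    "VegetableTopping": "PizzaTopping",
}


def is_a(cls, ancestor):
    """True if cls is ancestor itself or a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def has_some(toppings, cls):
    """'hasTopping some cls': at least one topping belongs to cls."""
    return any(is_a(t, cls) for t in toppings)


def has_only(toppings, cls):
    """'hasTopping only cls': every topping belongs to cls (true if there are none)."""
    return all(is_a(t, cls) for t in toppings)


def has_min(toppings, n, cls):
    """'hasTopping min n cls': at least n toppings belong to cls."""
    return sum(1 for t in toppings if is_a(t, cls)) >= n


margherita = ["MozzarellaTopping", "TomatoTopping"]
four_cheese = ["MozzarellaTopping", "ParmesanTopping"]

print(has_some(margherita, "CheeseTopping"))
print(has_only(margherita, "CheeseTopping"))
print(has_only(four_cheese, "CheeseTopping"))
```

Note that `has_only` holds for a pizza with no toppings at all, which mirrors the way an only restriction on its own does not require any topping to exist.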
  23. 23. What? Object properties <ul><li>More things you can say about properties </li></ul><ul><li>Symmetric e.g. touches (or spouse ) </li></ul><ul><li>PizzaTopping touches PizzaBase </li></ul><ul><li>implies </li></ul><ul><li>PizzaBase touches PizzaTopping </li></ul><ul><li>Transitive e.g. subClassOf </li></ul><ul><li>Pizza subClassOf Food </li></ul><ul><li>CheeseyPizza subClassOf Pizza </li></ul><ul><li>implies </li></ul><ul><li>CheeseyPizza subClassOf Food </li></ul><ul><li>These are important for reasoning </li></ul>
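The two inferences on this slide can be sketched in a few lines of Python by treating a property as a set of (subject, object) pairs. This is only a toy model of what a reasoner derives; the function names are assumptions introduced for this example.

```python
def symmetric_closure(pairs):
    """Add (b, a) for every asserted (a, b)."""
    return set(pairs) | {(b, a) for (a, b) in pairs}


def transitive_closure(pairs):
    """Repeatedly add (a, c) whenever (a, b) and (b, c) are present."""
    closure = set(pairs)
    while True:
        new_pairs = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        }
        if new_pairs <= closure:  # nothing new was derived, so stop
            return closure
        closure |= new_pairs


# Asserted statements taken from the slide.
touches = symmetric_closure({("PizzaTopping", "PizzaBase")})
subclass_of = transitive_closure({
    ("Pizza", "Food"),
    ("CheeseyPizza", "Pizza"),
})

print(("PizzaBase", "PizzaTopping") in touches)
print(("CheeseyPizza", "Food") in subclass_of)
```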
  24. 24. What? Structure of ontology <ul><li>Classes and properties: Terminology (TBox) Instances: Assertions (ABox) </li></ul><ul><li>TBox is similar to a database schema e.g. </li></ul><ul><li>Pizza hasBase PizzaBase </li></ul><ul><li>VegetarianPizza hasTopping Vegetables </li></ul><ul><li>etc. </li></ul><ul><li>ABox is similar to data (instances) in a database </li></ul><ul><li>ThisPizza is-an-instance-of CajunPizza </li></ul><ul><li>America is-an-instance-of Country </li></ul><ul><li>Fred is-an-instance-of DogLover </li></ul><ul><li>ABox + TBox combined are called a “knowledge base” </li></ul>
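One way to picture the TBox/ABox split is as two small Python dictionaries, with the TBox consulted whenever the ABox is queried. This is a simplified sketch: the subclass links (CajunPizza under Pizza, Pizza under Food) and the second individual are assumptions added for illustration, and a real knowledge base allows multiple superclasses and much richer axioms.

```python
# TBox: terminology (schema-like). Assumed single-parent hierarchy: child -> parent.
TBOX = {
    "CajunPizza": "Pizza",
    "MargheritaPizza": "Pizza",
    "Pizza": "Food",
}

# ABox: assertions about individuals (data-like): individual -> asserted class.
ABOX = {
    "ThisPizza": "CajunPizza",
    "ThatPizza": "MargheritaPizza",  # hypothetical extra individual
    "America": "Country",
}


def superclasses(cls):
    """Return cls followed by all of its ancestors in the TBox."""
    result = [cls]
    while cls in TBOX:
        cls = TBOX[cls]
        result.append(cls)
    return result


def instances_of(cls):
    """Individuals whose asserted class is cls or any subclass of cls."""
    return sorted(
        individual
        for individual, asserted in ABOX.items()
        if cls in superclasses(asserted)
    )


print(instances_of("Pizza"))
print(instances_of("Country"))
```

Nothing in the ABox says that ThisPizza is a Pizza or a Food; that answer only appears because the TBox is used at query time.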
  25. 25. What? Logic and Reasoning <ul><li>A key feature of OWL is reasoning (aka classification) , with a Description Logic (DL) reasoner (a bit like a source code compiler). </li></ul><ul><li>There are four basic tasks a reasoner can perform: </li></ul><ul><li>Subsumption : check that knowledge is correct </li></ul><ul><li>Equivalence : check for minimal redundancy </li></ul><ul><li>Consistency : check for contradictions </li></ul><ul><li>Instantiation : is a an instance of b ? </li></ul><ul><li>The reasoner infers new information from your asserted class hierarchy and builds a new inferred class hierarchy based on your definitions </li></ul><ul><li>Automates classification that might otherwise be done manually </li></ul>
  26. 26. What? OWL Subsumption <ul><li>Check knowledge is “correct” </li></ul><ul><li>E.g. Fiorentina should be a subclass of VegetarianPizza? </li></ul><ul><li>If inferred hierarchy is inconsistent with intuition then this indicates an error </li></ul><ul><li>(in your model) </li></ul>
  27. 27. What? OWL equivalence <ul><li>Similar to subsumption, a reasoner will tell you when two classes are equivalent e.g. </li></ul><ul><li>BoringPizza is equivalent to a MargheritaPizza </li></ul><ul><li>PizzaTopping is equivalent to PizzaBase ? </li></ul>
  28. 28. What? OWL Consistency <ul><li>Consistency: check that no contradictory statements have been made: in Protégé these are highlighted in red </li></ul><ul><li>Cheese and Vegetable are disjoint classes (can’t be both) </li></ul><ul><li>CheeseyVegetable is a subclass of Cheese </li></ul><ul><li>CheeseyVegetable is a subclass of Vegetable </li></ul>
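The CheeseyVegetable example can be mimicked with a small Python check that flags any class sitting under two classes declared disjoint. This is a toy sketch, far simpler than what a Description Logic reasoner such as FaCT++ does; the data structures and function names are assumptions for illustration.

```python
# Asserted superclasses, as on the slide: class -> set of direct superclasses.
SUBCLASS_OF = {
    "CheeseyVegetable": {"Cheese", "Vegetable"},
    "Cheese": set(),
    "Vegetable": set(),
}

# Pairs of classes asserted to be disjoint (nothing can belong to both).
DISJOINT = {frozenset({"Cheese", "Vegetable"})}


def ancestors(cls):
    """All direct and indirect superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unsatisfiable_classes():
    """Classes that fall under both members of some disjoint pair."""
    problems = []
    for cls in SUBCLASS_OF:
        supers = ancestors(cls) | {cls}
        if any(pair <= supers for pair in DISJOINT):
            problems.append(cls)
    return problems


print(unsatisfiable_classes())
```

Without the disjointness axiom the check reports nothing, which is the situation the hands-on exercise starts from.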
  29. 29. What? OWL instantiation <ul><li>Check for instances of a class </li></ul><ul><li>E.g. Show me all the instances of CheeseyPizza </li></ul><ul><li>Important for querying (not covered here) </li></ul>
  30. 30. Where does OWL fit with related technology?
  31. 31. Where does OWL fit? <ul><li>“An ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain.” </li></ul><ul><li>http://en.wikipedia.org/wiki/Ontology_(information_science) </li></ul><ul><li>Sounds a little bit like: </li></ul><ul><li>RDF and RDF Schema </li></ul><ul><li>Relational Databases </li></ul><ul><li>XML Schema </li></ul><ul><li>etc? </li></ul><ul><li>also Linked data </li></ul>
  32. 32. Where does OWL fit? <ul><li>The Semantic Web: Will it all end in tiers? </li></ul>[Layer diagram: Unicode + URIs + namespaces; syntaxes (RDF/XML, OWL/XML, etc.); explicit semantics (OWL 2.0); SPARQL (previous tutorial); SPARQL-DL (to be done)] OWL builds on top of standards you already know or have just learned about at the XML summer school
  33. 33. Where? Relational databases <ul><li>Some key differences between OWL and Relational Databases (DBMS) </li></ul><ul><li>Open World Semantics </li></ul><ul><li>Rejecting updates </li></ul><ul><li>Use of schema to answer queries </li></ul><ul><li>There are more differences, see Reference [6] “Ontologies and the semantic web” at the end for more details </li></ul>
  34. 34. Where? Open World <ul><li>Open World Assumption: missing information is treated as unknown rather than false </li></ul><ul><li>c.f. Databases which make closed world assumption </li></ul>In a social networking website, missing information (who all your friends are) is often treated as false: e.g. “ You have NO friends ” (loser!) This is a subtle but important distinction On http://network.nature.com/people/duncan
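The difference between the two assumptions can be sketched in Python with a three-valued answer: True, False, or None for "unknown". The names and the friendship data below are hypothetical and only serve to illustrate the idea.

```python
# Facts we happen to know (hypothetical example data).
KNOWN_FRIENDS = {("fred", "alice")}
KNOWN_NON_FRIENDS = {("fred", "mallory")}


def closed_world_is_friend(a, b):
    """Database-style answer: anything not recorded is treated as false."""
    return (a, b) in KNOWN_FRIENDS


def open_world_is_friend(a, b):
    """OWL-style answer: anything not recorded is unknown (None)."""
    if (a, b) in KNOWN_FRIENDS:
        return True
    if (a, b) in KNOWN_NON_FRIENDS:
        return False
    return None


print(closed_world_is_friend("fred", "bob"))
print(open_world_is_friend("fred", "bob"))
```

Under the open world assumption a "no" has to be stated or derived explicitly, which is why disjointness axioms matter in the exercises.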
  35. 35. Where? Rejecting updates <ul><li>Unlike DBMS, ontology tools typically don't reject updates that result in the ontology becoming inconsistent, they just warn. </li></ul><ul><li>You’ll see this in the exercises… </li></ul>
  36. 36. Where? Query answering <ul><li>In OWL, the schema plays a much more important role and is actively considered at query time (but discarded with DBMS) - this makes it possible to answer conceptual queries e.g. </li></ul>Pizza from http://www.flickr.com/photos/roadsidepictures/1544645159/ Is any Pizza that hasTopping Cheese necessarily a CheeseyPizza?
  37. 37. Where? OWL and Linked Data <ul><li>Use URIs to identify things that you expose to the Web as resources. YES, everything important has a URI </li></ul><ul><li>Use HTTP URIs so that people can locate and look up (dereference) these things. YES, but don’t have to </li></ul><ul><li>Provide useful information about the resource when its URI is dereferenced. YES, but don’t have to </li></ul><ul><li>Include links to other, related URIs in the exposed data as a means of improving information discovery on the Web. YES, but again this is optional </li></ul>Returning to the linked data session… http://www.co-ode.org/ontologies/pizza/ for an example of owl and linked data
  38. 38. Who is responsible for OWL?
  39. 39. Who? <ul><li>OWL is managed by a Working Group at the W3C </li></ul><ul><li>http://www.w3.org/2007/OWL/ </li></ul><ul><li>A large group of people chaired by : </li></ul>http://web.comlab.ox.ac.uk/ian.horrocks/ http://sciencecommons.org/about/whoweare/ruttenberg/ Ian Horrocks, The University of Oxford Alan Ruttenberg, Science Commons
  40. 40. Who? <ul><li>Every year developers and users of OWL gather at OWLED (OWL Experiences and Directions) </li></ul><ul><li>http://www.webont.org/owled/ </li></ul><ul><li>6th International Workshop on 23-24th October 2009, Chantilly, Virginia, USA </li></ul><ul><li>Co-located with the 8th International Semantic Web Conference (ISWC) 25-29th October http://iswc2009.semanticweb.org/ Washington, DC, USA. </li></ul>
  41. 41. When was OWL created?
  42. 42. When? <ul><li>OWL 1.0 became a W3C recommendation in 2004 </li></ul><ul><li>http://www.w3.org/2004/OWL/ </li></ul><ul><li>http://www.w3.org/TR/owl-semantics/ </li></ul><ul><li>OWL 2.0 became a candidate recommendation in 2009 </li></ul><ul><li>http://www.w3.org/TR/owl2-profiles/ </li></ul><ul><li>See “OWL 2.0: The next step for OWL” in the references at the end… </li></ul><ul><li>Compare that to XML 1.0 which became a recommendation in 1998… </li></ul>
  43. 43. When? <ul><li>But ontologies generally are much older than that… </li></ul>CC picture from http://en.wikipedia.org/wiki/File:Sanzio_01_Plato_Aristotle.jpg A Aristotle Οντολογία <ul><li>Linguistics </li></ul><ul><li>Natural Language Processing (NLP) </li></ul><ul><li>Philosophy </li></ul><ul><li>Data mining </li></ul><ul><li>Text mining </li></ul>
  44. 44. How can you use OWL?
  45. 45. How? Protégé <ul><li>Protégé is a free, Open Source ontology editor </li></ul><ul><li>http://protege.stanford.edu/ </li></ul><ul><li>http://protege.stanford.edu/download/protege/4.0/installanywhere/ </li></ul><ul><li>Protégé research & development has been led by </li></ul><ul><li>Professor Mark Musen </li></ul><ul><li>Stanford University, USA </li></ul><ul><li>Professor Alan Rector </li></ul><ul><li>University of Manchester, UK </li></ul><ul><li>Protégé supports the latest version of OWL (OWL 2.0) and uses the OWL-API http://owlapi.sourceforge.net/ </li></ul>Mark Musen Alan Rector
  46. 46. How? Hands-on tutorial <ul><li>See tutorial slides at the end </li></ul><ul><li>Don’t worry if you can’t complete all the exercises, there is one exercise too many, just in case. </li></ul>
  47. 47. Acknowledgements <ul><li>John Chelsom and Lauren Wood </li></ul><ul><li>Information Management Group (IMG) and Bio-Health Informatics Group (BHIG) at The University of Manchester: Alan Rector, Matthew Horridge, Simon Jupp, Nick Drummond, Robert Stevens, Holger Knublauch, Georgina Moulton, Chris Wroe, Ulrike Sattler, Ian Horrocks, Bijan Parsia, Sean Bechhofer, Carole Goble and many others </li></ul><ul><li>Currently funded by www.bbsrc.ac.uk as part of REFINE project www.nactem.ac.uk/refine devised by Douglas Kell and Sophia Ananiadou </li></ul><ul><li>substantial parts of this tutorial and slides have been developed by the http://www.co-ode.org/ project with funding from www.jisc.ac.uk </li></ul>
  48. 48. Any questions? Thank you for your attention
  49. 49. How? Protégé tutorial ex. 2 <ul><li>Start Protégé </li></ul><ul><li>Click on “Open OWL Ontology” </li></ul><ul><li>Open the exercise 2 ontology: select “pizza-ex2.owl” from the exercise folder (exercise 1 of building this ontology has been done for you to save time) </li></ul><ul><li>Explore the “asserted class hierarchy” by clicking on the classes in the “classes” tab </li></ul><ul><li>Add some new subclasses by selecting MeatTopping and then pressing “Add Subclass” button (top left button in asserted classes hierarchy) </li></ul>
  50. 50. How? Protégé tutorial, ex. 2 cont. <ul><li>Note that the MeatyVegetableTopping has been asserted to be a subclass of both Meat and Vegetable (see “Superclasses” in the “Description” pane on the right-hand side). Is this inconsistent? </li></ul><ul><li>Click on the “Inferred class hierarchy” and note that it should be empty (apart from a single class called “Thing”) </li></ul><ul><li>On the “Reasoner” menu, select a reasoner (there are different reasoners available but “FaCT++” is easiest to use for this exercise) </li></ul><ul><li>Now select “Classify” from the same menu; this will run the reasoner. What is the result? </li></ul><ul><li>Save the result (note the different available syntaxes for saving ontologies) </li></ul>
  51. 51. How? Protégé tutorial ex. 2 cont. <ul><li>To make sure toppings cannot be both meat and vegetable at the same time you need to add disjoint axioms to explicitly state the disjointness. </li></ul><ul><li>Select one of your top level concepts (e.g. Pizza) and press ctrl-J (Windows) or cmd-J (Mac) to make Pizza disjoint from all its sibling classes </li></ul><ul><li>Note that the “Description” pane lists all classes Pizza is now disjoint with (e.g. PizzaBase) </li></ul><ul><li>Repeat this for the MeatTopping level of the ontology </li></ul><ul><li>Run the reasoner. Is MeatyVegetableTopping now inconsistent as expected? </li></ul>
  52. 52. How? Protégé tutorial ex. 4 <ul><li>Exercise 3 has been skipped: close your current ontology and open the solution “pizza-ex3.owl” to start ex. 4 </li></ul><ul><li>Exercise 4 is included here for keen students (and anyone wanting to do some homework) </li></ul><ul><li>In order to describe our classes more fully we need properties which relate members of a class. We can then add restrictions on the class to state how the properties are used. </li></ul><ul><li>At this stage we are creating Primitive Classes, which only have Necessary Conditions. These are conditions that must be satisfied by all members of this class </li></ul><ul><li>In the “Object Properties” tab, select the property called hasTopping (this is a relation between two classes). </li></ul><ul><li>Back in the classes tab, create a new subclass of Pizza called NamedPizza </li></ul><ul><li>Create a new subclass of NamedPizza called MargheritaPizza </li></ul>
  53. 53. How? Protégé ex. 4 continued <ul><li>Create restrictions on MargheritaPizza: In the “Description” pane under “Superclasses” click on the “+” button to add a restriction </li></ul><ul><li>Type “hasTopping some MozzarellaTopping” (This says that it is a necessary condition for a MargheritaPizza to have at least one Topping that is a MozzarellaTopping) </li></ul><ul><li>Repeat this process to state that this kind of pizza also “hasTopping some TomatoTopping” </li></ul><ul><li>Run the reasoner to check for consistency </li></ul>
  54. 54. References <ul><li>1. Protégé is a free, open source ontology editor and knowledge-base framework that is available from http://protege.stanford.edu The version you have been using in this tutorial is Protégé 4.x </li></ul><ul><li>2. The CO-ODE project http://www.co-ode.org has lots more useful material on ontologies. For example, a complete finished version of the Pizza and other ontologies are available from http://www.co-ode.org/ontologies/ </li></ul><ul><li>3. Matthew Horridge (2004) Protégé OWL Tutorial. This is a comprehensive guide to OWL, more complete than this tutorial and available from http://owl.cs.manchester.ac.uk/tutorials/protegeowltutorial/ see some more examples at http://owl.cs.manchester.ac.uk/2009/07/sssw and software http://owl.cs.manchester.ac.uk </li></ul><ul><li>4. Alan Rector, Nick Drummond, Matthew Horridge, Jeremy Rogers, Holger Knublauch, Robert Stevens, Hai Wang, Chris Wroe (2004) OWL Pizzas: Practical Experience of Teaching OWL-DL: Common Errors and Common Patterns. In Proc. of European Conference on Knowledge Acquisition (EKAW'04), Vol. 3257 (2004), pp. 63-81. http://www.co-ode.org/resources/papers/ekaw2004.pdf gives an overview of common errors and pitfalls (with solutions) to building ontologies in OWL using pizzas as an example </li></ul><ul><li>5. Ian Horrocks (2003) From SHIQ and RDF to OWL: the making of a Web Ontology Language. Journal of Web Semantics: Science, Services and Agents on the World Wide Web, Vol. 1, No. 1. (December 2003), pp. 7-26. (this paper gives a readable overview of the relationship between RDF and OWL with some history on the development and integration of the two languages) A free version of this paper is available from http://www.comlab.ox.ac.uk/people/ian.horrocks/Publications/download/2003/HoPH03a.pdf </li></ul><ul><li>6. Ian Horrocks (2008) Ontologies and the semantic web. Commun. ACM, Vol. 51, No. 12, pp. 58-67. http://www.comlab.ox.ac.uk/people/ian.horrocks/Publications/download/2008/Horr08a.pdf gives a nice overview of the differences between OWL and relational databases </li></ul><ul><li>7. These and other papers relating to OWL, Ontologies and Protégé are available in citeulike tagged as “xml summer school” at http://www.citeulike.org/tag/xml-summer-school </li></ul>