
An Introduction to Topic Maps, Ontologies, and Published Subjects



  1. 1. The TAO of Topic Maps: An Introduction to Topic Maps, Ontologies, and Published Subjects. Steve Pepper, CEO, Ontopia; Convenor, ISO/IEC JTC 1/SC 34/WG 3; Editor, XML Topic Maps
  2. 2. Who am I? <ul><li>Steve Pepper </li></ul><ul><ul><li>Norway’s Head of Delegation to ISO SC34 </li></ul></ul><ul><ul><li>Convenor of ISO/IEC JTC 1/SC 34/WG 3 (Information Association) </li></ul></ul><ul><ul><li>Editor of XML Topic Maps 1.0 specification (XTM) </li></ul></ul><ul><ul><li>Editor of Topic Map Constraint Language </li></ul></ul><ul><ul><li>Founder and CEO of Ontopia </li></ul></ul><ul><li>Ontopia </li></ul><ul><ul><li>The Topic Map Company </li></ul></ul><ul><ul><li>Specialists in Topic Map Software and Services </li></ul></ul><ul><ul><li>Norwegian company, headquartered in Oslo </li></ul></ul>
  3. 3. What are Topic Maps? <ul><li>An international standard, approved by the ISO </li></ul><ul><li>A form of knowledge representation that is optimized for information management </li></ul><ul><li>A formal data model with an XML interchange syntax </li></ul><ul><li>An indexing and navigation paradigm for humans </li></ul><ul><li>A source of intelligent data for software agents </li></ul><ul><li>A technology for exploiting ontologies </li></ul>
  4. 4. Introducing the Topic Map Model <ul><li>The core concepts of Topic Maps are based on those of the back-of-book index </li></ul><ul><li>The same basic concepts have been extended and generalized for use with digital information </li></ul><ul><li>Envisage a 2-layer data model consisting of </li></ul><ul><ul><li>a set of information resources (below), and </li></ul></ul><ul><ul><li>a “knowledge map” (above) </li></ul></ul><ul><li>This is like the division of a book into content and index </li></ul>[Diagram: knowledge layer (index) above information layer (content)]
  5. 5. (1) The Information Layer <ul><li>The lower layer contains the content </li></ul><ul><ul><li>usually digital, but need not be </li></ul></ul><ul><ul><li>can be in any format or notation </li></ul></ul><ul><ul><li>can be text, graphics, video, audio, etc. </li></ul></ul><ul><li>This is like the content of the book to which the back-of-book index belongs </li></ul>information layer
  6. 6. (2) The Knowledge Layer <ul><li>The upper layer consists of topics and associations </li></ul><ul><ul><li>Topics represent the subjects that the information is about </li></ul></ul><ul><ul><ul><li>Like the list of topics that forms a back-of-book index </li></ul></ul></ul><ul><ul><li>Associations represent relationships between those subjects </li></ul></ul><ul><ul><ul><li>Like “see also” relationships in a back-of-book index </li></ul></ul></ul>[Diagram: knowledge layer. Tosca, composed by Puccini; Madame Butterfly, composed by Puccini; Puccini, born in Lucca]
  7. 7. Linking the Layers Through Occurrences <ul><li>The two layers are linked together </li></ul><ul><ul><li>Occurrences are information resources that are pertinent to a given knowledge topic </li></ul></ul><ul><ul><li>The links (or locators) are like page numbers in a back-of-book index </li></ul></ul>[Diagram: the topics Puccini, Tosca, Lucca, and Madame Butterfly in the knowledge layer, joined by “composed by” and “born in” associations and linked down to resources in the information layer]
  8. 8. Summary of Core Topic Maps Concepts <ul><li>= The TAO of Topic Maps </li></ul><ul><li>A pool of information or data </li></ul><ul><ul><li>any type or format </li></ul></ul><ul><li>A knowledge layer, consisting of: </li></ul><ul><li>Topics </li></ul><ul><ul><li>a set of knowledge topics for the domain in question </li></ul></ul><ul><li>Associations </li></ul><ul><ul><li>expressing relationships between knowledge topics </li></ul></ul><ul><li>Occurrences </li></ul><ul><ul><li>information that is relevant in some way to a given knowledge topic </li></ul></ul>[Diagram: knowledge layer (Puccini, Tosca, Lucca, Madame Butterfly, with “composed by” and “born in” associations) above information layer]
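  9. 9. Topic Maps and Back-of-Book Indexes <ul><li>Sample index: </li></ul><ul><ul><li>Cavalleria Rusticana, 71, 203-204 </li></ul></ul><ul><ul><li>Mascagni, Pietro: Cavalleria Rusticana, 71, 203-204 </li></ul></ul><ul><ul><li>Rustic Chivalry, see Cavalleria Rusticana </li></ul></ul><ul><ul><li>singers, 39-52 (see also individual names); subentries: baritone, 46; bass, 46-47; soprano, 41-42, 337; tenor, 44-45 </li></ul></ul><ul><li>Basic concepts: </li></ul><ul><ul><li>topics (and types) </li></ul></ul><ul><ul><li>topics with multiple names </li></ul></ul><ul><ul><li>occurrences (and types) </li></ul></ul><ul><ul><li>associations (and types) </li></ul></ul><ul><li>+ other conventions (composer) n </li></ul><ul><li>+ multiple indexes </li></ul><ul><ul><li>Index of names </li></ul></ul><ul><ul><li>Index of places </li></ul></ul><ul><ul><li>Index of subjects </li></ul></ul>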
  10. 10. Topic Maps and Ontologies <ul><li>The basic building blocks are </li></ul><ul><ul><li>Topics: e.g. “Puccini”, “Lucca”, “Tosca” </li></ul></ul><ul><ul><li>Associations: e.g. “Puccini was born in Lucca” </li></ul></ul><ul><ul><li>Occurrences: e.g. “ is a biography of Puccini” </li></ul></ul><ul><li>Each of these constructs can be typed </li></ul><ul><ul><li>Topic types: “composer”, “city”, “opera” </li></ul></ul><ul><ul><li>Association types: “born in”, “composed by” </li></ul></ul><ul><ul><li>Occurrence types: “biography”, “street map”, “synopsis” </li></ul></ul><ul><li>All such types are also topics (within the same topic map) </li></ul><ul><ul><li>“Puccini” is a topic of type “composer” … and “composer” is also a topic </li></ul></ul><ul><li>A topic map thus contains its own ontology </li></ul><ul><ul><li>(“Ontology” is here defined as the classes of things that exist in the domain…) </li></ul></ul><ul><li>Demo of the Omnigator </li></ul>
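The building blocks on this slide can be sketched as a small data structure. This is only an illustration in Python, not the Topic Maps data model as standardized and not Ontopia's API: the class names, the role labels "person" and "place", and the example.org locator are all assumptions made for the example. The point it shows is that types are ordinary topics, so the ontology can be read straight out of the map.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic stands for one subject; its types are themselves topics."""
    name: str
    types: list = field(default_factory=list)        # list of Topic
    occurrences: list = field(default_factory=list)  # list of Occurrence


@dataclass
class Occurrence:
    """A link from a topic to an information resource, typed by a topic."""
    type: Topic
    locator: str  # address of the resource (hypothetical URL below)


@dataclass
class Association:
    """A typed relationship; each participating topic plays a named role."""
    type: Topic
    roles: dict  # role label -> Topic (labels here are made up)


# Types are ordinary topics in the same topic map.
composer = Topic("composer")
city = Topic("city")
born_in = Topic("born in")
biography = Topic("biography")

puccini = Topic("Puccini", types=[composer])
lucca = Topic("Lucca", types=[city])
puccini.occurrences.append(
    Occurrence(biography, "http://example.org/puccini-bio.html")
)
birth = Association(born_in, {"person": puccini, "place": lucca})

# The ontology is simply the set of topics that are used as types.
ontology = {t.name for t in puccini.types + lucca.types}
ontology.add(birth.type.name)
ontology.update(o.type.name for o in puccini.occurrences)
```

Because "composer" is itself a Topic, it could carry its own names, occurrences, and associations in exactly the same way as "Puccini".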
  11. 11. The Omnigator A free topic map browser and debugger Online demo:  Download: 
  12. 12. The Omnigator: A Generic Topic Map Browser <ul><li>An Omnivorous Topic Map Navigator </li></ul><ul><ul><li>The Omnigator will Eat Anything (provided it’s a topic map!) </li></ul></ul><ul><ul><li>Any Ontology: including your own </li></ul></ul><ul><ul><li>Just drop your own topic map into the Omnigator directory and away you go! </li></ul></ul><ul><ul><li>The Omnigator makes “reasonable sense” out of any “reasonably sensible” topic map </li></ul></ul><ul><li>And it's Free! </li></ul><ul><ul><li>Download it from the Ontopia web site </li></ul></ul><ul><ul><ul><li> </li></ul></ul></ul><ul><ul><li>Or view it online at </li></ul></ul><ul><ul><ul><li> </li></ul></ul></ul>
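  13. 13. How the Omnigator Works [Diagram: a client talks over HTTP to a server; on the server, the Omnigator runs in a J2EE web server (e.g. Tomcat) on top of the Ontopia Topic Map Engine, which reads a topic map; HTML pages are returned to the client]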
  14. 14. [Screenshot callouts: current topic; (multiple) names; (multiple) types; multiple occurrences; multiple associations]
  15. 15. With this Simple but Flexible Model You Can <ul><li>Make knowledge explicit, by </li></ul><ul><ul><li>Identifying the subjects that your information is about </li></ul></ul><ul><ul><li>Expressing the relationships between those subjects </li></ul></ul><ul><li>Bridge the domains of knowledge and information, by </li></ul><ul><ul><li>Describing where to find information about the subjects </li></ul></ul><ul><ul><li>Linking information about a common subject across multiple repositories </li></ul></ul><ul><li>Transcend simple categories, hierarchies, and taxonomies, by </li></ul><ul><ul><li>Applying rich associative structures that capture the complexity of knowledge </li></ul></ul><ul><li>Enable implicit knowledge to be made explicit, by </li></ul><ul><ul><li>Providing clearly identifiable hooks for attaching implicit knowledge </li></ul></ul><ul><li>But there’s more (of course)… </li></ul>
  16. 16. Supporting Context through Scope <ul><li>Topic Maps are about representing knowledge </li></ul><ul><li>Knowledge is not absolute; it has a contextual aspect </li></ul><ul><li>Context sensitivity is handled through the concept of scope </li></ul><ul><li>Scope makes it possible to </li></ul><ul><ul><li>Cater for the subjectivity of knowledge </li></ul></ul><ul><ul><li>Express multiple viewpoints in one knowledge base </li></ul></ul><ul><ul><li>Provide personalized views for different groups of users </li></ul></ul><ul><ul><li>Track the source of knowledge during merging </li></ul></ul><ul><li>(Scopes are defined as sets of topics) </li></ul>
  17. 17. How Scope Works <ul><li>Topics have “characteristics” </li></ul><ul><ul><li>A topic’s names and occurrences, and the roles it plays in associations with other topics </li></ul></ul><ul><li>Every characteristic is valid within some context (scope), e.g. </li></ul><ul><ul><li>the name “Ruotsi” for the topic Sweden in the scope “Finnish” </li></ul></ul><ul><ul><li>a certain information occurrence in the scope “technician” </li></ul></ul><ul><ul><li>a given association is true in the scope (according to) “Authority X” </li></ul></ul>[Diagram: filtering by scope. A topic (T) with several names, occurrences, and association roles; after filtering, only the characteristics valid in the chosen scope remain]
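Filtering by scope can be sketched in a few lines of Python. The sketch assumes one particular filtering rule: a characteristic is shown when its scope is empty (unconstrained) or is fully contained in the user's context. That rule, the `Name` class, and the function name are assumptions for illustration; real topic map engines offer several filtering strategies. The names "Sverige" and "SE" are added here only to have something to filter out and keep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A topic name that is valid within a scope (a set of scoping topics)."""
    value: str
    scope: frozenset = frozenset()  # empty scope = valid in any context


def names_in_scope(names, context):
    """Keep names whose scope is empty or fully covered by the context."""
    return [n.value for n in names if n.scope <= context]


# Names of the topic "Sweden"; scoping topics are shown as plain strings.
sweden_names = [
    Name("Sweden", frozenset({"English"})),
    Name("Ruotsi", frozenset({"Finnish"})),
    Name("Sverige", frozenset({"Swedish"})),
    Name("SE"),  # unscoped
]

finnish_view = names_in_scope(sweden_names, frozenset({"Finnish"}))
```

The same test applies unchanged to occurrences and association roles, since all three are characteristics that carry a scope.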
  18. 18. Applications of Scope <ul><li>Multiple world views </li></ul><ul><ul><li>Reality is ambiguous and knowledge has a subjective dimension </li></ul></ul><ul><ul><li>Scope allows the expression of multiple perspectives in a single Topic Map </li></ul></ul><ul><li>Contextual knowledge </li></ul><ul><ul><li>Some knowledge is only valid in a certain context, and not valid otherwise </li></ul></ul><ul><ul><li>Scope enables the expression of contextual validity </li></ul></ul><ul><li>Traceable knowledge aggregation </li></ul><ul><ul><li>When the source of knowledge is as important as the knowledge itself: </li></ul></ul><ul><ul><li>Scope allows retention of knowledge about the source of knowledge </li></ul></ul><ul><li>Personalized knowledge </li></ul><ul><ul><li>Different users have different knowledge requirements </li></ul></ul><ul><ul><li>Scope permits personalization based on personal preferences, skill levels, security clearance, etc. </li></ul></ul><ul><li>Demo of scope and filtering in the Omnigator </li></ul>
  19. 19. How Topic Maps Improve Access to Information <ul><li>Intuitive navigational interfaces for humans </li></ul><ul><ul><li>The topic/association layer mirrors the way people think </li></ul></ul><ul><li>Powerful semantic queries for applications </li></ul><ul><ul><li>A formal underlying data structure </li></ul></ul><ul><ul><li>Demo of querying in the Omnigator </li></ul></ul><ul><li>Customized views based on individual requirements </li></ul><ul><ul><li>Personalization based on scope </li></ul></ul><ul><li>Information aggregation “sans frontières” </li></ul><ul><ul><li>Topic Maps can be merged automatically… </li></ul></ul><ul><ul><li>Demo of merging in the Omnigator </li></ul></ul>
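To give a feel for what a semantic query over typed associations looks like, here is a plain-Python sketch. It is not the query language demonstrated in the Omnigator; the tuple layout, the role labels, and the function name are assumptions made for this example. The query asks: which works were composed by someone born in a given place?

```python
# Each association is (association type, {role label: topic name}).
associations = [
    ("born in", {"person": "Puccini", "place": "Lucca"}),
    ("composed by", {"work": "Tosca", "composer": "Puccini"}),
    ("composed by", {"work": "Madame Butterfly", "composer": "Puccini"}),
    ("composed by", {"work": "Cavalleria Rusticana", "composer": "Mascagni"}),
]


def works_by_composers_born_in(place, associations):
    """Join 'born in' with 'composed by' on the person/composer topic."""
    born_there = {
        roles["person"]
        for kind, roles in associations
        if kind == "born in" and roles["place"] == place
    }
    return sorted(
        roles["work"]
        for kind, roles in associations
        if kind == "composed by" and roles["composer"] in born_there
    )


lucca_works = works_by_composers_born_in("Lucca", associations)
```

The query works because association types and roles are explicit in the data; nothing has to be inferred from free text.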
  20. 20. Principles of Merging in Topic Maps <ul><li>In Topic Maps, every topic represents some subject </li></ul><ul><li>The collocation objective requires exactly one topic per subject </li></ul><ul><ul><li>When two topic maps are merged, topics that represent the same subject should be merged to a single topic </li></ul></ul><ul><ul><li>When two topics are merged, the resulting topic has the union of the characteristics of the two original topics </li></ul></ul>[Diagram: a topic (T) and a second topic (in another topic map) “about” the same subject, each with its own names, occurrences, and association roles. Merge the two topics together, and the resulting topic has the union of the original characteristics]
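The two merging rules above can be sketched as follows. This is a simplified illustration, not the merging procedure from the standard: it assumes each topic already carries a key (`subject`) that says which subject it represents, and it models names, occurrences, and association roles as plain sets. All identifiers and sample values are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    subject: str  # hypothetical key standing for the subject represented
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)
    roles: set = field(default_factory=set)  # (association type, other topic)


def merge_topics(a, b):
    """Return a new topic with the union of both topics' characteristics."""
    if a.subject != b.subject:
        raise ValueError("topics represent different subjects")
    return Topic(a.subject, a.names | b.names,
                 a.occurrences | b.occurrences, a.roles | b.roles)


def merge_maps(map_a, map_b):
    """Merge two topic maps (dicts keyed by subject): one topic per subject."""
    merged = dict(map_a)
    for subject, topic in map_b.items():
        if subject in merged:
            merged[subject] = merge_topics(merged[subject], topic)
        else:
            merged[subject] = topic
    return merged


map_a = {
    "puccini": Topic("puccini", {"Puccini"}, {"bio.html"},
                     {("composed by", "Tosca")}),
}
map_b = {
    "puccini": Topic("puccini", {"Giacomo Puccini"}, set(),
                     {("born in", "Lucca")}),
    "lucca": Topic("lucca", {"Lucca"}),
}
merged = merge_maps(map_a, map_b)
```

After the merge there is a single "puccini" topic carrying both names and both association roles, which is the collocation objective in miniature.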
  21. 21. The Inevitable Ibsen Example [Diagram: three levels labelled “reality”, topic map (knowledge), and information. Topic labels: Henrik Ibsen, Hedda Gabler, Skien, Et dukkehjem (A Doll’s House), Nora, Helmer, Dr. Rank, Fru Linde, Krogstad. Association labels: “wrote” (twice) and “born in”. Information source labels: SNL, Cap Lex, NBL, Skien kommune, Ibsen-senter]
  22. 22. Applications of Merging <ul><li>Information integration </li></ul><ul><ul><li>Information that spans multiple repositories can be merged to provide a unified view of the whole </li></ul></ul><ul><li>Knowledge sharing across the organization </li></ul><ul><ul><li>Knowledge captured in one part of an organization can be made available to the whole organization </li></ul></ul><ul><li>Distributed knowledge management </li></ul><ul><ul><li>There is no need to centralize knowledge management in order to make it sharable </li></ul></ul><ul><li>Knowledge sharing between organizations </li></ul><ul><ul><li>Information and knowledge can be shared without enforcing a common vocabulary </li></ul></ul>
  23. 23. Integrating Information Across Systems <ul><li>Topic Maps are designed for ease of merging! </li></ul><ul><ul><li>Multiple Topic Maps can be created from many different repositories of information ... and then merged to provide a unified view of the whole </li></ul></ul><ul><li>Typical Applications: </li></ul><ul><ul><li>Integration of hitherto disconnected “islands” of information within an enterprise </li></ul></ul><ul><ul><li>Federation of knowledge from multiple sources </li></ul></ul><ul><li>Advantages: </li></ul><ul><ul><li>Consolidated access to all related information </li></ul></ul><ul><ul><li>Does not require migration of existing content </li></ul></ul>[Diagram: a Knowledge Space of topics (Customer A, Customer B, Order 1, Order 2, Product X, Product Y, Product Z, Skill Q) linked by associations (owns, orders, contains, integrates, requires), above an Information Space made up of customer, order, product, and skills databases]
  24. 24. What Makes Merging Possible? <ul><li>NOT the use of names, which are notoriously unreliable </li></ul><ul><ul><li>Names are not unambiguous </li></ul></ul><ul><ul><ul><li>(the homonym problem) </li></ul></ul></ul><ul><ul><li>Many topics have multiple names </li></ul></ul><ul><ul><ul><li>(the synonym problem) </li></ul></ul></ul><ul><li>Reliable knowledge aggregation is only possible through the use of unique global identifiers </li></ul><ul><li>The issue of identification of subjects is crucial </li></ul><ul><ul><li>If subjects have unique identifiers , people can be free to use whatever names they like – and machines can still aggregate information </li></ul></ul>
  25. 25. The Crucial Concept of Subject Identification <ul><li>Topics exist in order to allow us to discourse about subjects </li></ul><ul><li>It is crucially important to be able to establish exactly which subject a topic represents, i.e. to establish its subject identity </li></ul><ul><ul><li>Without the ability to know when applications are talking about the same thing, there can be no interoperability </li></ul></ul><ul><li>The most prevalent method of establishing identity in today’s networked environments is to use URIs </li></ul>[Diagram: “reality” set against the computer domain, which holds the knowledge layer (Puccini, Tosca, Lucca, Madame Butterfly, with “composed by” and “born in” associations) and the information layer]
  26. 26. Using URIs to Identify Resources
  27. 27. Addressable and Non-addressable Subjects <ul><li>URIs are the addresses of resources </li></ul><ul><li>They work fine when the subject is a resource (e.g. a document) </li></ul><ul><ul><li>It exists somewhere within the computer system, has a location, and can therefore be “addressed” </li></ul></ul><ul><ul><ul><li>For example, this presentation might be located at </li></ul></ul></ul><ul><ul><li>The address of an addressable subject is sufficient to unambiguously establish the subject’s identity </li></ul></ul><ul><ul><li>This is called the subject address </li></ul></ul><ul><li>But most subjects are not information resources </li></ul><ul><ul><li>Puccini, Lucca, Tosca, Madame Butterfly, love, darkness, French, … </li></ul></ul><ul><ul><li>These all exist outside the computer domain and cannot be addressed directly </li></ul></ul>
  28. 28. Subject Indicators <ul><li>The identity of non-addressable subjects is established indirectly </li></ul><ul><ul><li>Through an information resource (like a definition or a picture) that provides some kind of indication of the subject’s identity to a human </li></ul></ul><ul><ul><li>Such a resource is called a subject indicator </li></ul></ul><ul><ul><li>A topic may have multiple subject indicators </li></ul></ul><ul><li>Because it is a resource, a subject indicator has an address, even though the subject that it is indicating does not </li></ul><ul><ul><li>Computers can use the address of the subject indicator to establish identity </li></ul></ul><ul><ul><li>Such an address is called a subject identifier </li></ul></ul><ul><ul><li>Subject indicators and subject identifiers are the two sides of the human-computer dichotomy </li></ul></ul>[Diagram: three domains labelled “Life, the Universe and Everything”, “The Computer Domain”, and “The Topic Map Domain”. The subject (Puccini) is indicated by a subject indicator, a resource reading “Giacomo Puccini, Italian composer, b. Lucca 22nd Dec 1858, d. Brussels, 29th Nov 1924. Best known for his operas, of which Tosca is the most . . .”; the topic “Puccini” points to that resource through its subject identifier] © 2002 Ontopia AS
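A small Python sketch shows why identifiers, not names, should drive the decision to merge. Topics are modelled here as plain dictionaries, and every URI is a made-up example.org address rather than a real published subject identifier. Two topics are taken to represent the same subject when they share at least one subject identifier, whatever they are called.

```python
def same_subject(a, b):
    """Topics represent the same subject if they share a subject identifier."""
    return bool(a["identifiers"] & b["identifiers"])


# Synonym problem: different names, same subject.
puccini_short = {
    "names": {"Puccini"},
    "identifiers": {"http://example.org/psi/puccini"},
}
puccini_full = {
    "names": {"Giacomo Puccini"},
    "identifiers": {
        "http://example.org/psi/puccini",
        "http://example.org/composers/giacomo-puccini",
    },
}

# Homonym problem: same name, different subjects.
tosca_opera = {
    "names": {"Tosca"},
    "identifiers": {"http://example.org/psi/tosca-opera"},
}
tosca_character = {
    "names": {"Tosca"},
    "identifiers": {"http://example.org/psi/tosca-character"},
}

merge_puccini = same_subject(puccini_short, puccini_full)
merge_tosca = same_subject(tosca_opera, tosca_character)
```

A name-based comparison would get both cases wrong: it would keep the two Puccini topics apart and merge the opera with its title character.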
  29. 29. Published Subjects <ul><li>A subject indicator that has been made available for use outside one particular application is called a published subject indicator (PSI) </li></ul><ul><ul><li>Anyone can publish PSI sets </li></ul></ul><ul><ul><li>Adoption of PSI sets will be an evolutionary process that will lead to greater and greater interoperability – between topic map applications, between topic maps and RDF, and across the Semantic Web in general </li></ul></ul><ul><ul><li>Publishers and users of ontologies may be among the greatest beneficiaries </li></ul></ul><ul><li>OASIS technical committees </li></ul><ul><ul><li>pubsubj: </li></ul></ul><ul><ul><ul><li>Guidelines for publishing PSI sets </li></ul></ul></ul><ul><ul><li>geolang: </li></ul></ul><ul><ul><ul><li>A PSI set for geographical and language subjects </li></ul></ul></ul><ul><ul><ul><li>Based on existing standards (e.g. ISO 639, ISO 3166) </li></ul></ul></ul><ul><ul><li>xmlvoc: </li></ul></ul><ul><ul><ul><li>A PSI set for an ontology of XML and related standards </li></ul></ul></ul>
  30. 30. Using URIs to Identify Arbitrary Subjects
  31. 31. Using URIs to Identify Resources
  32. 32. A Plea to Ontology Developers <ul><li>Make them publicly available! </li></ul><ul><li>Define URIs as unique identifiers for the concepts in your ontologies – including the relationship types </li></ul><ul><ul><li> </li></ul></ul><ul><li>Follow the Recommendations of the OASIS Published Subjects TC: </li></ul><ul><ul><li>Make sure they resolve to human-readable resources </li></ul></ul><ul><ul><li>Guarantee their stability </li></ul></ul><ul><li>This will allow human users to use different terminology </li></ul><ul><ul><li>Bovine Spongiform Encephalopathy (?), BSE </li></ul></ul><ul><ul><li>Mad Cow Disease </li></ul></ul><ul><ul><li>Kugalskap, La vache folle, etc. </li></ul></ul><ul><li>And enable interoperability and reuse across applications : </li></ul><ul><ul><li>Topic Maps, RDF, DAML+OIL, OWL, KIF, XML, etc. </li></ul></ul>
  33. 33. Topic Maps for Machine Agents <ul><li>A formal data structure suitable for data processing </li></ul><ul><li>Support for rich semantic queries </li></ul><ul><li>High degree of built-in semantics simplifies application development </li></ul><ul><li>Published Subjects enable widespread and spontaneous knowledge interchange </li></ul><ul><li>International standard interchange syntax </li></ul><ul><li>Potential for wide adoption means more data for agents </li></ul>
  34. 34. Topic Maps for Human Agents <ul><li>A way of representing knowledge that corresponds to how humans think about the world </li></ul><ul><ul><li>Organized around subjects not resources </li></ul></ul><ul><ul><li>Direct support for context sensitivity </li></ul></ul><ul><li>A level of built-in semantics that makes the model easy to understand </li></ul><ul><ul><li>Distinguishes between names, occurrences and associations </li></ul></ul><ul><ul><li>Privileges the class-instance relationship </li></ul></ul><ul><li>Associative model matches how the brain works </li></ul><ul><ul><li>Typed associations provide a rich and intuitive navigational interface </li></ul></ul>
  35. 35. Topic Map-Driven Knowledge Portals <ul><li>Let the index drive the presentation! </li></ul><ul><ul><li>The Topic Map structure governs the application – and the knowledge </li></ul></ul><ul><li>Users navigate intuitively from topic to topic </li></ul><ul><ul><li>Having found the appropriate topic, they </li></ul></ul><ul><ul><ul><li>immediately see all recorded explicit knowledge </li></ul></ul></ul><ul><ul><ul><li>can dip down into information resources to “extract” implicit knowledge </li></ul></ul></ul><ul><li>Publisher benefits: </li></ul><ul><ul><li>Easier content maintenance (simply update the Topic Map) </li></ul></ul><ul><ul><li>Easier link maintenance (links are in separate layer, not in content) </li></ul></ul><ul><ul><li>New portals easy to derive from same content </li></ul></ul><ul><li>User benefits: </li></ul><ul><ul><li>Shorter click-through </li></ul></ul><ul><ul><li>Easier, more intuitive navigation mirrors associative way of thinking </li></ul></ul><ul><ul><li>Far greater structural consistency means less confusion </li></ul></ul><ul><li>Demo of the OperaMap portal </li></ul>
  36. 36. For More Information <ul><li>“Getting Started with Topic Maps” </li></ul><ul><li>Ontopia web site </li></ul><ul><ul><li> </li></ul></ul><ul><li>/me </li></ul><ul><ul><li>[email_address] </li></ul></ul><ul><li>Finally </li></ul><ul><ul><li>Ontopia is the world’s leading Topic Map company </li></ul></ul><ul><ul><li>We are interested in participating in EU projects </li></ul></ul><ul><ul><li>Please contact me for more details </li></ul></ul>