Introduction to knowledge representation in the semantic web


    Introduction to knowledge representation in the semantic web - Presentation Transcript

    1. Introduction to the Semantic Web Fulvio Corno, Laura Farinetti Politecnico di Torino Dipartimento di Automatica e Informatica e-Lite Research Group – http://elite.polito.it
    2. Semantic Web http://www.w3.org/2001/sw/ Web second generation Web 3.0 “Conceptual structuring of the Web in an explicit machine-readable way” (Tim Berners-Lee) In other words… …let the machine do most of the work!!! F. Corno, L. Farinetti- Politecnico di Torino 2
    3. “Official” introduction: The Semantic Web is a web of data. There is lots of data we all use every day, and it is not part of the web. I can see my bank statements on the web, and my photographs, and I can see my appointments in a calendar. But can I see my photos in a calendar to see what I was doing when I took them? Can I see bank statement lines in a calendar? Why not? Because we don’t have a web of data. Because data is controlled by applications, and each application keeps it to itself.
    4. Example
    5. “Official” introduction: The Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources, whereas the original Web mainly concentrated on the interchange of documents. It is also about language for recording how the data relates to real world objects. That allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing.
    6. An example … How can a machine distinguish the meanings … ? “I am a professor of computer science.” “I am a professor of computer science, you may think. Well,…”
    7. Key principles. The Semantic Web is the Web: same base technologies, evolutionary; decentralized (incomplete, inconsistent). Provide explicit statements regarding web resources: authors, original information providers; intermediaries (humans and/or machines). Information consumers determine consequences of the statements: distributed ‘reasoning’.
    8. 1989: WWW original proposal
    9. Technology stack (old: pre-2008)
    10. Technology stack (current: 2008)
    11. Goal of the Semantic Web: The Semantic Web will enable machines to COMPREHEND semantic documents and data, NOT human speech and writing. Then, how? Semantic Web foundation: metadata.
    12. Metadata and Metadata Standards
    13. Resource and description. Resource: content, format, …; access method dependent on format (I can read it if I “know” its language). Resource description: independent of the format (I can read “people’s comments” about the resource… provided that I know the language in which the comment is written).
    14. Resource and description. Description of the resource: this resource is suitable for PhD students; the title of this resource is “Introduction to the Semantic Web”; this resource was created on April 14th, 2009; the author of this resource is L. Farinetti; this resource is related to computer science, knowledge representation and metadata; the quality of this resource is high, according to F. Corno.
    15. Resource and description. Resource: content, format, …; access method dependent on format (I can read it if I “know” its language). Standardization (i.e. a common language for applications)? Practically impossible: huge amount of existing information; hundreds of human languages; hundreds of computer languages (another word for formats).
    16. Resource and description. Resource description: independent of the format (I can read “people’s comments” about the resource… provided that I know the language in which the comment is written). Standardization (i.e. a common language for applications)? Feasible: smaller amount of information, possibly new. Solution: define a standard language for writing comments (“metadata” in semantic web terminology).
    17. Resource and description. Metadata (field name = field value): this resource is suitable for PhD students; the title of this resource is “Introduction to the Semantic Web”; this resource was created on April 14th, 2009; the author of this resource is L. Farinetti; this resource is related to computer science, knowledge representation and metadata; the quality of this resource is high, according to F. Corno.
    18. Resource and description. Description of the resource: Level = PhD students; Title = “Introduction to the Semantic Web”; Date = 2009-04-14; Author = L. Farinetti; Topic = {computer science, knowledge representation, metadata}; Quality = high (rated by F. Corno).
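The "field name = field value" description above can be sketched as a plain record. This is only an illustration: the dictionary layout and the `describe` helper are assumptions, not part of the presentation; the field names and values are the ones listed on the slide.

```python
from datetime import date

# Hypothetical in-memory form of the slide's "field name = field value" description.
description = {
    "Level": "PhD students",
    "Title": "Introduction to the Semantic Web",
    "Date": date(2009, 4, 14),
    "Author": "L. Farinetti",
    "Topic": {"computer science", "knowledge representation", "metadata"},
    "Quality": {"value": "high", "rated_by": "F. Corno"},
}


def describe(record):
    """Render each field as a 'name = value' line, sorting set values for stable output."""
    lines = []
    for name, value in record.items():
        if isinstance(value, set):
            value = "{" + ", ".join(sorted(value)) + "}"
        elif isinstance(value, date):
            value = value.isoformat()
        elif isinstance(value, dict):
            value = f"{value['value']} (rated by {value['rated_by']})"
        lines.append(f"{name} = {value}")
    return lines


for line in describe(description):
    print(line)
```

Keeping the description separate from the resource itself is what makes it independent of the resource's format.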
    19. Semantic Web main tasks. Metadata annotation: description of resources using standard languages. Search: retrieve relevant information according to the user’s query / interest / intention; use metadata (and possibly content) in a “smart” way (i.e. “reasoning” about the meaning of annotations).
    20. Meaningful metadata annotations. Common language for describing resources: resource description standards. Common language for description field names: metadata standards. Common language for description field values: metadata standards + controlled vocabularies. Semantically rich descriptions to support search: knowledge representation techniques, ontologies.
    21. Common language for describing resources
    22. Common language for describing resources: Resource Description Framework (RDF). Resource = URI (retrievable, or not). RDF is structured in statements. A statement is a triple: subject – predicate – object. Subject: a resource. Predicate: a verb / property / relationship. Object: a resource, or a literal string.
    23. Common language for describing resources Author = L. Farinetti Diagram: hasAuthor URI L.Farinetti Simple RDF assertion (triple): triple (hasAuthor, URI, L.Farinetti) F. Corno, L. Farinetti- Politecnico di Torino 23
    24. Common language for describing resources. Author = L. Farinetti. RDF in XML syntax: <RDF xmlns="http://www.w3.org/TR/ … "> <Description about="http://www.polito.it/semweb/intro"> <Author>L.Farinetti</Author> </Description> </RDF>
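A minimal sketch of the triple model, assuming a tiny in-memory store written for illustration only (it is not an RDF library). The subject URI and the `hasAuthor` predicate come from the slides; `hasTitle` and the `match` helper are hypothetical names. Triples are stored in subject, predicate, object order.

```python
# Subject URI and hasAuthor follow the slides; the store itself is a hypothetical illustration.
SUBJECT = "http://www.polito.it/semweb/intro"

triples = {
    (SUBJECT, "hasAuthor", "L.Farinetti"),
    (SUBJECT, "hasTitle", "Introduction to the Semantic Web"),  # hasTitle is an assumed property name
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None component, sorted for stable output."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Who is the author of the resource?
authors = [o for (_, _, o) in match(triples, subject=SUBJECT, predicate="hasAuthor")]
print(authors)
```

Leaving a component as `None` turns it into a wildcard, which is the basic shape of a triple-pattern query.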
    25. Common language for field names. Problem: Topic = … (Topics, Subject, Subjects, Argument, Arguments: singular / plural); Level = … (Educational level, destination, suitability, …: difficult to clearly define the concept in a few words); Title = ...; Date = … (Date of creation, date of last modification, date of revision, …: different concepts, need for more details); Author = … (Creator, Maker, Contributor, …: synonymy).
    26. Common language for field names. Solution: metadata standards. Many standardization bodies are involved. Standards may be general, e.g. Dublin Core (DC), or may depend on goal, context, domain, …, e.g. educational resources (IEEE LOM), multimedia resources (MPEG 7), images (VRA), people (FOAF, IEEE PAPI), geospatial resources (GSDGM), bibliographical resources (MARC, OAI), cultural heritage resources (CIDOC CRM).
    27. Metadata standards examples
    28. Dublin Core: Dublin Core Metadata Element Set (DCMES). Building blocks to define metadata for the Semantic Web. 15 elements, or categories, general enough to describe most published resources. Extra elements and element refinements.
    29. DC metadata element set
    30. Example of description using Dublin Core (in RDF): a paper in the “Ariadne” journal
    31. Common language for field values. Problems: value type. Date = 2009-04-14 (type = date); Title = “Introduction to the Semantic Web” (type = string); Author = L. Farinetti (type = string; “standard” format? Laura Farinetti, Farinetti Laura, Farinetti L., …).
    32. Common language for field values. Problems: value type; value restrictions? Freedom vs shared understanding. Level = PhD students (any value? list of possible values?); Topic = {computer science, knowledge representation, metadata} (any value? any number of values?); Quality = high (high, medium, low? 1 to 5? any value?).
    33. Common language for field values. Solution: metadata standards + controlled vocabularies. Metadata standards: only some, and partially. Controlled vocabularies: explicit list of possible values.
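The combination of typed values and controlled vocabularies can be sketched as a small validator. This is a hedged illustration: the `VOCABULARIES` table, the allowed values in it, and the `validate` function are assumptions made for the example, not definitions taken from any metadata standard.

```python
from datetime import date

# Hypothetical controlled vocabularies; the allowed values are assumptions for illustration.
VOCABULARIES = {
    "Quality": {"high", "medium", "low"},
    "Topic": {"computer science", "informatics", "knowledge representation", "metadata"},
}


def validate(field, value):
    """Return a list of problems found for one field value (an empty list means valid)."""
    if field == "Date":
        try:
            date.fromisoformat(value)  # value type check: expects YYYY-MM-DD
            return []
        except ValueError:
            return [f"Date is not in YYYY-MM-DD format: {value!r}"]
    if field in VOCABULARIES:
        values = value if isinstance(value, (set, list, tuple)) else [value]
        return [
            f"{field}: {v!r} is not an authorized term"
            for v in sorted(values)
            if v not in VOCABULARIES[field]
        ]
    return []  # free-text field: no restriction


print(validate("Date", "2009-04-14"))
print(validate("Quality", "excellent"))
print(validate("Topic", {"metadata", "RDF"}))
```

Fields without a vocabulary stay free text, which mirrors the slide's point that standards restrict only some values, and only partially.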
    34. Examples from IEEE LOM. IEEE 1484.12.1-2002 Learning Object Metadata (LOM) Standard. Developed by the IEEE Learning Technology Standards Committee (LTSC). Standard to describe “Learning Objects” in order to guarantee their interoperability.
    35. Examples from IEEE LOM
    36. Examples from IEEE LOM
    37. Examples from IEEE LOM
    38. … + controlled vocabularies. A closed list of named subjects, which can be used for classification. Metadata field values are restricted to a list of terms (selected by experts). Topic = {computer science, informatics, knowledge representation, metadata}.
    39. Semantically rich descriptions to support search
    40. Semantically rich descriptions to support search. http://dictybase.org/db/html/help/GO.html Topic = {metabolism, …}
    41. Knowledge Representation
    42. Need for knowledge representation. Semantically rich descriptions need an “understanding” of the meaning of a resource and of the domain related to the resource: disambiguation of terms; shared agreement on meanings; description of the domain, with concepts and relations among concepts.
    43. Example: Dublin Core metadata. Metadata of a single paper
    44. Problems. Title usually offers good clues, but it does not necessarily mention the names of all subjects the user is interested in, and it may presuppose knowledge the user does not actually possess. Subject is meant to convey precisely what the document is about, but much depends on how extensive the set of keywords is, whether all related subjects are mentioned, and whether too many subjects are listed. Metadata does not say much about “how related” a resource is to a given subject.
    45. Search results for “topic maps”
    46. Problems. Authors were free to define their own subject keywords. Results are not “about” topic maps, but “related to” topic maps. If an author forgets to list “topic maps”, the paper will never be found.
    47. Subject-based classification. Any form of content classification that groups objects by their subjects, e.g. the use of keywords to classify papers. Metadata fields describe what the objects are about by listing discrete subjects inside a subject-based classification. Important: there is a difference between describing the objects being classified and describing the subjects used to classify them. Metadata describe objects. Subject-based classification is the approach to describe subjects.
    48. Subject-based classification ... “On those remote pages it is written that animals are divided into: a. those that belong to the Emperor; b. embalmed ones; c. those that are trained; d. suckling pigs; e. mermaids; f. fabulous ones; g. stray dogs; h. those that are included in this classification; i. those that tremble as if they were mad; j. innumerable ones; k. those drawn with a very fine camel's hair brush; l. others; m. those that have just broken a flower vase; n. those that resemble flies from a distance” (from The Celestial Emporium of Benevolent Knowledge, Borges).
    49. Subject-based classification techniques: controlled vocabularies, taxonomies, thesauri, faceted classification, ontologies, folksonomies, others … Most come from library science.
    50. Controlled vocabulary. A closed list of named subjects, which can be used for classification. Composed of terms: a term is a particular name for a particular concept, similar to keywords. Terms are not concepts: a single term may be the name of one or more concepts; a single concept may have multiple names. Ambiguity is avoided by forbidding duplicate terms.
    51. Controlled vocabulary. Example: Topic = {computer science, knowledge representation, metadata, RDF, topic navigation maps} / topic maps. Goal: prevent authors from defining terms that are meaningless, too broad or too narrow; prevent authors from misspelling; prevent different authors from choosing slightly different forms of the same term. The simplest form of controlled vocabulary is a list of terms (or “pick list”).
    52. Controlled vocabulary. Reduces the ambiguity inherent in normal human languages. Solves the problems of homographs, homonyms, synonyms and polysemes by ensuring that each concept is described using only one authorized term, and that each authorized term in the controlled vocabulary describes only one concept.
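The "one authorized term per concept" rule can be sketched as a lookup from entry terms to authorized terms. The `AUTHORIZED` table and its variant spellings are hypothetical examples, as is the `normalize` helper.

```python
# Hypothetical mapping from entry terms to the single authorized term for each concept.
AUTHORIZED = {
    "topic maps": "topic maps",
    "topic map": "topic maps",
    "topicmaps": "topic maps",
    "metadata": "metadata",
    "meta-data": "metadata",
}


def normalize(term):
    """Map a free-text keyword to its authorized term, or raise if it is not in the vocabulary."""
    key = " ".join(term.lower().split())  # ignore case and repeated whitespace
    if key not in AUTHORIZED:
        raise KeyError(f"{term!r} is not in the controlled vocabulary")
    return AUTHORIZED[key]


print(normalize("  Topic   Map "))
print(normalize("Meta-Data"))
```

Rejecting unknown terms rather than storing them is the design choice that keeps the list closed: a misspelling or a slightly different form never reaches the metadata.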
    53. Problems solved. Synonym: different words with identical or very similar meanings.
    54. Problems solved. close: “Will you please close that door!” vs “The tiger was now so close that I could smell it...”. pupil: student vs opening in the iris of the eye. axes: ('æk.səz) plural of axe vs ('æk.siz) plural of axis. Synonym: different words with identical or very similar meanings.
    55. Problems solved. to get: take (I'll get the drinks), become (she got scared), understand (I get it). wood: a piece of a tree, or a geographical area with many trees. Synonym: different words with identical or very similar meanings, e.g. student and pupil (noun), buy and purchase (verb), sick and ill (adjective).
    56. Controlled vocabulary examples. List 1: Blood, Cord blood, Erythrocyte, Leukocyte, Basophil, Eosinophil, Lymphoblast, Lymphocyte, Monocyte, Neutrophil, … List 2: Circuit theory, Electronic circuits, Microwave technology, Electron tubes, Semiconductor materials and devices, Dielectric materials and devices, Magnetic materials and devices, Superconducting materials and devices, … Practically no “real” examples. With very little extra effort: taxonomies and thesauri!
    57. Taxonomy. Subject-based classification that arranges the terms in the controlled vocabulary into a hierarchy. Dates back to Carl Linnæus’s work on zoological and botanical classification (18th century).
    58. Taxonomy. Allows related terms to be grouped together: it is clear that “topic maps” and “XTM” are related. Easier to classify documents. Easier to choose search keywords.
    59. Taxonomies and metadata. Metadata are stored as usual with the resource. The “subject” will contain only controlled terms. Controlled terms belong to a hierarchy, shared by all papers.
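A minimal sketch of a taxonomy shared by all papers, assuming a hypothetical hierarchy: only the terms "topic maps" and "XTM" come from the slides, while the parent links and the helper names are illustrative.

```python
# Hypothetical hierarchy: child term -> parent (broader) term.
PARENT = {
    "knowledge representation": "computer science",
    "topic maps": "knowledge representation",
    "XTM": "topic maps",
    "RDF": "knowledge representation",
}


def ancestors(term):
    """Walk up the hierarchy and return the broader terms, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def matches(query, subject_terms):
    """A paper matches if one of its subject terms is the query term or a narrower term of it."""
    return any(t == query or query in ancestors(t) for t in subject_terms)


print(ancestors("XTM"))
print(matches("topic maps", ["XTM"]))
```

With the hierarchy in place, a paper tagged only "XTM" is still found by a search for "topic maps", which a flat keyword list could not do.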
    60. Taxonomy example: INSPEC http://www.theiet.org/publishing/inspec/index.cfm
    61. Taxonomy example: INSPEC
    62. INSPEC journal article database
    63. Taxonomy example: anatomy terms http://www.cbil.upenn.edu/anatomy.php3
    64. Taxonomy example
    65. Taxonomy example http://www.acm.org/class/1998/ccs98.html
    66. Taxonomy limits. Only two kinds of relationships between terms: parent = broader term, child = narrower term. Questions left open in the example: “topic navigation maps”: synonym? no more in use? difference?; “XML topic map”: synonym? difference?
    67. Thesaurus. Extends taxonomies: subjects are arranged in a hierarchy, and other statements can be made about the subjects. Two ISO standards: ISO 2788 for monolingual thesauri, ISO 5964 for multilingual thesauri.
    68. Thesaurus relationships. BT (broader term): refers to a term with wider or less specific meaning; some systems allow multiple BTs for one term, while others do not; inverse property: NT (narrower term); a taxonomy only uses BT and NT. SN (scope note): string explaining the term's meaning within the thesaurus; useful when the precise meaning of the term is not obvious from context.
    69. Thesaurus relationships. USE: another term that is to be preferred instead of this term; implies that the terms are synonymous; inverse property: UF. TT (top term): the topmost ancestor of this term, i.e. the BT of the BT of the BT... RT (related term): a term that is related to this term, without being a synonym of it or a broader/narrower term.
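The thesaurus relationships can be sketched with a small table of entries. The relationship names (BT, SN, USE, RT, and the derived NT, UF, TT) follow the slides; the entries themselves, including the scope note text and the USE link, are hypothetical examples chosen for illustration.

```python
# Hypothetical thesaurus entries keyed by term.
THESAURUS = {
    "computer science": {},
    "knowledge representation": {"BT": "computer science"},
    "topic maps": {
        "BT": "knowledge representation",
        "RT": ["RDF"],
        "SN": "Illustrative scope note: maps of subjects and their relations",
    },
    "RDF": {"BT": "knowledge representation", "RT": ["topic maps"]},
    "topic navigation maps": {"USE": "topic maps"},  # non-preferred entry term
}


def preferred(term):
    """Follow USE links until reaching a preferred term."""
    while "USE" in THESAURUS[term]:
        term = THESAURUS[term]["USE"]
    return term


def top_term(term):
    """TT: the BT of the BT of the BT..., i.e. the topmost ancestor."""
    term = preferred(term)
    while "BT" in THESAURUS[term]:
        term = THESAURUS[term]["BT"]
    return term


def narrower(term):
    """NT is the inverse of BT, so it is derived here instead of being stored."""
    return sorted(t for t, entry in THESAURUS.items() if entry.get("BT") == term)


def used_for(term):
    """UF is the inverse of USE."""
    return sorted(t for t, entry in THESAURUS.items() if entry.get("USE") == term)


print(preferred("topic navigation maps"), "|", top_term("topic navigation maps"))
print(narrower("knowledge representation"), used_for("topic maps"))
```

Deriving NT and UF from BT and USE avoids storing each relationship twice and keeps the inverse pairs consistent by construction.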
    70. Thesaurus example http://www.ukat.org.uk/thesaurus/
    71. Thesaurus example http://www.swinburne.edu.au/corporate/registrar/rms/keywords.htm
    72. Thesaurus example: Library of Congress Subject Heading http://www.loc.gov/cds/lcsh.html
    73. Faceted classification. Proposed by S.R. Ranganathan in the 1930s. Facets are the different axes along which documents can be classified. Each facet contains a number of terms, usually with a thesaurus organization; usually a term belongs to one facet only. A document is classified by selecting one term from each facet.
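A minimal sketch of faceted classification, assuming three hypothetical facets and three hypothetical documents; none of these names come from the presentation.

```python
# Hypothetical facets (axes) and their terms.
FACETS = {
    "level": {"undergraduate", "PhD students"},
    "topic": {"metadata", "ontologies", "RDF"},
    "format": {"slides", "paper"},
}

# Each document is classified by selecting one term from each facet.
DOCUMENTS = {
    "intro-semweb": {"level": "PhD students", "topic": "metadata", "format": "slides"},
    "rdf-primer": {"level": "undergraduate", "topic": "RDF", "format": "paper"},
    "onto-course": {"level": "PhD students", "topic": "ontologies", "format": "slides"},
}


def is_valid(doc):
    """Check that a classification selects exactly one valid term from each facet."""
    return doc.keys() == FACETS.keys() and all(doc[f] in FACETS[f] for f in FACETS)


def search(**selected):
    """Return ids of documents matching every selected facet term."""
    return sorted(
        doc_id for doc_id, doc in DOCUMENTS.items()
        if all(doc.get(facet) == term for facet, term in selected.items())
    )


print(search(level="PhD students", format="slides"))
print(search(topic="RDF"))
```

Because the facets are independent axes, a query can constrain any subset of them in any order, which a single hierarchy cannot offer.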
    74. Faceted classification example http://flamenco.berkeley.edu/
    75. Advantages: multi-dimensionality, persistence, scalability, flexibility. http://freeable.polito.it/
    76. Ontology. Model for describing the world that consists of a set of types, properties, and relationships. Extends the other subject-based classification approaches: has open vocabularies; has open relationship types (not just BT/NT, RT and USE/UF).
    77. Ontology structure: concepts; relationships (is-a, other); instances.
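The three ingredients (concepts, relationships, instances) can be sketched with plain dictionaries. This is an illustration only, not an ontology language: the concept names, the instance ids and the helper functions are hypothetical, and only the `hasAuthor` relation echoes the earlier slides.

```python
# Concepts and their is-a relationships (child concept -> parent concept).
IS_A = {"Presentation": "Document", "Paper": "Document", "Document": "Resource"}

# Instances and the concept each one belongs to.
INSTANCE_OF = {"intro-semweb": "Presentation", "ariadne-paper": "Paper"}

# Other (non is-a) relationships, stored as subject, relation, object.
RELATIONS = {("intro-semweb", "hasAuthor", "L.Farinetti")}


def is_a(concept, ancestor):
    """True if concept equals ancestor or reaches it by following is-a links."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def instances_of(concept):
    """Instances of a concept include the instances of all its sub-concepts."""
    return sorted(i for i, c in INSTANCE_OF.items() if is_a(c, concept))


print(instances_of("Document"))
print(is_a("Paper", "Resource"))
```

The relation set is open: any new relationship type can be added as another triple, unlike the fixed BT/NT, RT and USE/UF set of a thesaurus.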
    78. Ontology example http://onto.stanford.edu:8080/wino/index.jsp
    79. Folksonomy. Internet-mediated social environments. Tags compiled through social tagging. Social tagging: decentralized practice where individuals and groups create, manage and share tags to annotate digital resources in an online social environment. Generally characterized by non-standard tagging.
    80. Example (flickr - del.icio.us)
    81. Other subject-based techniques. Synonym rings: connect together a set of terms as being equivalent for search purposes. Similar to the UF/USE relationship of thesauri, but with no preferred term.
    82. Other subject-based techniques. Authority file: similar to a synonym ring, but consists of UF/USE relationships instead of synonym relationships. One term in each synonym ring is indicated as the preferred term for that subject, e.g. Library of Congress Name Authority File.
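The difference between a synonym ring and an authority file can be sketched side by side. The terms are hypothetical examples, and both data layouts are assumptions made for illustration.

```python
# Synonym rings: sets of terms treated as equivalent for search, with no preferred term.
SYNONYM_RINGS = [{"student", "pupil", "learner"}, {"buy", "purchase"}]

# Authority file: preferred term -> variant terms (USE/UF style).
AUTHORITY = {"student": {"pupil", "learner"}, "purchase": {"buy"}}


def expand(term):
    """Synonym ring: a query term expands to every member of its ring."""
    for ring in SYNONYM_RINGS:
        if term in ring:
            return sorted(ring)
    return [term]


def authorized(term):
    """Authority file: return the preferred term for a term, or None if unknown."""
    for preferred_term, variants in AUTHORITY.items():
        if term == preferred_term or term in variants:
            return preferred_term
    return None


print(expand("pupil"))
print(authorized("buy"))
```

A ring only widens a search, while an authority file also tells an indexer which single form to record.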
    83. Subject-based classification summary. Terminology is rarely used in a consistent way: controlled vocabularies are thesauri, thesauri are ontologies, … http://www.iesr.ac.uk/profile/vocabs/index.html/#CtrldVocabsList
    84. Subject-based classification summary
    85. Coming next …: more on RDF; more on ontologies.
    86. License: This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
