24-06-2010




      Web 3.0, Semantics & Enterprise Computing
      Session II – Logical Ontological

      Satish, Sukumar, Feroz Sheikh & Venkatesh S
      www.canopusconsulting.com            June 2010




     Objective
          Information interchange and modelling

          An introduction to semantic web technologies
               RDF, RDFS, OWL
               Semantic modelling
               Querying


          The web 3.0 technology stack



© Canopus Consulting








      The following Sessions will address:
          How do we build an application?
          How do we build the ontology?
          What are the key architecture components?
          What are the tools & technologies to use?
          How do I choose which technology to use?








      Semantic Web Application Lifecycle
          [Diagram: lifecycle stages around a Semantic Query Server]
          Stages:
               Build Information Model
               Create Assimilation Models & Aggregate Knowledge
               Retrieve and Use Semantic Data
               Refine/Evolve Information Model
          Callouts:
               Ontology Editors: Protégé, TopBraid Composer
               Technologies: GRDDL, RDFizers, OWLs, Automatic Annotation
               RDF Stores: Mulgara, Sesame
               Programming: Jena




     Semantic Web Application Lifecycle
          Information Modelling
               Build Ontology (model level representation)

          Information Assimilation
               Populate the knowledge base from various sources
               Including current applications
               Automatic semantic annotation of existing data
               Any type of document, multiple sources of documents

          Information Retrieval
               Applications: search, integrate/portal, summarize/
                explain, analyse, decision support
               Reasoning techniques: graph analysis, inferencing




     Architecture Stack of Semantic Technologies
          [Diagram: Semantic Technology Stack]
          Side labels: Application; Semantic Middleware (e.g. Semantic SOA)
          Layers, top to bottom:
               HTTP, SOAP
               Programming API
               SPARQL Processor, Inference Engine
               RDF-SQL Adaptor
               Relational Store, RDF Store







     Semantic Web Technologies
          [Figure omitted. Source: W3C]






     The Perceptron.Net Use case
          A rich Cultural Informatics environment
           designed to
               Create, Collect, Categorize any type of cultural
                artifact – Music, Literature, Travel, Leisure,
                Entertainment...
               Form communities around content
               Make use of existing information on the network
                and existing community infrastructure
          An example:
               Indian Music cannot be categorized along the
                same lines as Western Music
               Genre, Album, Artist are just not sufficient…




     The Perceptron.Net use case…
          Typical queries we want to support:
          Thematic album creation ability:
               Give me all songs directed by “X”, with music
                composed by “Y”, where the hero was “Z”
               Give me all songs in Raga Kalyani (must include
                film, folk and classical songs)
               Give me all songs on Lord Rama in Sanskrit that
                are “stotras”…
               Give me all the recordings of live performances at
                Sri Krishna Gana Sabha, Chennai






     The Perceptron.Net use case…
          Provide an exploratory interface:
               Specify generic criteria and successively filter until you find what you
                need. E.g. specify a “mood” or a song you like and ask for “similar” songs
                or songs that match such a mood.
          Allow the community to add content and metadata, and to find new
           connections in the content.
          Content can be anywhere on the Internet
               Raaga.com, HamaraCd.com, MusicToday.Com, Orkut groups, blogs,
                websites
          Not only music, but also content “about” music – articles, essays,
           ratings, discussions – which should be used in connecting the
           content, in searching the content, in enriching the content
          Provide feeds so that Facebook-type plug-ins can be developed
           easily, and content and queries can be shared/updated from
           anywhere.










     Practical Problem

     LET US CREATE A COMPILATION OF
     AMITABH BACHCHAN’S FILM SONGS








     We start with a music site
          Or HamaraCD or India Times Shopping or …
          And we go through a fairly exhaustive categorization …
               By actor
               By album name
               Release year








     Or go to a site with a filmography
          And drill down to some of our favorite ones








     But what if we wanted to
          Get all songs for Amitabh Bachchan that
               Are from movies directed by Yash Chopra?







     Or what if we wanted to
          Get all songs of Amitabh Bachchan in which the
           playback singer won an award?








     We could do this
          By hand
               Make a list of movies from the filmography site
               Go to Raaga and get songs for those movies
          But what if there are multiple sources?




     Across Multiple Sources








     The problem
        Implicit to Explicit
           Logic & domain knowledge is “Implicit” in an application
             In Code, DB Schema, Documentation etc.

          Humans can interpret it, but automated agents can’t

         List<SongList> getSongsInAlbum();

                                  SongID                 Sequence   Duration
                                  Yaara Seeli Seeli      1          4:50
                                  Dekha Ek Khwab toh     2          2:55
                                  …                      …          …


                 For agents, this logic needs to be “Explicitly” stated






     To enable agents,
            Extract the Logic as Metadata
                The logic & domain knowledge is embedded in the application
                      In object models, class hierarchies, instance names
                This needs to be extracted out, for others to link, access and use
            Deliver both data and metadata in a uniform manner




     We need
           Common Language
                   Agents need a common language to interchange information

           Derive Conclusions
                   Logic, that is now explicitly stated, can be used to draw conclusions

           Allow Independent Evolution
                   Every application should be able to evolve independently
                   Both at data level (new data) or schema level (new knowledge)

           Support Incremental Assimilation
                   Different sources can contain/provide different aspects of knowledge
                   Community participation to evolve the knowledge
                   Manifests itself in the principle of AAA (Anyone can say anything
                    about any topic)





     Steps In Information Interchange
       1.   Map the various data onto an abstract data
            representation
                    Make the data independent of its internal
                     representation…
                    Expose it in a standard format
       2.   Merge the resulting representations to create
            a single model
       3.   Make queries on the whole
                    Queries that could not have been done on the
                     individual data sets
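
The three steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: triples are modelled as tuples in a set rather than with a real RDF library, and the namespace `http://example.org/song/`, the predicate names and the helper functions are hypothetical. The track listing and the "sad song" statement are taken from elsewhere in these slides.

```python
BASE = "http://example.org/song/"  # hypothetical namespace, not from the slides


def song_uri(name):
    """Give every song one shared identifier, whatever source mentions it."""
    return BASE + name.replace(" ", "_")


# Step 1: map each source onto the same abstract representation (triples).
def map_track_listing(rows):
    triples = set()
    for row in rows:
        subject = song_uri(row["SongID"])
        triples.add((subject, "p:sequence", row["Sequence"]))
        triples.add((subject, "p:duration", row["Duration"]))
    return triples


def map_mood_lines(lines):
    triples = set()
    for line in lines:
        name, mood = line.split(",")
        triples.add((song_uri(name.strip()), "p:mood", mood.strip()))
    return triples


# Source A: an application's track listing. Source B: a community mood list.
track_listing = [
    {"SongID": "Yaara Seeli Seeli", "Sequence": 1, "Duration": "4:50"},
    {"SongID": "Dekha Ek Khwab toh", "Sequence": 2, "Duration": "2:55"},
]
mood_lines = ["Yaara Seeli Seeli, sad"]

# Step 2: merge the two representations into a single model.
merged = map_track_listing(track_listing) | map_mood_lines(mood_lines)


# Step 3: query the whole; neither source alone knows mood AND duration.
def durations_of_songs_with_mood(graph, mood):
    songs = {s for (s, p, o) in graph if p == "p:mood" and o == mood}
    return sorted(o for (s, p, o) in graph if p == "p:duration" and s in songs)


print(durations_of_songs_with_mood(merged, "sad"))
```

The merge is a plain set union; it works only because both mapping functions produce the same identifier for the same song.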






     Steps In Information Interchange
          [Figure omitted]




     Enabling Technologies
          Semantic technologies (specifically RDF) treat
           metadata as data and exchange both in exactly
           the same way.
               They provide a way for anyone to make a basic
                statement about anything and a means of
                layering these statements into a model.








     Interchange Without APIs
          If the information provider can now produce
           an RDF stream of data and metadata, the
           assimilation agent can combine streams from
           all sources and treat them as if they are from
           the same source
          The only thing the agent needs to understand is
           the model of RDF, not the individual
           application APIs and models
          This is the fundamental premise of
           information interchange on the web 3.0.





     First Expose the Data
          As a set of relations
          Using RDF
          [Diagram: Song Data]
               http://.../#JaiHo    p:title       Jai Ho
               http://.../#JaiHo    p:movie       Slumdog…
               http://.../#JaiHo    p:composer    http://.../#ARR
               http://.../#ARR      p:name        A R Rahman






     Second Merge Data from different Sources
          Same URI = Same Resource
          [Diagram: two graphs that share the nodes http://.../#JaiHo and http://.../#ARR]
               Song Data
                    http://.../#JaiHo    p:title       Jai Ho
                    http://.../#JaiHo    p:movie       Slumdog…
                    http://.../#JaiHo    p:composer    http://.../#ARR
                    http://.../#ARR      p:name        A R Rahman
               Performances Data
                    http://.../#JaiHo-ARR carries p:sang, p:performed_at and
                     p:music_director edges; the graph also contains
                     http://.../#JaiHo and http://.../#ARR






     Merging Identical Resources
          [Diagram: the Song Data and Performances Data graphs joined at the
           nodes they share, http://.../#JaiHo and http://.../#ARR]
               Song Data edges: p:title, p:movie, p:composer, p:name
               Performances Data edges: p:sang, p:performed_at, p:music_director






         Third Query the data
        As if they are from the same source
        Answer questions that were not possible from either source
         alone
        For example
             What is the title of the song sung by A R Rahman at the Fireflies Music
              Festival?
        [Diagram: merged graph]
             http://.../#JaiHo        p:title            Jai Ho
             http://.../#JaiHo        p:movie            Slumdog…
             http://.../#JaiHo        p:composer         http://.../#ARR
             http://.../#ARR          p:name             A R Rahman
             http://.../#JaiHo-ARR    p:sang             http://.../#JaiHo
             http://.../#JaiHo-ARR    p:performed_at     http://.../#FFM
             http://.../#JaiHo-ARR    p:music_director   http://.../#ARR






         Fourth Adding Additional Knowledge
        We can assert that composer & music_director are the same
             p:composer sameAs p:music_director
        Very handy for model transformations, language translations, de-duplication etc.
        Now we can ask queries not possible with either of the sources
             “Composers” of songs played at Fireflies Music Festival
        [Diagram: the merged song and performance graph, in which p:composer and
         p:music_director both lead to http://.../#ARR]
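
The merge, query and "additional knowledge" steps can be tried out with a minimal Python sketch. It makes several assumptions: triples are plain tuples rather than a real RDF store, the elided `http://.../` prefix is replaced by a hypothetical `http://example.org/`, the `p:sang` edge is assumed to run from the performance to the song, and the equivalence is applied by a hypothetical helper that simply copies statements between the two predicates (OWL itself offers `owl:equivalentProperty` for relating two properties).

```python
NS = "http://example.org/#"  # hypothetical stand-in for the elided prefix
JAIHO, ARR, FFM, PERF = NS + "JaiHo", NS + "ARR", NS + "FFM", NS + "JaiHo-ARR"

song_data = {
    (JAIHO, "p:title", "Jai Ho"),
    (JAIHO, "p:movie", "Slumdog…"),
    (JAIHO, "p:composer", ARR),
    (ARR, "p:name", "A R Rahman"),
}
performance_data = {
    (PERF, "p:sang", JAIHO),          # assumed direction: performance -> song
    (PERF, "p:performed_at", FFM),
    (PERF, "p:music_director", ARR),
}

# Same URI = same resource, so merging is a set union of the statements.
merged = song_data | performance_data


def titles_performed_at(graph, venue):
    """Titles of songs that some performance at `venue` sang."""
    perfs = {s for (s, p, o) in graph if p == "p:performed_at" and o == venue}
    songs = {o for (s, p, o) in graph if p == "p:sang" and s in perfs}
    return sorted(o for (s, p, o) in graph if p == "p:title" and s in songs)


def composer_names_at(graph, venue):
    """Names of the p:composer of each performance at `venue`."""
    perfs = {s for (s, p, o) in graph if p == "p:performed_at" and o == venue}
    composers = {o for (s, p, o) in graph if p == "p:composer" and s in perfs}
    return sorted(o for (s, p, o) in graph if p == "p:name" and s in composers)


def apply_equivalence(graph, prop_a, prop_b):
    """Copy every statement made with one predicate to the other."""
    extra = set()
    for (s, p, o) in graph:
        if p == prop_a:
            extra.add((s, prop_b, o))
        elif p == prop_b:
            extra.add((s, prop_a, o))
    return graph | extra


print(titles_performed_at(merged, FFM))
print(composer_names_at(merged, FFM))       # empty: only p:music_director is stated
enriched = apply_equivalence(merged, "p:composer", "p:music_director")
print(composer_names_at(enriched, FFM))
```

The "composer" query returns nothing on the merged data because the performance source only speaks of a music director; it starts answering once the equivalence has been added.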






     Semantic Web Technologies

          Common Language: RDF
          Reasoning & Inferences: RDFS, OWL
          Assimilation & Retrieval: RDF DBs, SPARQL

          RDF
               Resource Description Framework - defines structure to triples.
          RDFS
               RDF Schema - provides basic elements for the description of
                RDF vocabularies
          OWL
               Web Ontology Language - adds semantics to the schema
          SPARQL
               Protocol and RDF query language





     Expressing Knowledge
          Explicit Knowledge – Semantic Models – Ontology
               Formal representation of the knowledge by a set of concepts
                within a domain and the relationships between those concepts
               It is used to reason about the properties of that domain
               And may be used to describe the domain








     Mapping to Information Interchange
          [Figure omitted. Source: W3C]




     Resource Description Framework

     WHAT IS RDF?








     Paradigms of Information: Spec for RDF
          AAA - Anyone can say Anything about Anything
                Consistency is not a necessary condition
                For example, source 1 – Yaara Sili Sili is a sad song
                Source 2 – Yaara Sili Sili is a famous song
                Community may provide music theory attributes to the song

          To each their own
                No common schema, yet the ability to make globally valid statements
                For example, information about raagas can be combined with information
                 on film music

          There is always one more
                Open world assumption - facts can be, and always are, added incrementally
                For example, one source may classify Lalbagh as a tourist place
                Another source may classify Lalbagh as a garden
                Application understands that gardens are suitable for “morning walk”





     Sample Data

               Song name              Raaga       Sung by                 Duration
               Nanu palimpa           Mohanam     Dr Balamuralikrishna    42 mins
               Varuga varugave        Mohanam     MS Subbulakshmi         8 mins
               Kallalo Kannulalo      Kalyani     Leela                   6 mins
               Pranati Pranati        Kalyani     S P Balu                16 mins
               Piluvukara Alugukara   Hindolam    Ghantasala              6.3 mins








      Integrating Data from different sources
           Source 1    Pranati Pranati      Kalyani    S P Balu           16 mins
           Source 2    Kallalo Kannulalo    Kalyani    Leela              6 mins
           Source 3    Varuga varugave      Mohanam    MS Subbulakshmi    8 mins

      Data distribution row by row – all participants must agree to the common schema






      Interchange by Column
          Source 1
               id    Song name               Raaga
               1     Nanu palimpa            Mohanam
               2     Varuga varugave         Mohanam
               3     Kallalo Kannulalo       Kalyani
               4     Pranati Pranati         Kalyani
               5     Piluvukara Alugukara    Hindolam

          Source 2
               id    Sung by
               1     Dr Balamuralikrishna
               2     M S Subbulakshmi
               3     Leela
               4     S P Balu
               5     Ghantasala

          Distributing by column
               Each participant must agree to the unique identifier





         Interchange by Cell
             Source 1    Row 1    Sung by      Dr Balamuralikrishna
             Source 2    Row 1    Song name    Nanu palimpa
             Source 3    Row 1    Raaga        Mohanam

        Distributing by cell
            Each participant must agree to row ID & column name





         The Cell
            Cell-by-cell division allows us to do just that
                The row ID and column need to be identified

         This is what RDF is -> TRIPLES of Data
         Subject -> Predicate -> Object
         Subject and Predicate have unique identifiers, object can be a literal or identifier
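
As a rough illustration of the cell-to-triple idea, the sketch below turns two rows of the sample table into (subject, predicate, object) tuples, one per cell. The namespace `http://example.org/music/`, the `row/<id>` subject scheme and the helper name are assumptions made for the example, not part of the slides.

```python
BASE = "http://example.org/music/"  # hypothetical namespace, not from the slides

rows = [
    {"id": 1, "Song name": "Nanu palimpa", "Raaga": "Mohanam",
     "Sung by": "Dr Balamuralikrishna"},
    {"id": 2, "Song name": "Varuga varugave", "Raaga": "Mohanam",
     "Sung by": "M S Subbulakshmi"},
]


def row_to_triples(row):
    """Emit one triple per cell: row ID -> subject, column name -> predicate."""
    subject = BASE + "row/" + str(row["id"])
    triples = []
    for column, value in row.items():
        if column == "id":
            continue  # the ID is already encoded in the subject
        predicate = BASE + column.replace(" ", "_")
        triples.append((subject, predicate, value))  # object kept as a literal
    return triples


triples = [t for row in rows for t in row_to_triples(row)]
for triple in triples:
    print(triple)
```

Each cell can now be published by a different source, as long as all of them agree on the subject and predicate identifiers.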




     RDF Expressions – Triples
          Triples - Subject, Predicate, Object
               Subject
                      Must be a Resource
               Predicate
                      Must be a Resource
               Object
                      Can be a Resource or a Literal
          <subject> has a property <predicate>, whose value is <object>
          A labelled connection between two resources
          E.g. Song: Jai Ho has a property composer whose value is A R Rahman
               (the object is a Resource)
          E.g. MelakartaRaaga has a property NumberOfSwaras whose value is 7
               (Resource, Resource, Literal)




     Rules of RDF
          Global Uniqueness
               RDF URIs and names must be globally unique

          Sentence Form
               The order of knowledge representation in the sentence
                should not change
               So that autonomous agents can consume the metadata

          Reuse
               If a document refers to an existing resource, then it is
                talking about that same global resource







     Result
          Anybody can say Anything about Anything
               Every statement is an atomic RDF sentence

          To each his own
               Every subject, object and predicate in RDF is qualified

          There is always one more
               RDF is a graph with no beginning and no end;
                an additional statement can always be added to the graph








     More RDF
          RDF blank nodes and their usage
               Identify an abstract concept – “there exists some”
               My ideal friend
               The ID is always local to the document and need not be the
                same elsewhere
          Reification
          RDF containers
               Bag – unordered collection
               Seq – ordered collection
               Alt – unordered set of equivalent alternatives





     rdf:type
          rdf:type is a property that provides an
           elementary typing system
               song:yaara-seeli-seeli rdf:type
                HindiSongs:FilmSong
                      Does not make any assumption that
                       HindiSongs:FilmSong is a class – it is a resource
                      RDF does not have a definition for class








     Bringing in the meaning

     WHAT IS RDFS?








     Need for RDF schemas
          First step towards the “extra knowledge”:
               define the terms we can use
               what restrictions apply
               what extra relationships are there?
          Officially: “RDF Vocabulary Description Language”
               the term “Schema” is retained for historical reasons…




                                                           Source: W3C




     Classes, Resources
          RDFS defines resources and classes:
               everything in RDF is a “resource”
               “classes” are also resources, but they are also a
                collection of possible resources (i.e., “individuals”)

          “composer”, “mood” are classes

          A R Rahman is an individual

          Love is an individual







     Classes, Resources (contd.)
          Relationships are defined among classes and
           resources:
               “typing”: an individual belongs to a specific class
               “«A R Rahman» is a composer”
               to be more precise: “«http://.../#A R Rahman» is a
                composer”
               “subclassing”: all instances of one are also the
                instances of the other (“every novel is a fiction”)
          RDFS formalizes these notions in RDF







     Classes, resources in RDF(S)
          [Diagram]
               http://perceptron.net/indianmusic#A R Rahman    rdf:type           #composer
               #composer                                       rdfs:subClassOf    #artist

          RDFS defines the meaning of these terms
          A resource may belong to several classes
               rdf:type is just a property…
          “«A R Rahman» is a composer and «composer» is an
           «artist»”
          The type information may be very important for applications
               e.g., it may be used for a categorization of possible nodes






     Inferred Properties
          [Diagram]
               http://perceptron.net/indianmusic#A R Rahman    rdf:type           #composer
               #composer                                       rdfs:subClassOf    #artist

          The triple (http://perceptron.net/indianmusic#A R Rahman rdf:type #artist)
               is not in the original RDF data
               Can be inferred from the RDFS rules
               RDFS environments return that triple, too






     Classes and Sub-Classes
          Challenge:
               We have instance data about SuddaSwaras and VikritSwaras.
               We however often want to use Swaras to mean both – for example, to
                say that some Raaga has 7 Swaras
               How do we state this?

               <owl:Thing rdf:about="#ga">
                 <rdf:type rdf:resource="#SuddaSwara"/>
               </owl:Thing>

               <owl:Thing rdf:about="#ra">
                 <rdf:type rdf:resource="#SuddaSwara"/>
               </owl:Thing>

               <owl:Thing rdf:about="#ni">
                 <rdf:type rdf:resource="#VikritSwara"/>
               </owl:Thing>








     Classes and Sub-Classes
                 <rdfs:Class rdf:ID="SuddaSwara">
                         <rdfs:subClassOf rdf:resource="#Swara"/>
                 </rdfs:Class>

          Class1 rdfs:subClassOf Class2
               Class1 is a specialization of Class2, membership in
                Class1 implies membership in Class2, properties of
                Class2 are inherited by Class1
               A class can be a subclass of multiple classes
                 Class1 rdfs:subClassOf Class2
                 Class1 rdfs:subClassOf Class3

          Instances are specified using rdf:type
© Canopus Consulting




     Inference – formal rules
          The RDF Semantics document has a list of (33)
           entailment rules:
               “if such and such triples are in the graph, add this
                and this”
               apply the rules repeatedly until the graph no longer changes
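The "apply until the graph no longer changes" loop can be sketched in a few lines. This is a minimal illustration, not a full RDFS reasoner: it implements only two of the entailment rules (rdfs9 and rdfs11), triples are plain Python tuples, and the `#person` class is a hypothetical addition to the slide's example.

```python
# Triples are (subject, predicate, object) tuples; all names are illustrative.
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

def rdfs_closure(triples):
    """Apply two RDFS entailment rules until no new triple appears."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:   # rdfs11: subClassOf is transitive
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:     # rdfs9: instances belong to superclasses
                    new.add((s, TYPE, o2))
        if new <= graph:            # fixpoint: nothing was added in this pass
            return graph
        graph |= new

data = {
    ("#ARRahman", TYPE, "#composer"),
    ("#composer", SUBCLASS, "#artist"),
    ("#artist", SUBCLASS, "#person"),   # hypothetical extra class
}
inferred = rdfs_closure(data) - data
```

Here `inferred` holds only the triples that were not asserted, which is the part an RDFS environment adds on top of the original data.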




     Properties and Sub Properties
          Properties and classes are defined separately
           from each other
               A property is not owned by any class
               The range and domain of a property can be specified
                    i.e. what types of resources serve as its object and subject

          Sub Property
               rdfs:subPropertyOf is used to define one property
                as a sub-property of another
               The sub-property inherits the domain and range
                definitions of the property




     What does it look like
          [Figure: graph with regions labelled Data, Metadata and Inferred Assertions]




     Properties, domains and ranges
          It is still rdf:Property, not rdfs:Property; however, RDFS adds the notion
           of a domain and a range to an rdf:Property

          <rdf:Property rdf:ID="directed_by">
             <rdfs:domain rdf:resource="#MusicDirector"/>
             <rdfs:range rdf:resource="#FilmSongs"/>
          </rdf:Property>
               Domain: what resources the property applies to (its subjects)
               Range: what the possible values are (its objects)

          If there are 2 domain or 2 range statements, both must hold
          Range can indicate:
               rdf:resource => either a resource or a literal
               rdfs:Datatype




     Inference based on Domains and Ranges
          Challenge: data typing based on use
               We have statements about songs and who composed them, such as
                    Endaro Mahanubhavulu isComposedBy Thyagaraja
               But we do not have a direct statement saying that Thyagaraja, or anyone
                else, is an instance of a class called Composer
               Suppose we have to list all the composers in our model. What do we do?

          This is a very common pattern in transformation rules




     Inference based on Domains and Ranges
          Answer:
               Define the range of isComposedBy to be the class Composer
               RDFS inference will then automatically deduce that anything
                that appears as the object of isComposedBy is an instance
                of the class Composer

          [Figure: graph with regions labelled Metadata, Inferred Assertion and Data]

          <owl:ObjectProperty rdf:about="#isComposedBy">
               <rdfs:range rdf:resource="#Composer"/>
          </owl:ObjectProperty>
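A minimal sketch of the range rule (rdfs3 in the RDF Semantics document), assuming triples are plain Python tuples. The function name and identifiers such as `#EndaroMahanubhavulu` are illustrative, not part of any library.

```python
TYPE, RANGE = "rdf:type", "rdfs:range"

def infer_from_range(triples):
    """rdfs3: if p has range C and (s, p, o) holds, then o has rdf:type C."""
    ranges = {(s, o) for s, p, o in triples if p == RANGE}
    inferred = set()
    for s, p, o in triples:
        for prop, cls in ranges:
            if p == prop:
                inferred.add((o, TYPE, cls))
    return inferred - set(triples)   # keep only what was not asserted

data = {
    ("#isComposedBy", RANGE, "#Composer"),                       # metadata
    ("#EndaroMahanubhavulu", "#isComposedBy", "#Thyagaraja"),    # data
}
# List all composers without any explicit rdf:type statement in the data.
composers = sorted(s for s, p, o in infer_from_range(data) if o == "#Composer")
```

The list of composers falls out of how the property is used, which is the point of the slide: the range statement produces new type information rather than rejecting data.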




     Reinforcing the notion of “Schema”
          Domains and ranges are not used for
           validation; instead they are used to derive
           new information from existing information
          Does this surprise you?




     WHAT IS OWL




     Ontologies
          RDFS is useful, but does not meet every
           requirement
          Complex applications may want more
           possibilities:
               characterization of properties
               identification of objects with different URIs
               disjointness or equivalence of classes
               construction of classes, not only naming them
               letting a program reason about some terms, e.g.:
                    “if «Person» resources «A» and «B» have the same
                     «email» property, then «A» and «B» are identical”




     Ontologies (Cont.)
          The term ontology is used in this sense:
               “defines the concepts and relationships used
                to describe and represent an area of
                knowledge”
          RDFS can be considered a simple ontology
           language
          Languages should be a compromise between
               rich semantics for meaningful applications
               feasibility and implementability




     OWL - Web Ontology Language
          OWL is an extra layer, a bit like RDF Schema
               own namespace, own terms
               it relies on RDF Schema
          It is a separate recommendation




     OWL Overview
          OWL is a large set of additional terms
          For classes:
               owl:equivalentClass: two classes have the same individuals
                    Example: A:MUSIC_DIRECTOR and B:COMPOSER
               owl:disjointWith: no individuals in common
          For properties:
               owl:equivalentProperty
                    Example: A:VOCALS_BY and B:SINGER
               owl:propertyDisjointWith
          For individuals:
               owl:sameAs: two URIs refer to the same concept
                (“individual”)
               owl:differentFrom: negation of owl:sameAs




     Classes in OWL
          In RDFS, you can subclass existing classes,
           and that is all
          In OWL, you can construct classes from
           existing ones:
               by enumerating their content
               through intersection, union, complement




     OWL: Class
          owl:Class is a subclass of rdfs:Class
               More expressiveness: restrictions, set operations …

          owl:Thing
               Every resource that is an instance of a class is
                automatically a member of owl:Thing
               owl:Nothing is the empty class, the most specialized one

          Separation of classes and instances
               Though the language does not mandate it

          Class contents can be enumerated
               The class consists of exactly those individuals

          Union of classes can be defined
               Other possibilities: owl:complementOf, owl:intersectionOf




     OWL class definition

          <owl:Class rdf:about="#Composer">
                  <rdfs:subClassOf rdf:resource="#Person"/>
          </owl:Class>

          <owl:Class rdf:about="#Composition">
                  <rdfs:subClassOf rdf:resource="&owl;Thing"/>
          </owl:Class>




     Equivalence in OWL
          Equivalent classes
               If ClassA owl:equivalentClass ClassB
                    => they share the same members
                    If x belongs to ClassA then x also belongs to ClassB, and vice
                     versa
               This is equivalent to saying:
                    A rdfs:subClassOf B
                    B rdfs:subClassOf A
               A (model) design pattern: this is how to implement
                equivalence when limited to the RDFS vocabulary

          Equivalent properties
               If propertyA owl:equivalentProperty propertyB
                    If propertyA holds between resources X and Y, then propertyB
                     also holds between them




     OWL - exhaustive
          The combination of class constructions with
           various restrictions is extremely powerful

          What we have so far follows the same logic as before
               extend the basic RDF and RDFS possibilities with new
                features
               define their semantics, i.e. what they “mean” in terms of
                relationships
               expect to infer new relationships based on those

          However, a full inference procedure is hard
               not implementable with simple rule engines, for example




     Properties
          owl:ObjectProperty is used to connect a resource to
           another resource
          owl:DatatypeProperty is used to connect a resource to an
           rdfs:Literal (untyped) or an XML Schema built-in data
           type (typed) value
          Both can have sub-properties

          <owl:ObjectProperty rdf:about="#isComposedBy">
                  <rdfs:range rdf:resource="#Composer"/>
          </owl:ObjectProperty>

          <owl:DatatypeProperty rdf:about="#hasName">
                  <rdfs:range rdf:resource="&xsd;string"/>
          </owl:DatatypeProperty>




     More on properties
          A property can be the inverse of another
               hasChild / hasParent
               Usually seen in “containment” type
                associations
          A property can be symmetric: ApB => BpA
               knows, hasSpouse
               Usually abstract forms of more specialized
                properties
               Often you will find that most symmetric
                properties are super-properties of others
          A property can be asymmetric
               hasMother, greaterThan, lessThan
          A property can be transitive
               isPartOf, contains
               Generally seen between entities of similar type
          [Figure labels: Inverse Property; Transitive & Symmetric]
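What a reasoner derives from symmetric and transitive properties can be sketched as closures over (subject, object) pairs. This is an illustration only; the function names and the `#swara` / `#raaga` / `#composition` chain are hypothetical.

```python
def symmetric_closure(pairs):
    """Symmetric property: ApB => BpA."""
    return set(pairs) | {(b, a) for a, b in pairs}

def transitive_closure(pairs):
    """Transitive property: ApB and BpC => ApC, repeated until stable."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

# knows is symmetric; isPartOf is transitive (example data is made up).
knows = symmetric_closure({("#A", "#B")})
is_part_of = transitive_closure({("#swara", "#raaga"), ("#raaga", "#composition")})
```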




     More on properties
          Challenge
               We have the following statements:
                    Raaga A isJanyaRaagaOf Raaga B
                    Raaga C isJanyaRaagaOf Raaga D
                    Raaga E isMelakarthaDerivativeOf Raaga A
               We get additional information that there is
                something called JanakaRaaga such that if A is
                JanyaRaaga of B then B is JanakaRaaga of A
               I want to:
                    Get all JanakaRaagas of A
                    Get all Raagas related to A




     More on properties
          Answer
               isJanyaRaagaOf owl:inverseOf isJanakaRaagaOf
               isJanyaRaagaOf, isJanakaRaagaOf and isMelakarthaDerivativeOf
                are sub-properties of raagaRelations

          Entailed by OWL itself:
               owl:inverseOf is the inverse of itself;
                in other words, owl:inverseOf is symmetric

          It is not uncommon to find this pattern in model transformations
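The answer can be traced with a small forward-chaining sketch that handles only owl:inverseOf and rdfs:subPropertyOf. It is not an OWL reasoner; the tuple representation and the `entail` function are assumptions made for illustration.

```python
INVERSE, SUBPROP = "owl:inverseOf", "rdfs:subPropertyOf"

def entail(triples):
    """Add triples implied by owl:inverseOf and rdfs:subPropertyOf."""
    graph = set(triples)
    while True:
        inverses = {(s, o) for s, p, o in graph if p == INVERSE}
        inverses |= {(b, a) for a, b in inverses}   # inverseOf is symmetric
        supers = {(s, o) for s, p, o in graph if p == SUBPROP}
        new = set()
        for s, p, o in graph:
            for a, b in inverses:
                if p == a:
                    new.add((o, b, s))              # flip subject and object
            for sub, sup in supers:
                if p == sub:
                    new.add((s, sup, o))            # lift to the super-property
        if new <= graph:
            return graph
        graph |= new

data = {
    ("#RaagaA", "#isJanyaRaagaOf", "#RaagaB"),
    ("#RaagaC", "#isJanyaRaagaOf", "#RaagaD"),
    ("#RaagaE", "#isMelakarthaDerivativeOf", "#RaagaA"),
    ("#isJanyaRaagaOf", INVERSE, "#isJanakaRaagaOf"),
    ("#isJanyaRaagaOf", SUBPROP, "#raagaRelations"),
    ("#isJanakaRaagaOf", SUBPROP, "#raagaRelations"),
    ("#isMelakarthaDerivativeOf", SUBPROP, "#raagaRelations"),
}
graph = entail(data)
janaka_of_a = {s for s, p, o in graph
               if p == "#isJanakaRaagaOf" and o == "#RaagaA"}
related_to_a = ({o for s, p, o in graph if p == "#raagaRelations" and s == "#RaagaA"}
                | {s for s, p, o in graph if p == "#raagaRelations" and o == "#RaagaA"})
```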




     Qualifying property membership
          Film songs are those songs that have appeared
           in films

          <owl:Class rdf:ID="FilmSongs">
               <rdfs:subClassOf rdf:resource="#Songs"/>
               <rdfs:subClassOf>
                    <owl:Restriction>
                         <owl:onProperty rdf:resource="#appearedIn"/>
                         <owl:someValuesFrom rdf:resource="#Films"/>
                    </owl:Restriction>
               </rdfs:subClassOf>
          </owl:Class>




     Qualifying property membership
          owl:allValuesFrom
               If the property is used, then all values of the
                property must belong to the class
          owl:someValuesFrom
               At least one value of the property
                must belong to the class
          owl:hasValue
               Of all the values an individual has for a particular
                property, at least one must be this specific value
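The three restrictions can be read as set conditions on the values of a property. The sketch below checks them against asserted triples only, which is a closed-world simplification; an OWL reasoner works under the open-world assumption and would use restrictions to infer, not to test. All function and resource names are hypothetical.

```python
def values(triples, subject, prop):
    """All objects asserted for (subject, prop)."""
    return {o for s, p, o in triples if s == subject and p == prop}

def instances(triples, cls):
    """All resources asserted to have rdf:type cls."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}

def some_values_from(triples, subject, prop, cls):
    return bool(values(triples, subject, prop) & instances(triples, cls))

def all_values_from(triples, subject, prop, cls):
    # Vacuously true when the property is not used at all.
    return values(triples, subject, prop) <= instances(triples, cls)

def has_value(triples, subject, prop, value):
    return value in values(triples, subject, prop)

data = {
    ("#song1", "#appearedIn", "#film1"),
    ("#song1", "#appearedIn", "#concert1"),
    ("#film1", "rdf:type", "#Films"),
}
```

For `#song1`, the someValuesFrom check passes because one value is a film, while the allValuesFrom check fails because `#concert1` is not asserted to be one.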




     owl:cardinality
          owl:minCardinality is shown here; owl:cardinality and owl:maxCardinality
           are used likewise

          <owl:Class rdf:ID="FilmSongs">
               <rdfs:subClassOf rdf:resource="#Songs"/>
               <rdfs:subClassOf>
                    <owl:Restriction>
                         <owl:onProperty rdf:resource="#appearedIn"/>
                         <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
                    </owl:Restriction>
               </rdfs:subClassOf>
          </owl:Class>




     QUERYING RDF
     SPARQL Protocol And RDF Query Language




     Querying RDF
          A collection of RDF statements is a graph
               No root: no start, no finish
          What is returned to a query is a graph
               a relational query forms a new table by combining
                existing tables; an RDF query returns a new graph by
                combining information from graphs from multiple
                sources




     Queries and their responses
          Querying an RDF store such as the one we
           have just built will return exactly what is
           asserted
          Example:
               Raaga:Sreeragam rdf:type Raaga:Ghana
               Raaga:Ghana rdfs:subClassOf Raaga:CarnaticRaaga

          A query such as ?x rdf:type Raaga:CarnaticRaaga
           will return nothing, because that triple is only inferable, not asserted




     Retrieval models from RDF
          Level        Description                                   Query style
          Semantics    Meaning of data & relationships               Semantic query (e.g. SPARQL)
          Structure    A graph of triples connected to each other    Graph traversal (e.g. Squish)
          Syntax       Formats such as RDF/XML or Turtle             Syntactic query (e.g. XPath/XQuery)

          Semantics is internally represented as structure, which is serialized using syntax




     Syntactic Level Query
          <rdf:RDF ...>
               <Raaga rdf:about="#satmel">
                    <rdf:type rdf:resource="&owl;Thing"/>
                    <hasSwara rdf:resource="#ni"/>
                    <hasSwara rdf:resource="#nu"/>
                    <hasSwara rdf:resource="#pa"/>
                    <hasSwara rdf:resource="#ra"/>
                    <hasSwara rdf:resource="#sa"/>
               </Raaga>
               <owl:Thing rdf:about="#shuddhatodi">
                    <rdf:type rdf:resource="#Raaga"/>
                    <isMelakartaDerivative rdf:resource="#hanumatodi"/>
               </owl:Thing>
          </rdf:RDF>

          Select a raaga (XPath/XQuery)
               /rdf:RDF/Raaga[@rdf:about="#satmel"]
               /rdf:RDF/owl:Thing[@rdf:about="#shuddhatodi"]

               for $r in doc("music.owl")/rdf:RDF/Raaga
               where $r/@rdf:about = "#satmel"
               return $r/hasSwara

          Limitations of this approach
               Can get out of hand very quickly
               Does not understand the semantics of RDFS & OWL
               Tied to the serialized structure of the RDF (which can be expressed in many ways)




     Structural Query
          Subject                        Predicate              Object
          perceptron:satmel              rdf:type               perceptron:Raaga
          perceptron:shuddhatodi         rdf:type               perceptron:MelakarthaRaaga
          perceptron:satmel              perceptron:hasSwara    perceptron:sa
          perceptron:MelakarthaRaaga     rdfs:subClassOf        perceptron:Raaga
          …

          Possible queries
               SELECT * FROM Triples WHERE Object = 'perceptron:Raaga' AND Predicate = 'rdf:type'
               SELECT Predicate FROM Triples WHERE Subject = 'perceptron:satmel'
          Limitations of this approach
               Interprets any RDF model as just a set of triples
               Does not understand the semantics of RDFS & OWL
               E.g. looking for all raagas will miss shuddhatodi: it is asserted only to
                be a MelakarthaRaaga, and the fact that MelakarthaRaaga is a subclass of
                Raaga lies in the semantics of the last triple
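The structural queries above can be tried against an in-memory SQLite table. This is a sketch under the assumption of a three-column `Triples` table as on the slide; it uses Python's standard `sqlite3` module and is not how any particular RDF store lays out its data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Triples (Subject TEXT, Predicate TEXT, Object TEXT)")
conn.executemany("INSERT INTO Triples VALUES (?, ?, ?)", [
    ("perceptron:satmel", "rdf:type", "perceptron:Raaga"),
    ("perceptron:shuddhatodi", "rdf:type", "perceptron:MelakarthaRaaga"),
    ("perceptron:satmel", "perceptron:hasSwara", "perceptron:sa"),
    ("perceptron:MelakarthaRaaga", "rdfs:subClassOf", "perceptron:Raaga"),
])

# "All raagas" at the structural level: only explicitly typed subjects match.
raagas = [row[0] for row in conn.execute(
    "SELECT Subject FROM Triples WHERE Predicate = ? AND Object = ?",
    ("rdf:type", "perceptron:Raaga"))]

# All predicates used with a given subject.
predicates = sorted(row[0] for row in conn.execute(
    "SELECT DISTINCT Predicate FROM Triples WHERE Subject = ?",
    ("perceptron:satmel",)))
```

Note that `raagas` does not contain shuddhatodi: the table stores the rdfs:subClassOf triple, but plain SQL has no idea what it means.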




     Querying at the Semantic Level
          Need a new language that can understand the semantics of RDFS & OWL

          Sample query
               SELECT ?x FROM <perceptron.net> WHERE { ?x rdf:type perceptron:Raaga }
               All MelakarthaRaagas will also be returned even though they are not explicitly
                asserted to be Raagas

          Can draw inferences from the rules of RDFS and OWL
               Either by computing and storing the closure of a given model
               Or by inferring new statements on the fly, as needed by the query

          Not tied to how the data is stored or serialized

          Special-purpose query languages designed to facilitate this
               RQL, SPARQL, TQL
               SPARQL has emerged as the industry standard
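The "infer on the fly" option can be sketched as a type query that expands the requested class to its subclasses at query time, instead of storing the closure. The helper names below are hypothetical and only rdfs:subClassOf is handled.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

def subclasses_of(triples, cls):
    """Return cls plus every class that reaches it via rdfs:subClassOf."""
    found, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if p == SUBCLASS and o == current and s not in found:
                found.add(s)
                todo.append(s)
    return found

def instances_of(triples, cls, infer=True):
    """Answer '?x rdf:type cls', optionally using subclass semantics."""
    classes = subclasses_of(triples, cls) if infer else {cls}
    return {s for s, p, o in triples if p == TYPE and o in classes}

asserted = [
    ("perceptron:satmel", TYPE, "perceptron:Raaga"),
    ("perceptron:shuddhatodi", TYPE, "perceptron:MelakarthaRaaga"),
    ("perceptron:MelakarthaRaaga", SUBCLASS, "perceptron:Raaga"),
]
structural = instances_of(asserted, "perceptron:Raaga", infer=False)
semantic = instances_of(asserted, "perceptron:Raaga")
```

The trade-off is the usual one: a stored closure makes queries cheap but must be maintained when data changes, while on-the-fly inference keeps the store small and pays at query time.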




     Introducing SPARQL
          Designed to query collections of triples…
          …and to easily traverse relationships
          SQL-like syntax (SELECT, WHERE)
          Matches graph patterns




     SPARQL – Key Characteristics
          Graph pattern matching
               Matches on a queried graph are restricted by providing a graph pattern,
                which consists of one or more RDF triple patterns to be satisfied by the query

          Variable binding results
               Returns zero or more bindings of variables
               Each set of bindings is one way in which the queried graph satisfies the query

          Sub-graph results
               Query results can be returned as a sub-graph of the original
                graph

          Result limits
               An upper bound on the number of query results returned can be specified

          Streaming results
               The client can request that results be streamed

          WSDL support
               The protocol, including its interfaces, their operations, results and types, is
                described using WSDL




     Sample Model Fragment
          [Figure: sample model fragment]

          PREFIX perceptron: <http://www.perceptronnetwork.com/ontologies/2010/...>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

          SELECT ?raaga
          WHERE { ?raaga rdf:type perceptron:Raaga }




     Let us take a closer look
          PREFIX perceptron: <http://www.perceptronnetwork.com/ontologies/2010/...>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

          SELECT ?raaga
          WHERE { ?raaga rdf:type perceptron:Raaga }

          PREFIX
               Defines an alias for a namespace
          SELECT
               Marks a select query and lists the variables to return
          FROM
               Optional clause; specifies the URI of the model to query
          Variables
               Marked by either ? or $
          WHERE
               Usually the most significant part of a SPARQL query
               Each triple pattern is one condition (a filter on the graph), written in Turtle-like syntax
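How a WHERE clause turns triple patterns into variable bindings can be sketched with a naive matcher. This is an illustration of basic graph pattern matching only, not a SPARQL engine; `match` and `query` are hypothetical helpers, and terms starting with `?` are treated as variables.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def query(graph, patterns):
    """Return every set of bindings that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings

graph = [
    ("perceptron:satmel", "rdf:type", "perceptron:Raaga"),
    ("perceptron:satmel", "perceptron:hasSwara", "perceptron:sa"),
    ("perceptron:satmel", "perceptron:hasSwara", "perceptron:pa"),
    ("perceptron:sa", "rdf:type", "perceptron:Swara"),
]
results = query(graph, [
    ("?raaga", "rdf:type", "perceptron:Raaga"),
    ("?raaga", "perceptron:hasSwara", "?swara"),
])
```

Each dictionary in `results` is one way the graph satisfies the pattern, which corresponds to one row of a SELECT result.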




     SPARQL Select
          Challenge
               The number of Raagas is huge; paginate the results

          PREFIX perceptron: <someURI>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

          SELECT DISTINCT ?s FROM <rmi://localhost/server1#perceptron>
          WHERE { ?s rdf:type perceptron:Raaga }
          ORDER BY DESC(?s)
          LIMIT 2 OFFSET 2

          DISTINCT: removes duplicate results
          ORDER BY: sorts the results; can be ASC or DESC, and must precede LIMIT and OFFSET
          LIMIT n: limits the returned values to n rows
          OFFSET n: skips the first n rows
               LIMIT and OFFSET together can be used to paginate the results
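The effect of DISTINCT, ORDER BY, OFFSET and LIMIT on a result list can be mimicked with sorting and slicing. The `page` helper and the raaga identifiers below are made up for illustration.

```python
def page(results, limit, offset, descending=False):
    """Mimic DISTINCT + ORDER BY + OFFSET + LIMIT on a list of values."""
    ordered = sorted(set(results), reverse=descending)  # DISTINCT, then ORDER BY
    return ordered[offset:offset + limit]               # skip offset rows, keep limit rows

raagas = ["#kalyani", "#mohanam", "#todi", "#bhairavi", "#mohanam"]
first_page = page(raagas, limit=2, offset=0, descending=True)
second_page = page(raagas, limit=2, offset=2, descending=True)
```

Sorting before slicing matters: without a stable order, consecutive pages could overlap or skip rows.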




     Example of a Complex Query
          Challenge
               Search for the top 10 songs (by popularity) in my favorite Raaga

          SELECT ?rendition
          FROM <rmi://localhost/server1#perceptron>
          FROM NAMED <my-preferences.rdf>
          FROM NAMED <popularity.rdf>
          WHERE {
               GRAPH <my-preferences.rdf> {
                    ?me dc:name "My Name" .
                    ?me prefs:favouriteRaga ?fav_raga .
               } .

               ?composition perceptron:hasRaaga ?fav_raga .
               ?rendition perceptron:hasComposition ?composition .

               GRAPH <popularity.rdf> {
                    ?rendition pop:popularity ?popularity
               }
          }
          ORDER BY DESC(?popularity) LIMIT 10




     How this Query Would Work
          [Figure: the query variables bound against three graphs]
               My-favourites.rdf: Person:XXX (?me) has dc:name “My Name” and
                prefs:favouriteRaaga Mohanam (?fav_raga)
               perceptron.owl: Chandracharita (?composition) perceptron:hasRaaga Mohanam;
                rend-chandracharita-bmk (?rendition) perceptron:isRenditionOf Chandracharita
               popularity.rdf: the rendition has a pop:popularity value (?popularity)




     Additional SPARQL constructs
          INSERT
               INSERT DATA { triples block }
          DELETE
               DELETE DATA { triples block }
          OPTIONAL
               Defines an optional condition
          UNION
               Combines alternative patterns
          FILTER
               Value-based filters
          ASK
               Returns true or false depending on whether the condition matches
          DESCRIBE
               Returns a graph describing the resources that match the condition




     SEMANTIC WEB INFRASTRUCTURE
     Technologies & Tools




     Objective
          How do we build an application?
          How do we build the ontology?
          What are the key architecture components?
          What are the tools & technologies to use?
          How do I choose which technology to use?




       Semantic Web Application Lifecycle
            Information Modelling
                 Build the ontology (model-level representation)

            Information Assimilation
                 Populate the knowledge base from various sources,
                  including current applications
                 Automatic semantic annotation of existing data
                 Any type of document, multiple sources of documents

            Information Retrieval
                 Applications: search, integrate/portal, summarize/
                  explain, analyse, decision support
                 Reasoning techniques: graph analysis, inferencing




       Semantic Web Application Lifecycle
            [Figure: lifecycle cycle around a Semantic Query Server]
                 Stages: Build Information Model; Create Assimilation Models & Aggregate
                  Knowledge; Retrieve and Use Semantic Data; Refine/Evolve Information Model
                 Ontology editors: Protégé, TopBraid Composer
                 Technologies: GRDDL, RDFizers, OWLs, automatic annotation
                 RDF stores: Mulgara, Sesame
                 Programming: Jena




     Information Modeling
          The information model consists of
               Description component: the schema
                    Designed by domain experts and the community
               Description base: assertions and extensions
                    Populated by automated agents that assimilate the information

          Modelling is an evolutionary process
               Start with a concept map or a taxonomy
               Lead up to a formal ontology
               The model evolves over the lifetime of the application

          Tools & Technologies
               Popular modeling tools: Protégé, TopBraid Composer, GrOWL
               Concept mapping: CMAP Tools, CMAP Tools COE
               http://www.xml.com/2002/11/06/Ontology_Editor_Survey.html




     PERSISTING RDF DATA
     RDF Stores, Triple Stores




     Storage Models for RDF
          SQL Storage (hand-crafted SQL, ORM)
               Direct SQL access to RDF data
               Queries become complex, as discussed earlier
               Cannot completely deal with the semantics of the data

          SQL Bridge (tools like SDB, Squirrel RDF)
               Access via a SPARQL interface
               An adaptor transforms the SPARQL query into SQL
               Data is stored in a relational schema

          Native RDF (native non-relational DB, such as Mulgara)
               Access via a SPARQL interface
               Data is stored in native RDF databases




     Storage Models for RDF
                         SQL Storage              SQL Bridge            Native RDF
          Client         Application              Application           Application
          Query          SQL queries              SPARQL queries        SPARQL queries
          Translation    (none)                   Adaptor               (none)
          Storage        Relational schema or     Relational triples    Native RDF graph
                          relational triples




     Storage Models for RDF
          [Same three-stack diagram, annotated with two labels: "Traditional approach"
           next to the SQL Storage stack and "RDF Databases" between the SQL Bridge
           and Native RDF stacks]







     Storing RDF Data – Relational Model
    Relational Schema
         A data model consisting of well-defined tables and columns, each with a fixed meaning
         To provide flexibility, we would have to use techniques that dynamically
          update the schema
         This still makes every row alike, whereas the fundamental premise of RDF is
          that each row is potentially unique




         Storing RDF Data – Triple forms
         Storing RDF predicates in a relational database
              Use the attributes as “extended properties”
              Simple triple forms (subject – predicate – object)
              Predicate tables (one table per predicate)
         Gives the requisite flexibility but makes indexing & retrieval difficult
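
A minimal in-memory sketch of two of the layouts named above, assuming simplified string terms: a single triple table, and one table per predicate. The class name, the song: and person: prefixes and the sample data are assumptions for illustration; the sketch shows why a new predicate needs no schema change, not how any particular store is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: two ways of laying out RDF triples in relational-style tables. */
class TripleLayouts {

    /** Simple triple form: every statement is one row of (subject, predicate, object). */
    final List<String[]> tripleTable = new ArrayList<>();

    /** Predicate tables: one two-column table of (subject, object) rows per predicate. */
    final Map<String, List<String[]>> predicateTables = new LinkedHashMap<>();

    void add(String subject, String predicate, String object) {
        tripleTable.add(new String[] {subject, predicate, object});
        // A predicate seen for the first time simply gets a new table; no existing table changes.
        predicateTables.computeIfAbsent(predicate, p -> new ArrayList<>())
                       .add(new String[] {subject, object});
    }

    /** Without an index, a lookup in the triple table scans every row. */
    List<String> objectsViaTripleTable(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (String[] row : tripleTable) {
            if (row[0].equals(subject) && row[1].equals(predicate)) {
                result.add(row[2]);
            }
        }
        return result;
    }

    /** The same lookup against predicate tables scans only the rows of one predicate. */
    List<String> objectsViaPredicateTable(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (String[] row : predicateTables.getOrDefault(predicate, new ArrayList<>())) {
            if (row[0].equals(subject)) {
                result.add(row[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TripleLayouts store = new TripleLayouts();
        store.add("song:JaiHo", "p:composer", "person:ARR");
        store.add("song:JaiHo", "p:title", "Jai Ho");
        System.out.println(store.objectsViaTripleTable("song:JaiHo", "p:title"));
        System.out.println(store.objectsViaPredicateTable("song:JaiHo", "p:composer"));
    }
}
```

Both layouts answer the same question; they differ in how many rows must be examined and in how many tables a multi-predicate query has to join.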





         Storing RDF Data – RDF Stores
        RDF Stores
             Provide a mechanism to store RDF data
             Provide a mechanism to query it using languages such as SPARQL
             Internally may or may not be based on relational databases
             Also known as graph databases or schema-less databases
                 Although not all graph databases support RDF

        Examples shown: Jena SDB and TDB, Oracle 11g RDF Database, Mulgara Semantic Store,
        OpenLink Virtuoso, AllegroGraph, OpenRDF Sesame





     Programming APIs, Technology Stack
     PROGRAMMING FOR THE SEMANTIC WEB








     Let us Revisit SPARQL
          SPARQL
               SPARQL Protocol and RDF Query Language
               It is both a protocol and a query language

          Protocol Definition
               Defines a mechanism for invoking a query over the web
               Equivalent to a web service definition
               Defines one operation
                      Query, with its input, output and fault definitions
               Defines the protocol bindings
                      HTTP (GET/POST)
                      SOAP (web service)
                      An implementation may provide either or both of these bindings
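
As a rough illustration of the HTTP GET binding, the sketch below builds the request URL for a query. The endpoint address is a placeholder and the class name is hypothetical; the parameter names query and default-graph-uri are the ones the SPARQL protocol defines for its HTTP binding.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative only: builds the URL for a SPARQL protocol query sent over HTTP GET. */
class SparqlGetRequest {

    /**
     * The query text travels in a parameter named "query" and an optional default graph
     * in "default-graph-uri". Both values are URL-encoded before being appended.
     */
    static String buildUrl(String endpoint, String query, String defaultGraphUri) {
        StringBuilder url = new StringBuilder(endpoint);
        url.append("?query=").append(URLEncoder.encode(query, StandardCharsets.UTF_8));
        if (defaultGraphUri != null) {
            url.append("&default-graph-uri=")
               .append(URLEncoder.encode(defaultGraphUri, StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Placeholder endpoint, not a real service.
        String endpoint = "http://example.org/sparql";
        String query = "SELECT ?x WHERE { ?x <p:isDerivedFrom> <p:Kalyaani> }";
        System.out.println(buildUrl(endpoint, query, null));
    }
}
```

Because the request is an ordinary URL, any HTTP client can invoke the query operation without a store-specific library.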




      Architecture Stack of Semantic Technologies
           [Diagram: an Application and Semantic Middleware (e.g. Semantic SOA) sit
            alongside the Semantic Technology Stack, whose layers from top to bottom are:]
           HTTP and SOAP bindings
           Programming API
           SPARQL Processor and Inference Engine
           RDF-SQL Adaptor
           Relational Store and RDF Store







      RDF Programming Stack - Jena
        The most popular stack for Java is Jena (http://jena.sourceforge.net)
             Jena's APIs for RDF, RDFS and OWL are among the most popular
             It also has an in-memory graph manipulation API
             It uses a graph-based model and a pluggable architecture
             Various storage back ends, reasoners etc. can be plugged into the API
             Many RDF stores either have, or are building, support for the Jena API




     Jena uses a graph model as its core
     All reasoners also assert their inferences into a graph
     Thus the output of one reasoner can be fed into another layer
     Such layering is a common pattern with Jena
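
The layering pattern can be illustrated without any Jena classes. The sketch below is plain Java over sets of triples and is not the Jena API; the two toy "reasoners", the ex: names and the sample data are all assumptions made for illustration. Each layer takes a graph and returns a graph, so layers compose.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: reasoners as functions from graph to graph, stacked in layers. */
class LayeredInference {

    static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    /** Layer 1: if (x rdf:type C) and (C rdfs:subClassOf D), also assert (x rdf:type D). */
    static Set<List<String>> subClassLayer(Set<List<String>> graph) {
        Set<List<String>> out = new LinkedHashSet<>(graph);
        boolean changed = true;
        while (changed) {  // repeat until no new triple appears, to follow subclass chains
            changed = false;
            for (List<String> t : List.copyOf(out)) {
                if (!t.get(1).equals("rdf:type")) {
                    continue;
                }
                for (List<String> sc : List.copyOf(out)) {
                    if (sc.get(1).equals("rdfs:subClassOf") && sc.get(0).equals(t.get(2))) {
                        changed |= out.add(triple(t.get(0), "rdf:type", sc.get(2)));
                    }
                }
            }
        }
        return out;
    }

    /** Layer 2: a toy application rule that works on whatever graph it is given. */
    static Set<List<String>> playlistLayer(Set<List<String>> graph) {
        Set<List<String>> out = new LinkedHashSet<>(graph);
        for (List<String> t : graph) {
            if (t.get(1).equals("rdf:type") && t.get(2).equals("ex:Song")) {
                out.add(triple(t.get(0), "ex:suitableFor", "ex:Playlist"));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<List<String>> base = new LinkedHashSet<>();
        base.add(triple("ex:FilmSong", "rdfs:subClassOf", "ex:Song"));
        base.add(triple("ex:JaiHo", "rdf:type", "ex:FilmSong"));

        // The output graph of the first layer is the input graph of the second.
        Set<List<String>> result = playlistLayer(subClassLayer(base));
        System.out.println(result.contains(triple("ex:JaiHo", "ex:suitableFor", "ex:Playlist")));
    }
}
```

The second layer only fires because the first layer has already asserted that ex:JaiHo is an ex:Song; applied to the base graph alone it would add nothing.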








     Components of Jena Stack
          [Diagram: the Semantic Technology Stack annotated with Jena components]
          Joseki: HTTP and SOAP bindings
          Jena API: Programming API
          ARQ: SPARQL Processor (next to the Inference Engine)
          SDB (abstraction): RDF-SQL Adaptor over a Relational Store
          TDB: RDF Store







     Connecting to RDF Database – RDF Programming APIs
         RDF programming APIs are available in almost all major languages
               C, C++, C# and .NET, Haskell, Java, JavaScript, Lisp,
               Obj-C, PHP, Perl, Prolog, Python, Ruby, Tcl/Tk

         There is no standard client API (yet!) equivalent to JDBC
               Each RDF package has its own set of APIs
               Most often they follow similar paradigms of Connections, Factories, Sessions

         The query syntax, however, is standardized
               Thus queries are portable across RDF stores
               At the same time, many packages have their own “dialects” of RDF query
                languages

         Mulgara is one of the most popular open source RDF databases
               Has its own API to connect to the RDF database
               In addition to SPARQL, has its own dialect (called iTQL)






     Connecting to Mulgara Database
       // Sample Java code to connect to a Mulgara database using the Mulgara client API.
       // Requires java.net.URI plus the Mulgara client classes (imports not shown on the slide).
       URI SERVER_URI = URI.create("rmi://servername/instancename");
       URI GRAPH_URI  = URI.create("rmi://servername/instancename#model");
       String queryStr = "SELECT ?x WHERE { ?x <p:isDerivedFrom> <p:Kalyaani> }";

       // Declared outside the try block so that the finally block can see it
       Connection connection = null;
       try {
           // Create a new connection from the factory using the server URI
           ConnectionFactory factory = new ConnectionFactory();
           connection = factory.newConnection(SERVER_URI);
           // Initialize the SPARQL interpreter with the graph URI
           SparqlInterpreter interpreter = new SparqlInterpreter();
           interpreter.setDefaultGraphUri(GRAPH_URI);
           // Parse and execute the query on the connection
           Query query = interpreter.parseQuery(queryStr);
           Answer a = connection.execute(query);
           // Use the results
           RdfXmlEmitter.writeRdfXml((GraphAnswer) a, System.out);
       }
       catch (Exception e) {
           e.printStackTrace();
       }
       finally {
           // Release the connection only if it was actually opened
           if (connection != null) {
               connection.dispose();
           }
       }





     Adopting Existing SQL Databases
          [Figure not reproduced. Source: Tim Berners-Lee]






     Semantic Web Technologies
          [Figure not reproduced. Source: W3C]


     APPENDIX








     RDF Triples
          Resources can use any URI, e.g.:
               http://www.example.org/file.xml#element(home)
               http://www.example.org/file.html#home
               http://www.example.org/file2.xml#xpath1(//q[@a=b])

          URIs can also denote non-Web entities:
               http://www.ivan-herman.net/me is me
               not my home page, not my publication list, but me
          RDF triples form a directed, labelled graph
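
One way to picture the directed, labelled graph is as an adjacency structure: each triple adds an edge from subject to object, labelled with the predicate. The sketch below is a minimal illustration; the class name, the example.org URIs and the p: prefix are placeholders rather than anything defined by RDF.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: RDF triples viewed as a directed graph whose edges carry predicate labels. */
class TripleGraph {

    // subject node -> (edge label -> target nodes)
    private final Map<String, Map<String, List<String>>> edges = new LinkedHashMap<>();

    /** Each triple adds one edge from subject to object, labelled with the predicate. */
    void add(String subject, String predicate, String object) {
        edges.computeIfAbsent(subject, s -> new LinkedHashMap<>())
             .computeIfAbsent(predicate, p -> new ArrayList<>())
             .add(object);
    }

    /** Follows the edges with the given label that leave the given node. */
    List<String> follow(String node, String label) {
        return edges.getOrDefault(node, Map.of()).getOrDefault(label, List.of());
    }

    public static void main(String[] args) {
        TripleGraph g = new TripleGraph();
        // Placeholder URIs; the object of p:name is a literal, so no edges leave it.
        g.add("http://www.example.org/music#JaiHo", "p:composer", "http://www.example.org/music#ARR");
        g.add("http://www.example.org/music#ARR", "p:name", "A R Rahman");
        // Walk two edges: song -> composer -> name.
        for (String composer : g.follow("http://www.example.org/music#JaiHo", "p:composer")) {
            System.out.println(g.follow(composer, "p:name"));
        }
    }
}
```

Edges only run from subject to object, which is what makes the graph directed; a query in the other direction needs a reverse index or a scan.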








     RDF Primitives
          Resources
               Anything that can be uniquely identified
               Using a URI
               E.g. http://www.perceptron.net/indianmusic#A R Rahman
                    (the part up to the # is the namespace; the part after it is the fragment ID)

          Properties
               Are resources in themselves

          Literals
               Strings
               With optional data types

          RDF is a general model for triples








     OWL Full
          No constraints on any of the constructs
               owl:Class is just syntactic sugar for rdfs:Class
               owl:Thing is equivalent to rdfs:Resource
               this means that:
                    a class can also be an individual, and a URI can denote a
                     property as well as a class
                          e.g., it is possible to talk about a class of classes and to apply
                           properties to them

          Extension of RDFS in all respects
          But: no system may exist that infers
           everything one might expect





     OWL Full usage
        Nevertheless, OWL Full is essential
              it gives a generic framework to express many things
              some applications just need to express and interchange
               terms
        Applications may control what terms are used
         and how
              in fact, they may define their own sub-language via,
               e.g., a vocabulary
              thereby ensuring a manageable inference procedure





Web3.0 seminar wipro-session2-logicalontological

  • 1. 24-06-2010 Web 3.0, Semantics & Session II – Enterprise Computing Logical Ontological Satish, Sukumar, Feroz Sheikh & Venkatesh S www.canopusconsulting.com June 2010 Objective  Information interchange and modelling  An introduction to semantic web technologies  RDF, RDFS, OWL 2  Semantic modelling  Querying  The web 3.0 technology stack © Canopus Consulting 1
  • 2. 24-06-2010 The following Sessions will address:  How do we build an application?  How do we build the ontology?  What are the key architecture components?  What are the tools & technologies to use? How do I choose which technology to use? 3  © Canopus Consulting Semantic Web Application Lifecycle Ontology Editors: Protégé, TopBraid Composer 4 Build Information Model Semantic Query Server Refine/Evolve Information Model Create Assimilation Models & Aggregate knowledge RDF Stores: Mulgara, Sesame Technologies: GRDDL, RDFizers, Programming: Jena OWLs, Automatic Annotation Retrieve and Use Semantic Data © Canopus Consulting 2
  • 3. 24-06-2010 Semantic Web Application Lifecycle  Information Modelling  Build Ontology (model level representation)  Information Assimilation  Populate Knowledgebase from various sources 5  Including current applications  Automatic Semantic Annotation of existing data  Any type of document, multiple sources of documents  Information Retrieval  Applications: search, integrate/portal, summarize/ explain, analyse, decisions support  Reasoning techniques: graph analysis, inferencing © Canopus Consulting Architecture Stack of Semantic Technologies Application HTTP SOAP Programming API Semantic Middleware 6 e.g. Semantic SOA SPARQL Processor Inference Engine RDF-SQL Adaptor Relational RDF Store Store Semantic Technology Stack © Canopus Consulting 3
  • 4. 24-06-2010 Semantic Web Technologies 7 Source: W3C © Canopus Consulting The Perceptron.Net Use case  A rich Cultural Informatics environment designed to  Create, Collect, Categorize any type of cultural artifact – Music, Literature, Travel, Leisure, 8 Entertainment..  Communities can be formed around content  Make use of existing information on the network and existing community infrastructure  An example:  Indian Music cannot be categorized along the same lines as Western Music  Genre, Album, Artist – is just not sufficient… © Canopus Consulting 4
  • 5. 24-06-2010 The Perceptron.Net use case…  Typical Queries we want to support:  Thematic Album Creation Ability:  Give me all songs that are directed by X, and music composed by “y” and hero was “z” 9  Give me all songs in Raga Kalyani – (must include film, folk and classical songs)  Give me all songs in Lord Rama in Sanskrit, which are “stotras”…  Give me all the recordings of live performacnes at Sri Krishna Gana Sabha, Chennai © Canopus Consulting The Perceptron.Net use case…  Provide an exploratory interface:  Specify a generic criteria and successively filter until you find what you need. E.g: specify a “mood” or a song you like and ask for “similar” songs or songs that match such a mood.  Allow community to add content, meta-data and find new connections in the content. 10  Content can be anywhere on the Internet  Raaga.com, HamaraCd.com, MusicToday.Com, Orkut groups, blogs, websites  Not only music, but include content “about” music – articles, essays, ratings, discussions – which should be used in connecting the content, in searching the content, in enriching the content  Provide feeds such that facebook type plug-in can be developed easily – so that content and queries can be shared/updated from anywhere. © Canopus Consulting 5
  • 6. 24-06-2010 11 © Canopus Consulting Practical Problem 12 LET US CREATE A COMPILATION OF AMITABH BACHAN’S FILM SONGS © Canopus Consulting 6
  • 7. 24-06-2010 We start with a music site Or HamaraCD or India Times Shopping or … And we go through a fairly exhaustive categorization … 13 By actor By album name Release year © Canopus Consulting Or go to a site with a filmography And drill down to some of our favorite ones 14 © Canopus Consulting 7
  • 8. 24-06-2010 But what if we wanted to  Get all songs for Amitabh Bachchan that  Are from movies directed by Yash Chopra? 15 © Canopus Consulting Or what if we wanted to Get all songs of Amitabh Bachan in which the playback singer won an award. 16 © Canopus Consulting 8
  • 9. 24-06-2010 We could do this  By hand  Make a list of movies from the filmography site  Go to Raaga and get songs for those movies  But what if there are multiple sources? 17 © Canopus Consulting Across Multiple Sources 18 © Canopus Consulting 9
  • 10. 24-06-2010 The problem  Implicit to Explicit  Logic & domain knowledge is “Implicit” in an application  In Code, DB Schema, Documentation etc.  Humans can interpret it, but automated agents can’t List<SongList>: getSongsInAlbum(); 19 SongID Sequence Duration Yaara Seeli 1 4:50 Seeli Dekha Ek 2 2:55 Khwab toh … … … For agents, this logic needs to be “Explicitly” stated © Canopus Consulting To enable agents,  Extract the Logic as Metadata  The logic & domain knowledge is embedded in the application  In object models, class hierarchies, instance names  This needs to be extracted out, for others to link, access and use  Deliver both data and metadata in a uniform manner 20 © Canopus Consulting 10
  • 11. 24-06-2010 We need  Common Language  Agents need a common language to interchange information  Derive Conclusions  Logic, that is now explicitly stated, can be used to draw conclusions 21  Allow Independent Evolution  Every application should be able to evolve independently  Both at data level (new data) or schema level (new knowledge)  Support Incremental Assimilation  Different source can contain/provide different aspects of knowledge  Community participation to evolve the knowledge  Manifests itself in the principle of AAA (Anyone can say anything about any topic) © Canopus Consulting Steps In Information Interchange 1. Map the various data onto an abstract data representation  Make the data independent of its internal representation… 22  Expose it in a standard format 2. Merge the resulting representations to create a single model 3. Make queries on the whole  Queries that could not have been done on the individual data sets © Canopus Consulting 11
  • 12. 24-06-2010 Steps In Information Interchange 23 © Canopus Consulting Enabling Technologies  Semantic technologies (specifically RDF) treat metadata as data and exchange both in exactly the same way.  They provide a way for anyone to make a basic 24 statement about anything and a means of layering these statements into a model. © Canopus Consulting 12
  • 13. 24-06-2010 Interchange Without APIs  If the information provider can now produce an RDF stream of data and metadata, the assimilation agent can combine streams from all sources and treat them as if they are from 25 the same source  The only thing the agent needs to understand the model of RDF, and not the individual application APIs and models  This is the fundamental premise of information interchange on the web 3.0. © Canopus Consulting First Expose the Data  As a set of relations  Using RDF Jai Ho p:title http://.../#JaiHo p:movie Slumdog… p:composer 26 http://.../#ARR p:name AR Rehman Song Data © Canopus Consulting 13
  • 14. 24-06-2010 Second Merge Data from different Sources p:title Same URI = Same URI = http://.../#JaiHo Jai Ho Same Same p:movie Resource Resource Slumdog… p:composer 27 http://.../#JaiHo http://.../#ARR p:sang p:name http://.../#ARR AR Rehman http://.../#JaiHo -ARR p:performed_at p:music_director Song Data http://.../#ARR Performances Data © Canopus Consulting Merging Identical Resources p:title http://.../#JaiHo Jai Ho p:movie Slumdog… p:composer 28 http://.../#JaiHo http://.../#ARR p:sang p:name http://.../#ARR AR Rehman http://.../#JaiHo -ARR p:performed_at p:music_director Song Data http://.../#ARR Performances Data © Canopus Consulting 14
  • 15. 24-06-2010 Third Query the data  As if they are from the same source  Answer questions that were not possible from either source alone  For example  What is the title of the song sung by AR Rehman at Fireflies Music 29 Festival p:title http://.../#JaiHo Jai Ho p:movie p:sang Slumdog… http://.../#FFM p:composer http://.../#JaiHo -ARR p:performed_at http://.../#ARR p:music_director p:name AR Rehman http://.../#ARR © Canopus Consulting Fourth Adding Additional Knowledge  We can assert that the composer & music_director are same  p:composer sameAs p:music_director  Very handy for model transformations, language translations, de- duplication etc. Now we can ask queries not possible with either of the sources “Composers” of songs played at Fireflies Music Festival 30 p:title http://.../#JaiHo Jai Ho p:movie p:sang Slumdog… http://.../#ARR p:composer http://.../#JaiHo -ARR p:performed_at http://.../#ARR p:music_director p:name AR Rehman http://.../#ARR © Canopus Consulting 15
  • 16. 24-06-2010 Semantic Web Technologies Common Language Reasoning & Inferences Assimilation & Retrieval RDF RDFS RDF DBs OWL SPARQL 31 RDF RDFS OWL SPARQL • Resource • RDF Schema - • Web Ontology • Protocol and Description Provides basic Language - RDF query Framework - elements for adds language defines the semantics to structure to description of the schema triples. RDF vocabularies © Canopus Consulting Expressing Knowledge  Explicit Knowledge – Semantic Models - Ontology  Formal representation of the knowledge by a set of concepts within a domain and the relationships between those concepts  It is used to reason about the properties of that domain  And may be used to describe the domain 32 © Canopus Consulting 16
  • 17. 24-06-2010 Mapping to Information Interchange 33 Source: W3C © Canopus Consulting Resource Descriptor Framework 34 WHAT IS RDF? © Canopus Consulting 17
  • 18. 24-06-2010 Paradigms of Information: Spec for RDF  AAA - Anyone can say Anything about Anything  Consistency is not a necessary condition  For example, source 1 – Yaara Sili Sili is a sad song  Source 2 – Yaara Sili Sili is a famous song  Community may provide music theory attributes to the song 35  To each their own  No common schema, yet the ability to make globally (valid) statements  For example, information about raagas can be combined with information on film music  There is always one more  Open world assumption - facts can and are always incrementally added  For example, one source may classify Lalbagh as a tourist place  Another source may classify Lalbagh as a garden  Application understands that gardens are suitable for “morning walk” © Canopus Consulting Sample Data Song name Raaga Sung by Duration Nanu palimpa Mohanam Dr 42 mins Balamuralikrishna Varuga varugave Mohanam MS 8 mins Subbulakshmi Kallalo Kannulalo Kalyani Leela 6 mins 36 Pranati Pranati Kalyani S P Balu 16 mins Piluvukara Hindolam Ghantasala 6.3 mins Alugukara © Canopus Consulting 18
  • 19. 24-06-2010 Integrating Data from different sources Source 1 Pranati Pranati Kalyani S P Balu 16 mins Source 2 37 Kallalo Kannulalo Kalyani Leela 6 mins Source 3 Varuga varugave Mohanam MS 8 mins Subbulakshmi Data distribution row by row – all participants must agree to the common schema © Canopus Consulting Interchange by Column Source 2 Source 1 id Sung by id Song name Raaga 1 Dr Balamuralikrishna 1 Nanu palimpa Mohanam 2 M S Subbulakshmi 2 Varuga varugave Mohanam 3 Leela 3 Kallalo Kannulalo Kalyani 4 S P Balu 38 4 Pranati Pranati Kalyani 5 Ghantasala 5 Piluvukara Hindolam Alugukara  Distributing by column  Each participant must agree to the unique identifier © Canopus Consulting 19
  • 20. 24-06-2010 Interchange by Cell Source 1 Sung by Row 1 Dr Balamuralikrishna Source 2 39 Song name Source 3 Row 1 Nanu palimpa Raaga Row 1 Mohanam  Distributing by cell  Each participant must agree to row ID & column name © Canopus Consulting The Cell  Cell by cell division allows us to do just that  Row ID and column needs to be identified 40 This is what RDF is -> TRIPLES of Data Subject –> Predicate -> Object Subject and Predicate have unique identifiers, object can be a literal or identifier © Canopus Consulting 20
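The cell-by-cell interchange described above can be sketched in a few lines of Python. This is only an illustration: the `BASE` namespace and the way row IDs and column names are turned into identifiers are assumptions made for the example, not something the slides prescribe.

```python
# Hypothetical namespace; the slides do not give real URIs for the sample table.
BASE = "http://example.org/music/"

rows = [
    {"id": 1, "Song name": "Nanu palimpa", "Raaga": "Mohanam",
     "Sung by": "Dr Balamuralikrishna"},
    {"id": 2, "Song name": "Varuga varugave", "Raaga": "Mohanam",
     "Sung by": "M S Subbulakshmi"},
]


def row_to_triples(row):
    """Turn one table row into (subject, predicate, object) cells."""
    subject = BASE + "row/" + str(row["id"])      # the row ID becomes the subject
    triples = []
    for column, value in row.items():
        if column == "id":
            continue
        predicate = BASE + "column/" + column.replace(" ", "_")  # column name becomes the predicate
        triples.append((subject, predicate, value))              # cell value is a literal object
    return triples


triples = [t for row in rows for t in row_to_triples(row)]
for s, p, o in triples:
    print(s, p, repr(o))
```

Each printed line is one cell of the table, identified by an agreed row ID and column name, which is the subject, predicate, object shape the slide calls a triple.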
  • 21. 24-06-2010 RDF Expressions – Triples  Triples - Subject, Predicate, Object  Subject  Must be a Resource  Predicate 41  Must be a Resource  Object  Can be a Resource or a Literal Resource  <subject> has a property <predicate>, whose value is <object>  A labelled connection between two resources  E.g. Song: Jai Ho has a property composer whose value is A R Rahman  E.g. MelakartaRaaga has a property NumberOfSwaras whose value is 7 Resource Resource Literal © Canopus Consulting Rules of RDF  Global Uniqueness  The RDF URI and names must be unique globally  Sentence Form  The order of knowledge representation in the sentence 42 should not change  So the autonomous agents can consume that metadata  Reuse  If a document refers to an existing resource, then it is talking about that same global resource © Canopus Consulting 21
  • 22. 24-06-2010 Result Anybody can say Every statement is an atomic RDF Anything about Anything sentence To each his own Every subject, object and predicate in RDF is qualified 43 There is always one more RDF is a graph, no begin and no end, An additional statement can always be added to the graph © Canopus Consulting More rdf  RDF blank nodes and their usage  Identify an abstract concept – “there exists some”  My ideal friend  Id is always local to the document, need not be the 44 same  Reification  RDF collections  Bag – unordered collection  Seq – ordered collection  Alt – unordered set of equivalent alternatives © Canopus Consulting 22
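A rough way to picture "there is always one more" is to treat each source as a set of triples and merge by set union. The sketch below reuses the Lalbagh example from the earlier slide; the `EX` namespace and class names are hypothetical.

```python
EX = "http://example.org/places#"   # hypothetical namespace for illustration
RDF_TYPE = "rdf:type"

# Two independent sources describe the same resource differently.
source_1 = {(EX + "Lalbagh", RDF_TYPE, EX + "TouristPlace")}
source_2 = {(EX + "Lalbagh", RDF_TYPE, EX + "Garden")}

# Merging graphs is a set union: no schema negotiation is needed,
# only agreement on the identifier of the thing being described.
merged = source_1 | source_2

lalbagh_types = sorted(o for s, p, o in merged
                       if s == EX + "Lalbagh" and p == RDF_TYPE)
print(lalbagh_types)
```

Adding the same statement twice changes nothing, and adding a new statement never invalidates the old ones, which is the behaviour the open world assumption relies on.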
  • 23. 24-06-2010 rdf:type  Rdf:type is a property that provides an elementary typing system  song:yaara-seeli-seeli rdf:type HindiSongs:FilmSong 45  Does not make any assumption that HindiSongs:FilmSong is a class – it is a resource  Rdf does not have a definition for class © Canopus Consulting Bringing in the meaning 46 WHAT IS RDFS © Canopus Consulting 23
  • 24. 24-06-2010 Need for RDF schemas  First step towards the “extra knowledge”:  define the terms we can use  what restrictions apply  what extra relationships are there? 47  Officially: “RDF Vocabulary Description Language”  the term “Schema” is retained for historical reasons… Source: W3C © Canopus Consulting Classes, Resources  RDFS defines resources and classes:  everything in RDF is a “resource”  “classes” are also resources, but they are also a collection of possible resources (i.e., “individuals”) 48  “composer”, “mood” are classes  A R Rahman is an individual  Love is an individual © Canopus Consulting 24
  • 25. 24-06-2010 Classes, Resources (contd.)
    - Relationships are defined among classes and resources:
      - “typing”: an individual belongs to a specific class
        - “«A R Rahman» is a composer”
        - to be more precise: “«http://.../#A R Rahman» is a composer”
      - “subclassing”: all instances of one class are also instances of the other (“every novel is a fiction”)
    - RDFS formalizes these notions in RDF
    Classes, resources in RDF(S)
    [Diagram: http://perceptron.net/indianmusic#A R Rahman rdf:type #composer; #composer rdfs:subClassOf #artist]
    - RDFS defines the meaning of these terms
    - A resource may belong to several classes
      - rdf:type is just a property…
      - “«A R Rahman» is a composer and «composer» is an «artist»”
    - The type information may be very important for applications
      - e.g., it may be used for a categorization of possible nodes
  • 26. 24-06-2010 Inferred Properties
    [Diagram: http://perceptron.net/indianmusic#A R Rahman rdf:type #composer; #composer rdfs:subClassOf #artist]
    - (http://perceptron.net/indianmusic#A R Rahman rdf:type #artist) is not in the original RDF data
    - It can be inferred from the RDFS rules
    - RDFS environments return that triple, too
    Classes and Sub-Classes
    - Challenge:
      - We have instance data about SuddaSwaras and VikritSwaras.
      - However, we often want to use Swaras to mean both – for example, to say that some Raaga has 7 Swaras
      - How do we state this?
      <owl:Thing rdf:about="#ga">
        <rdf:type rdf:resource="#SuddaSwara"/>
      </owl:Thing>
      <owl:Thing rdf:about="#ra">
        <rdf:type rdf:resource="#SuddaSwara"/>
      </owl:Thing>
      <owl:Thing rdf:about="#ni">
        <rdf:type rdf:resource="#VikritSwara"/>
      </owl:Thing>
  • 27. 24-06-2010 Classes and Sub classes <rdfs:Class rdf:ID=“SuddaSwara"> <rdfs:subClassOf rdf:resource="#Swara"/> </rdfs:Class>  Class1 rdfs:subClassOf Class2 53  Class1 is a specialization of Class2, membership in Class1 implies membership in Class2, properties of Class2 are inherited by Class1  A class can be subClass of multiple classes Class 1 subClassOf Class 2 Class 1 subClassOf Class 3  Instances are specified using rdf:type © Canopus Consulting Inference – formal rules  The RDF Semantics document has a list of (33) entailment rules:  “if such and such triples are in the graph, add this and this” 54  do that recursively until the graph does not change © Canopus Consulting 27
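The "add triples until the graph does not change" procedure can be sketched as a small fixed-point loop. The version below covers only two of the RDFS entailment rules (subclass transitivity and type propagation) and uses short strings such as `"#Swara"` in place of full URIs; it is a toy illustration, not a complete RDFS reasoner.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(graph):
    """Apply two RDFS entailment rules until no new triple appears."""
    closed = set(graph)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                if p2 != SUBCLASS or o != s2:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))   # subClassOf is transitive
                elif p == RDF_TYPE:
                    new.add((s, RDF_TYPE, o2))   # members of a class belong to its superclasses
        if new <= closed:                        # nothing new: the graph has stopped changing
            return closed
        closed |= new


graph = {
    ("#ga", RDF_TYPE, "#SuddaSwara"),
    ("#ni", RDF_TYPE, "#VikritSwara"),
    ("#SuddaSwara", SUBCLASS, "#Swara"),
    ("#VikritSwara", SUBCLASS, "#Swara"),
}

closure = rdfs_closure(graph)
swaras = sorted(s for s, p, o in closure if p == RDF_TYPE and o == "#Swara")
print(swaras)
```

Neither `#ga` nor `#ni` was asserted to be a `#Swara`; both memberships come out of the subclass statements alone.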
  • 28. 24-06-2010 Properties and Sub Properties  Properties and classes are defined separately from each other  Property is not owned by any class  Range and domain of properties can be specified 55  What type of resources serve as object and subject  Sub Property  Rdfs:subPropertyOf is used to define one property as a sub-property of another  The sub-property inherits the domain and range definitions of the property © Canopus Consulting What does it look like 56 Data Metadata Inferred Assertions © Canopus Consulting 28
  • 29. 24-06-2010 Properties , domains and ranges  It is still rdf:Property not RDFS:Property, however RDFS adds the notion of a domain and range to an rdf:Property <rdf:Property rdf:ID=“directed_by"> <rdfs:domain rdf:resource=“#MusicDirector"/> <rdfs:range rdf:resource="#FilmSongs"/> 57 </rdf:Property>  Domain – what resources does the property apply to  Range – what are the possible values  If there are 2 domain or 2 range statements, it means both must be true  Range can indicate:  Rdf:resource => either a resource or a literal  Rdfs:datatype © Canopus Consulting Inference based on Domains and Ranges  Challenge – data typing based on use  We have statements about songs and who composed them such as  Endaro Mahanubhavulu isComposedBy Thyagaraja  But we do not have a direct statement that says Thyagaraja, or any one else is an instance of a class called Composers. 58  Suppose we have to list all the composers in our model, what do we do?  This is a very common pattern in transformation rules © Canopus Consulting 29
  • 30. 24-06-2010 Inference based on Domains and Ranges
    - Answer:
      - Define the range of isComposedBy to be the class Composer
      - RDFS inference will automatically deduce that anything specified as the object of isComposedBy is an instance of the class Composer
    [Diagram labels: Metadata; Data; Inferred Assertion]
      <owl:ObjectProperty rdf:about="#isComposedBy">
        <rdfs:range rdf:resource="#Composer"/>
      </owl:ObjectProperty>
    Reinforcing the notion of “Schema”
    - Domains and ranges are not used for validation; instead they are used to derive new information from existing information
    - Does this surprise you?
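The range-based inference from the Thyagaraja example can be sketched as follows. The function name and the short `#...` identifiers are illustrative assumptions; only the `rdfs:range` rule itself comes from RDFS.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_types_from_range(graph):
    """Return the rdf:type triples implied by rdfs:range declarations."""
    inferred = set()
    for prop, p, cls in graph:
        if p != RDFS_RANGE:
            continue
        for s, used, o in graph:
            if used == prop:
                inferred.add((o, RDF_TYPE, cls))  # the object of prop is an instance of its range
    return inferred - set(graph)


graph = {
    ("#EndaroMahanubhavulu", "#isComposedBy", "#Thyagaraja"),   # data
    ("#isComposedBy", RDFS_RANGE, "#Composer"),                 # metadata
}

composers = sorted(s for s, p, o in infer_types_from_range(graph)
                   if o == "#Composer")
print(composers)
```

Note that nothing here rejects data: a range declaration only adds type statements, which is why the slide stresses that domains and ranges are not validation constraints.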
  • 31. 24-06-2010 61 WHAT IS OWL © Canopus Consulting Ontologies  RDFS is useful, but does not solve all possible requirements  Complex applications may want more possibilities: 62  characterization of properties  identification of objects with different URIs  Disjoint-ness or equivalence of classes  construct classes, not only name them  can a program reason about some terms? E.g.:  “if «Person» resources «A» and «B» have the same «email» property, then «A» and «B» are identical” © Canopus Consulting 31
  • 32. 24-06-2010 Ontologies (Cont.)  The term ontologies is used in this respect:  “defines the concepts and relationships used to describe and represent an area of knowledge” 63  RDFS can be considered as a simple ontology language  Languages should be a compromise between  rich semantics for meaningful applications  feasibility, implementability © Canopus Consulting OWL - Web Ontology Language  OWL is an extra layer, a bit like RDF Schemas  own namespace, own terms  it relies on RDF Schemas  It is a separate recommendation 64 © Canopus Consulting 32
  • 33. 24-06-2010 OWL Overview
    - OWL is a large set of additional terms
    - For classes:
      - owl:equivalentClass: two classes have the same individuals
        - EXAMPLE – A:MUSIC_DIRECTOR and B:COMPOSER
      - owl:disjointWith: no individuals in common
    - For properties:
      - owl:equivalentProperty
        - EXAMPLE – A:VOCALS_BY and B:SINGER
      - owl:propertyDisjointWith
    - For individuals:
      - owl:sameAs: two URIs refer to the same concept (“individual”)
      - owl:differentFrom: negation of owl:sameAs
    Classes in OWL
    - In RDFS, you can subclass existing classes… that’s all
    - In OWL, you can construct classes from existing ones:
      - enumerate its content
      - through intersection, union, complement
  • 34. 24-06-2010 OWL: Class
    - owl:Class is a subclass of rdfs:Class
      - More expressiveness – restrictions, set operations …
    - owl:Thing
      - Every resource that is an instance of a class is automatically a member of owl:Thing
    - owl:Nothing -> the empty class, most specialized
    - Separation of classes and instances
      - Though the language does not mandate it
    - Class contents can be enumerated
      - The class consists of exactly those individuals
    - Union of classes can be defined
    - Other possibilities: complementOf, intersectionOf
    OWL class definition
      <owl:Class rdf:about="#Composer">
        <rdfs:subClassOf rdf:resource="#Person"/>
      </owl:Class>
      <owl:Class rdf:about="#Composition">
        <rdfs:subClassOf rdf:resource="&owl;Thing"/>
      </owl:Class>
  • 35. 24-06-2010 Equivalence in OWL
    - Equivalent classes
      - If Class A owl:equivalentClass Class B
        - => they share the same members
        - If x belongs to Class A then x also belongs to Class B and vice versa
      - Is equivalent to saying:
        - A rdfs:subClassOf B
        - B rdfs:subClassOf A
      - A (model) design pattern: how to implement equivalence when limited to the RDFS vocabulary
    - Equivalent properties
      - If propertyA owl:equivalentProperty propertyB
        - If propertyA applies between resources X and Y, then propertyB also applies
    OWL - exhaustive
    - The combination of class constructions with various restrictions is extremely powerful
    - What we have so far follows the same logic as before
      - extend the basic RDF and RDFS possibilities with new features
      - define their semantics, i.e., what they “mean” in terms of relationships
      - expect to infer new relationships based on those
    - However, a full inference procedure is hard
      - not implementable with simple rule engines, for example
  • 36. 24-06-2010 Properties
    - owl:ObjectProperty is used to connect a resource to another resource
    - owl:DatatypeProperty is used to connect a resource to an rdfs:Literal (untyped) or an XML Schema built-in data type (typed) value
    - Both can have sub-properties
      <owl:ObjectProperty rdf:about="#isComposedBy">
        <rdfs:range rdf:resource="#Composer"/>
      </owl:ObjectProperty>
      <owl:DatatypeProperty rdf:about="#hasName">
        <rdfs:range rdf:resource="&xsd;string"/>
      </owl:DatatypeProperty>
    More on properties
    - Property can be the inverse of another
      - Has child – has parent
      - Usually seen in “containment” type associations
    - Property can be symmetric: A p B => B p A
      - knows, hasSpouse
      - Usually abstract forms of more specialized properties
      - Often you will find that most symmetric properties are super-properties of others
    - Property can be asymmetric
      - hasMother, greaterThan, lesserThan
    - Property can be transitive
      - isPartOf, contains
      - Generally seen between entities of similar type
    [Diagram labels: Inverse Property; Transitive & Symmetric]
  • 37. 24-06-2010 More on properties
    - Challenge
      - We have the following statements:
        - Raaga A isJanyaRaagaOf Raaga B
        - Raaga C isJanyaRaagaOf Raaga D
        - Raaga E isMelakartaDerivative Raaga A
      - We get additional information that there is something called JanakaRaaga such that if A is JanyaRaaga of B then B is JanakaRaaga of A
      - I want to:
        - Get all JanakaRaaga of A
        - Get all Raagas related to A
    More on properties
    - Answer
      - isJanyaRaagaOf owl:inverseOf isJanakaRaagaOf
      - isJanyaRaagaOf, isJanakaRaagaOf and isMelakartaDerivative are sub-properties of raagaRelations
    - Entailed by OWL itself: inverseOf is the inverse of itself. In other words, inverseOf is symmetric
    - It is not uncommon to find this pattern in model transformations
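The answer above can be checked with a small rule loop covering just `owl:inverseOf` and `rdfs:subPropertyOf`. This is a sketch under assumptions: the property spellings (`#isJanyaRaagaOf`, `#isJanakaRaagaOf`, `#raagaRelations`) and the raaga identifiers `#A` to `#E` are placeholders, and a real OWL reasoner applies many more rules.

```python
INVERSE = "owl:inverseOf"
SUBPROP = "rdfs:subPropertyOf"
REL = "#raagaRelations"


def entail(graph):
    """Apply inverseOf and subPropertyOf rules until the graph stops growing."""
    closed = set(graph)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                if p2 == INVERSE and p == s2:
                    new.add((o, o2, s))      # (s p o) and p inverseOf q gives (o q s)
                if p2 == INVERSE and p == o2:
                    new.add((o, s2, s))      # inverseOf is symmetric, so it also works backwards
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))      # a statement also holds for the super-property
        if new <= closed:
            return closed
        closed |= new


graph = {
    ("#A", "#isJanyaRaagaOf", "#B"),
    ("#C", "#isJanyaRaagaOf", "#D"),
    ("#E", "#isMelakartaDerivative", "#A"),
    ("#isJanyaRaagaOf", INVERSE, "#isJanakaRaagaOf"),
    ("#isJanyaRaagaOf", SUBPROP, REL),
    ("#isJanakaRaagaOf", SUBPROP, REL),
    ("#isMelakartaDerivative", SUBPROP, REL),
}

closure = entail(graph)
janaka_of_a = sorted(s for s, p, o in closure
                     if p == "#isJanakaRaagaOf" and o == "#A")
related_to_a = sorted({o for s, p, o in closure if p == REL and s == "#A"}
                      | {s for s, p, o in closure if p == REL and o == "#A"})
print(janaka_of_a, related_to_a)
```

The first list answers "all JanakaRaaga of A" through the inverse property; the second answers "all Raagas related to A" through the shared super-property.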
  • 38. 24-06-2010 Qualifying property membership  Film songs are those songs that have appeared in films <owl:Class rdf:ID=“FilmSongs"> 75 <rdfs:subClassOf rdf:resource="#Songs"/> <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="#appearedIn"/> <owl:someValuesFrom rdf:resource="#Films"/> </owl:Restriction> </rdfs:subClassOf> </owl:Class> © Canopus Consulting Qualifying property membership  owl:allValuesFrom  If the property is used, then all values for the property must belong to the class  owl:someValuesFrom 76  If the property is used, at least one of the values must belong to the class  owl:hasValue  Of all the values a class has for a particular property, at least one must be this specific value © Canopus Consulting 38
  • 39. 24-06-2010 owl:cardinality
    - Likewise owl:cardinality and owl:maxCardinality
      <owl:Class rdf:ID="FilmSongs">
        <rdfs:subClassOf rdf:resource="#Songs"/>
        <rdfs:subClassOf>
          <owl:Restriction>
            <owl:onProperty rdf:resource="#appearedIn"/>
            <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
          </owl:Restriction>
        </rdfs:subClassOf>
      </owl:Class>
    SPARQL Protocol And RDF Query Language: QUERYING RDF
  • 40. 24-06-2010 Querying RDF
    - A collection of RDF statements is a graph
      - No root - no start, no finish
    - What is returned by a query is a graph
      - Just as a relational query forms a new table by combining existing tables, an RDF query returns a new graph by combining information from graphs from multiple sources
    Queries and their responses
    - Querying an RDF store such as the one we have just built will return exactly what is asserted
    - Example:
      - Raaga:Sreeragam rdf:type Raaga:Ghana
      - Raaga:Ghana rdfs:subClassOf Raaga:CarnaticRaaga
    - A query such as ?x rdf:type Raaga:CarnaticRaaga will return nothing, because no triple asserts that type directly!
  • 41. 24-06-2010 Retrieval models from RDF
    [Diagram: three levels of an RDF model and the query style that works at each level]
    - Semantics: meaning of data & relationships; Semantic Query (e.g. SPARQL)
    - Structure: internally represented as a graph of triples connected to each other; Graph Traversal (e.g. Squish)
    - Syntax: serialized using formats such as RDF/XML or Turtle; Syntactic Query (e.g. XPath/XQuery)
    Syntactic Level Query
      <rdf:RDF ...>
        <Raaga rdf:about="#satmel">
          <rdf:type rdf:resource="&owl;Thing"/>
          <hasSwara rdf:resource="#ni"/>
          <hasSwara rdf:resource="#nu"/>
          <hasSwara rdf:resource="#pa"/>
          <hasSwara rdf:resource="#ra"/>
          <hasSwara rdf:resource="#sa"/>
        </Raaga>
        <owl:Thing rdf:about="#shuddhatodi">
          <rdf:type rdf:resource="#Raaga"/>
          <isMelakartaDerivative rdf:resource="#hanumatodi"/>
        </owl:Thing>
        ...
      </rdf:RDF>
    - Select a raaga (XPath/XQuery)
      - /RDF/Raaga[@rdf:about="#satmel"]
      - /RDF/Thing[@rdf:about="#shuddhatodi"]
      - for $r in doc("music.owl")/rdf:RDF/Raaga where $r/@rdf:about = '#satmel' return $r/hasSwara
    - Limitations of this approach
      - Can go out of hand very quickly
      - Does not understand the semantics of RDFS & OWL
      - Tied to the serialized structure of RDF (the same graph can be written in many ways, as the two raagas above show)
  • 42. 24-06-2010 Structural Query
      Subject                      Predicate             Object
      perceptron:satmel            rdf:type              perceptron:Raaga
      perceptron:shuddhatodi       rdf:type              perceptron:MelakarthaRaaga
      perceptron:satmel            perceptron:hasSwara   perceptron:sa
      perceptron:MelakarthaRaaga   rdfs:subClassOf       perceptron:Raaga
      …
    - Possible queries
      - Select * from Triples where Object = 'perceptron:Raaga' and Predicate = 'rdf:type'
      - Select Predicate from Triples where Subject = 'perceptron:satmel'
    - Limitations of this approach
      - Interprets any RDF model as just a set of triples
      - Does not understand the semantics of RDFS & OWL
      - E.g. looking for all raagas will miss shuddhatodi: it is asserted to be a MelakarthaRaaga, and the fact that MelakarthaRaaga is a subclass of Raaga lives only in the semantics of the last triple
    Querying at the Semantic Level
    - Need a new language that can understand the semantics of RDFS & OWL
    - Sample query
      - SELECT ?x FROM <perceptron.net> WHERE { ?x rdf:type perceptron:Raaga }
      - All MelakarthaRaagas will also be returned even though they are not explicitly asserted to be a Raaga
    - Can draw inferences from the rules of RDFS and OWL
      - Either by computing and storing the closure of a given model
      - Or by inferring new statements on the fly as needed by the query
    - Not tied to how the data is stored or serialized
    - Special purpose query languages designed to facilitate this
      - RQL, SPARQL, TQL
      - SPARQL has emerged as the industry standard
  • 43. 24-06-2010 Introducing SPARQL  Designed to query collections of triples…  …and to easily traverse relationships  SQL-like syntax (SELECT, WHERE)  Matches graph patterns 85 © Canopus Consulting SPARQL – Key Characteristics  Graph pattern matching ability  Capability to restrict matches on a queried graph by providing a graph pattern  Which consists of one or more RDF triple patterns, to be satisfied in a query  Variable binding results  Returns zero or more bindings of variables.  Each set of bindings is one way that the query can be satisfied by the queried graph 86  Sub-graph results  It must be possible for query results to be returned as a sub-graph of the original graph  Result limits  Possible to specify an upper bound on the number of query results returned  Streaming results  Possible for the client to request that results be streamed  WSDL support  The protocol – including its interfaces, their operations, results, and types are described using WSDL © Canopus Consulting 43
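Graph pattern matching with variable bindings can be illustrated with a toy matcher. This is not how a SPARQL engine is implemented, and it assumes every term is a plain string with variables spelled `?name`; it only shows how each triple pattern narrows the set of binding solutions.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple, or return None."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                        # a variable
            if extended.setdefault(term, value) != value:
                return None                             # already bound to something else
        elif term != value:                             # a constant that does not match
            return None
    return extended


def select(graph, patterns):
    """Return every set of bindings that satisfies all triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


graph = [
    ("perceptron:satmel", "rdf:type", "perceptron:Raaga"),
    ("perceptron:satmel", "perceptron:hasSwara", "perceptron:sa"),
    ("perceptron:satmel", "perceptron:hasSwara", "perceptron:pa"),
    ("perceptron:shuddhatodi", "rdf:type", "perceptron:MelakarthaRaaga"),
    ("perceptron:shuddhatodi", "perceptron:hasSwara", "perceptron:sa"),
]

solutions = select(graph, [
    ("?raaga", "rdf:type", "perceptron:Raaga"),
    ("?raaga", "perceptron:hasSwara", "?swara"),
])
for row in solutions:
    print(row)
```

Each dictionary is one way the pattern can be satisfied by the graph. Because the toy matcher does no inference, `perceptron:shuddhatodi` is absent from the results; a semantic query layer would add it after applying the subclass rules.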
  • 44. 24-06-2010 Sample Model Fragment
    Let us take a closer look
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX perceptron: <http://www.perceptronnetwork.com/ontologies/2010/...>
      SELECT ?raaga
      WHERE { ?raaga rdf:type perceptron:Raaga }
    - PREFIX
      - Defines an alias for the namespace; the namespace URI is written in angle brackets
    - SELECT
      - Select query
    - FROM
      - Optional clause, specifies the URI of the model
    - Variables
      - Marked by either ? or $
    - WHERE
      - Usually the most significant part of a SPARQL query
      - Each triple is one condition (a filter on the graph), expressed using Turtle syntax
  • 45. 24-06-2010 SPARQL Select
    - Challenge
      - Number of Raagas is huge, paginate the results
      PREFIX perceptron: <someURI>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      SELECT DISTINCT ?s
      FROM <rmi://localhost/server1#perceptron>
      WHERE { ?s rdf:type perceptron:Raaga }
      ORDER BY DESC(?s)
      LIMIT 2
      OFFSET 2
    - DISTINCT – remove duplicate results
    - ORDER BY – sort the results, can be ASC or DESC; it must come before LIMIT and OFFSET
    - LIMIT n – limit the returned values to n rows
    - OFFSET n – skip the first n rows
    - LIMIT and OFFSET together can be used to paginate the results
    Example of a Complex Query
    - Challenge
      - Search for the top 10 songs (by popularity) in my favorite Raaga
      (PREFIX declarations omitted)
      SELECT ?rendition
      FROM <rmi://localhost/server1#perceptron>
      FROM NAMED <my-preferences.rdf>
      FROM NAMED <popularity.rdf>
      WHERE {
        GRAPH <my-preferences.rdf> {
          ?me dc:name "My Name" .
          ?me prefs:favouriteRaaga ?fav_raga .
        } .
        ?composition perceptron:hasRaaga ?fav_raga .
        ?rendition perceptron:hasComposition ?composition .
        GRAPH <popularity.rdf> { ?rendition pop:popularity ?popularity }
      }
      ORDER BY DESC(?popularity)
      LIMIT 10
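The effect of DISTINCT, ORDER BY, OFFSET and LIMIT on a result set can be mimicked on a plain Python list. The helper name `paginate` and the sample raaga names are illustrative only; the point is the order in which the modifiers apply.

```python
def paginate(rows, limit, offset, descending=False):
    """Mimic SELECT DISTINCT ... ORDER BY ... OFFSET ... LIMIT on a list."""
    ordered = sorted(set(rows), reverse=descending)   # DISTINCT, then ORDER BY
    return ordered[offset:offset + limit]             # OFFSET skips rows, LIMIT caps the page


raagas = ["Mohanam", "Kalyani", "Hindolam", "Mohanam",
          "Sreeragam", "Hanumatodi", "Kalyani"]

page = paginate(raagas, limit=2, offset=2, descending=True)
print(page)
```

Sorting has to happen before slicing; otherwise successive pages would not line up, which is why a stable ORDER BY usually accompanies LIMIT and OFFSET when paginating.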
  • 46. 24-06-2010 How this Query Would Work My-favourites.rdf perceptron.owl “My Name” dc:name Mohanam ?fav_raga Person:XXX prefs:favouriteRaaga perceptron:hasRaaga 91 ?me Chandracharita ?composition pop:popularity 1 perceptron:isRenditionOf 1 1 ?popularity rend- chandracharita- popularity.rdf bmk ?rendition © Canopus Consulting Additional SPARQL constructs  INSERT  INSERT DATA { triples block }  DELETE  DELETE DATA { triples block }  OPTIONAL  Used to define an optional condition 92  UNION  For alternate queries  FILTER  Value based filters  ASK  Returns true or false depending upon the condition  DESCRIBE  Returns the node graph for the given condition © Canopus Consulting 46
  • 47. 24-06-2010 Technologies & Tools 93 SEMANTIC WEB INFRASTRUCTURE © Canopus Consulting Objective  How do we build an application?  How do we build the ontology?  What are the key architecture components?  What are the tools & technologies to use? How do I choose which technology to use? 94  © Canopus Consulting 47
  • 48. 24-06-2010 Semantic Web Application Lifecycle  Information Modelling  Build Ontology (model level representation)  Information Assimilation  Populate Knowledgebase from various sources 95  Including current applications  Automatic Semantic Annotation of existing data  Any type of document, multiple sources of documents  Information Retrieval  Applications: search, integrate/portal, summarize/ explain, analyse, decisions support  Reasoning techniques: graph analysis, inferencing © Canopus Consulting Semantic Web Application Lifecycle Ontology Editors: Protégé, TopBraid Composer 96 Build Information Model Semantic Query Server Refine/Evolve Information Model Create Assimilation Models & Aggregate knowledge RDF Stores: Mulgara, Sesame Technologies: GRDDL, RDFizers, Programming: Jena OWLs, Automatic Annotation Retrieve and Use Semantic Data © Canopus Consulting 48
  • 49. 24-06-2010 Information Modeling  Information Model Consists of  Description Component – Schema  Designed by domain experts, community  Description Base – Assertions, Extensions  Automated agents who assimilate the information 97  The model is an evolutionary process  Start with a concept map or a taxonomy  Leading up to a formal ontology  It evolves over lifetime of the application  Tools & Technologies  Popular modeling tools – Protégé, TopBraid Composer, GrOWL  Concept Mapping – CMAP Tools, CMAP Tools COE  http://www.xml.com/2002/11/06/Ontology_Editor_Survey.html © Canopus Consulting RDF Stores, Triple Stores 98 PERSISTING RDF DATA © Canopus Consulting 49
  • 50. 24-06-2010 Storage Models for RDF Hand-crafted SQL,  SQL Storage ORM  Direct SQL access to RDF data  Queries become complex as discussed earlier  Can’t completely deal with the semantics of data 99  SQL Bridge Tools like – SDB, Squirrel RDF  Access via SPARQL interface  Adaptor transforms SPARQL query into SQL  Data stored in relational schema  Native RDF Native non-relational DB – such as  Access via SPARQL interface Mulgara  Data stored in native RDF databases © Canopus Consulting Storage Models for RDF SQL Storage SQL Bridge Native RDF Application Application Application 10 0 SPARQL SPARQL SQL Queries Queries Queries Adaptor Relational Relational Relational Native RDF Schema Triples Triples Graph © Canopus Consulting 50
  • 51. 24-06-2010 Storage Models for RDF SQL Storage SQL Bridge Native RDF Application Application Application Traditional RDF approach Databases 10 1 SPARQL SPARQL SQL Queries Queries Queries Adaptor Relational Relational Relational Native RDF Schema Triples Triples Graph © Canopus Consulting Storing RDF Data – Relational Model  Relational Schema  Data model consisting of well defined tables, columns and their meaning  To provide flexibility, we would have to use techniques that dynamically update the schema  Still makes each row alike – whereas the fundamental premise of RDF is that each row is potentially unique 10 2 © Canopus Consulting 51
  • 52. 24-06-2010 Storing RDF Data – Triple forms  Storing RDF predicates in a relational database  Use the attributes as “extended properties”  Simple triple forms (subject – predicate – object)  Predicate tables (one table per predicate)  Gives the requisite flexibility but makes indexing & retrieval difficult 10 3 © Canopus Consulting Storing RDF Data –RDF Stores  RDF Stores  Provide a mechanism to store RDF data  Provide a mechanism to query it using languages such as SPARQL  Internally may or may not be based on relational databases  Also known as Graph Databases or Schema-less Databases 10  Although not all Graph databases support RDF 4 Jena SDB, TDB Oracle 11g RDF Database Mulgara Semantic Store OpenLink Virtuoso AllegroGraph OpenRDF Sesame © Canopus Consulting 52
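The "simple triple form" in a relational database can be tried out with Python's built-in `sqlite3` module. The table and column names, the short predicate names and the sample rows are assumptions for this sketch; it is meant to show why retrieval gets awkward, not to recommend a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.execute("CREATE INDEX idx_predicate_object ON triples (predicate, object)")

conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("song1", "songName", "Nanu palimpa"),
        ("song1", "raaga", "Mohanam"),
        ("song1", "sungBy", "Dr Balamuralikrishna"),
        ("song2", "songName", "Varuga varugave"),
        ("song2", "raaga", "Mohanam"),
        ("song2", "sungBy", "M S Subbulakshmi"),
        ("song3", "songName", "Kallalo Kannulalo"),
        ("song3", "raaga", "Kalyani"),
        ("song3", "sungBy", "Leela"),
    ],
)

# Every additional triple pattern costs one more self-join on the same table.
rows = conn.execute(
    """
    SELECT name.object, singer.object
    FROM triples AS raaga
    JOIN triples AS name
      ON name.subject = raaga.subject AND name.predicate = 'songName'
    JOIN triples AS singer
      ON singer.subject = raaga.subject AND singer.predicate = 'sungBy'
    WHERE raaga.predicate = 'raaga' AND raaga.object = 'Mohanam'
    ORDER BY name.object
    """
).fetchall()
print(rows)
```

The schema never changes when a new predicate appears, which gives the flexibility mentioned above, but a three-pattern question already needs two self-joins, and the database still knows nothing about RDFS or OWL semantics.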
  • 53. 24-06-2010 Programming APIs, Technology Stack 10 5 PROGRAMMING FOR THE SEMANTIC WEB © Canopus Consulting Let us Revisit SPARQL  SPARQL  SPARQL Protocol and RDF Query Language  It is both a Protocol and a Query Language  Protocol Definition 10 6  Defines a mechanism of invoking a query over the web  Equivalent to a web service definition  Defines one operation  Query - with the input, output and fault definitions  Defines the protocol bindings  HTTP (get/post)  SOAP (web service)  An implementation may provide ANY of the above two © Canopus Consulting 53
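For the HTTP GET binding, the query travels as a URL-encoded `query` parameter, optionally with a `default-graph-uri` parameter. The sketch below only builds the request URL and sends nothing; the endpoint address and graph URI are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://localhost:8080/sparql"            # hypothetical endpoint address
GRAPH = "http://example.org/graphs/music"            # hypothetical default graph
QUERY = "SELECT ?raaga WHERE { ?raaga a <http://example.org/Raaga> }"


def build_query_url(endpoint, query, default_graph_uri=None):
    """Build the GET URL for a SPARQL protocol query operation."""
    params = [("query", query)]
    if default_graph_uri:
        params.append(("default-graph-uri", default_graph_uri))
    return endpoint + "?" + urlencode(params)        # urlencode escapes spaces, braces, '?', '<', '>'


url = build_query_url(ENDPOINT, QUERY, GRAPH)
print(url)
```

A client would then issue a GET on this URL, typically with an `Accept` header naming the result format it wants back.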
  • 54. 24-06-2010 Architecture Stack of Semantic Technologies Application HTTP SOAP Programming API Semantic Middleware 10 e.g. Semantic SOA SPARQL Processor Inference Engine 7 RDF-SQL Adaptor Relational RDF Store Store Semantic Technology Stack © Canopus Consulting RDF Programming Stack - Jena  The most popular stack for Java is Jena (http://jena.sourceforge.net)  Jena also has an in-memory graph manipulation API  Jena APIs for RDF, RDFS, OWL is one of the most popular APIs  It uses a Graph based model and pluggable architecture  It allows plugging various storages, reasoners etc. to the API 10  Many RDF stores either have or are building support for Jena API 8 Jena uses a Graph model as the core All reasoners also assert inferences into a Graph Thus the output of a reasoner can be fed into another layer It is a common pattern with Jena to do such layering © Canopus Consulting 54
  • 55. 24-06-2010 Components of Jena Stack Joseki HTTP SOAP Jena API Programming API ARQ 10 SPARQL Processor Inference Engine 9 RDF-SQL Adaptor TDB SDB (Abstraction) Relational RDF Store Store Semantic Technology Stack © Canopus Consulting Connecting to RDF Database – RDF Programming APIs  RDF programming APIs are available in almost all major languages  C, C++, C# and .Net, Haskell, Java, Javascript, Lisp,  Obj-C, PHP, Perl, Prolog, Python, Ruby, Tcl/Tk  There is no standard client API (yet!) - equivalent of JDBC  Each RDF package has its own set of APIs 11  Most often they follow similar paradigms of Connections, Factories, Sessions 0  The query syntax however is standardized  Thus queries are portable across RDF stores  At the same time, many packages have their own “dialects” of RDF query languages  Mulgara is one of the most popular open source RDF databases  Has its own API to connect to the RDF Database  In addition to SPARQL, has its own dialect (called iTQL) © Canopus Consulting 55
  • 56. 24-06-2010 Connecting to Mulgara Database
    Sample Java code to connect to a Mulgara database using the Mulgara Client API:
      URI SERVER_URI = URI.create("rmi://servername/instancename");
      URI GRAPH_URI = URI.create("rmi://servername/instancename#model");
      String queryStr = "SELECT ?x WHERE { ?x <p:isDerivedFrom> <p:Kalyaani> }";
      // Declared outside the try block so that the finally block can see it
      Connection connection = null;
      try {
          // Create a new connection from the factory using the server URI
          ConnectionFactory factory = new ConnectionFactory();
          connection = factory.newConnection(SERVER_URI);
          // Initialize the SPARQL interpreter with the graph URI
          SparqlInterpreter interpreter = new SparqlInterpreter();
          interpreter.setDefaultGraphUri(GRAPH_URI);
          // Parse and execute the query on the connection
          Query query = interpreter.parseQuery(queryStr);
          Answer a = connection.execute(query);
          // Use the results
          RdfXmlEmitter.writeRdfXml((GraphAnswer) a, System.out);
      }
      catch (Exception e) {
          e.printStackTrace();
      }
      finally {
          // Release the connection only if it was actually opened
          if (connection != null) {
              connection.dispose();
          }
      }
    Adopting Existing SQL Databases (Source: Tim Berners-Lee)
  • 57. 24-06-2010 Semantic Web Technologies 11 3 Source: W3C © Canopus Consulting 11 4 APPENDIX © Canopus Consulting 57
  • 58. 24-06-2010 RDF Triples
    - Resources can use any URI, e.g.:
      - http://www.example.org/file.xml#element(home)
      - http://www.example.org/file.html#home
      - http://www.example.org/file2.xml#xpath1(//q[@a=b])
    - URI-s can also denote non Web entities:
      - http://www.ivan-herman.net/me is me
      - not my home page, not my publication list, but me
    - RDF triples form a directed, labelled graph
    RDF Primitives
    - Resources
      - Anything that can be uniquely identified
      - Using a URI
      - E.g. http://www.perceptron.net/indianmusic#A R Rahman (namespace: http://www.perceptron.net/indianmusic; fragment ID: A R Rahman)
    - Properties
      - Are resources in themselves
    - Literals
      - Strings
      - With optional data types
    - RDF is a general model for triples
  • 59. 24-06-2010 OWL Full  No constraints on any of the constructs  owl:Class is just syntactic sugar for rdfs:Class  owl:Thing is equivalent to rdfs:Resource  this means that: 11 7  Class can also be an individual, a URI can denote a property as well as a Class  e.g., it is possible to talk about class of classes, apply properties on them  Extension of RDFS in all respects  But: no system may exist that infers everything one might expect © Canopus Consulting OWL Full usage  Nevertheless OWL Full is essential  it gives a generic framework to express many things  some application just need to express and interchange terms 11 8  Applications may control what terms are used and how  in fact, they may define their own sub-language via, eg, a vocabulary  thereby ensuring a manageable inference procedure © Canopus Consulting 59