Wissenstechnologie WS 08/09



                        Michael Granitzer
           IWM TU Graz & Know-Center


           Lecture 6: Triple Stores, SPARQL,
                   Semantic Retrieval
 http://kmi.tugraz.at                             http://www.know-center.at
 This work is licensed under the Creative Commons Attribution 2.0 Austria License.
 To view a copy of this license, visit http://creativecommons.org/licenses/by/2.0/at/.
Today



           Ontology Modelling & SW Frameworks


           Triple Stores
           • Basic RDBMS scheme
           • Property tables & vertical Partitioning
           • Performance Comparisons

           SPARQL
           • Definition
           • Simple & Complex Queries
           • Some examples on Endpoints

           Information Retrieval vs Semantic Retrieval
           • Basics of IR
           • „Semantic“ Retrieval
           • Practical Examples (Freebase, Cugil etc.)

WS 08/09          Wissenstechnologie @ kmi.tugraz.at
Ontology Modelling
vs. OOP

   Similar to design in Object Oriented Programming
      Classes, objects and members
      Capture the operational properties

              public interface Course {
                  public void enroll();
              }

   Ontology Modelling: Capture the structural properties

           owl:Student --owl:participates--> owl:Course

Ontology Modelling
vs. RDBMS

   Similar to designing a database system
        Higher expressiveness in OWL: agreement on the domain, not
        only referential integrity
        Not focused on special indexing structures or on querying only
        Ontologies should be application independent
        Consistency checks
   Semantic Integration via Ontologies

   [Figure: data sources to be integrated: Product Database, File System, Employee Database, Text Database, ...]

Ontology Modelling
Goals

 Goals
   Share common understanding among people or software
   Enable reuse of knowledge
   Make domain assumptions explicit
   Separate domain knowledge from operational knowledge
   Analyze domain knowledge
 Main Application Areas
   Semantic harmonization of heterogeneous data sources
   Structuring the content of a portal
   Enhance search and retrieval
   …

Ontology Modelling
Aspects to model

   Defining classes in the ontology


   Arranging classes in a taxonomy


   Defining slots/properties for classes and their values


   Define logical constraints on classes/properties


   Assign instances


Ontology Modelling
Three simple rules

 1. “There is no one correct way to model a domain - there
    are always viable alternatives. The best solution almost
    always depends on the application that you have in
    mind and the extensions that you anticipate.”
 2. “Concepts in the ontology should be close to objects
    (physical or logical) and relationships in your domain of
    interest. These are most likely to be nouns (objects) or
    verbs (relationships) in sentences that describe your
    domain.”
 3. “Ontology development is necessarily an iterative
    process.”

Ontology Modelling
Noy and McGuinness's 7 Steps

 1. Determine the domain and scope of the ontology
 2. Consider reusing existing ontologies
 3. Enumerate important terms in the ontology
 4. Define the classes and the class hierarchy
 5. Define the properties (slots) of classes
 6. Define the facets of the slots
 7. Create/Import instances



Semantic Web Frameworks
Motivation

   Protege as modelling GUI
   For „Semantic Web Applications“ we also want to
      Automatically import/map instances
      Manage large number of triples
      Combine different schemas
      Query for specific triples
      Harmonize different metadata schemas
   Database requirements for graphs
   Reasoning
Semantic Web Frameworks
Overview

   Three „major“ Java-based Open Source frameworks
       Jena
       Sesame
       Protege Java API
   Functionality
       Java API for managing OWL, RDF and RDFS (optional DAML+OIL)
       Import/Export of different formats
       Persistence via own data store, different database and file system
       backends
       Querying, graph manipulation and restricted reasoning capabilities
       Web API
Semantic Web Frameworks
Jena Architecture

   [Figure: Jena architecture diagram; recoverable labels: SPARQL, RDF/XML]

   Source: Jena: Implementing the Semantic Web Recommendations, 2003
   http://www.hpl.hp.com/techreports/2003/HPL-2003-146.html

Semantic Web Frameworks
Main Differences

   Jena
      Reference implementation
      Not directly focused on web access and scalability
   Protege
      Modelling GUI
   Sesame
      Focused on remote access and scalability
      Flexible layer architecture for different storage backends
   Others: Virtuoso, 3Store, Kowari, OpenAnzo

Triple Stores
Overview

   Basic data model is RDF (i.e. OWL, RDFS)
   RDF forms a directed graph
   How do we manage large graphs?
      In memory: adjacency matrix
      On secondary storage

           – Special indices
           – Use relational database management systems
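
As an illustration of the in-memory option, here is a minimal sketch that keeps RDF triples in a nested dictionary, which is effectively a sparse, edge-labelled adjacency structure (a dense matrix would be almost entirely empty for typical RDF graphs). The sample triples follow the book example used in this lecture; the function names build_adjacency and neighbours are illustrative and not taken from any framework.

```python
from collections import defaultdict

# Sample triples modelled on the book example from this lecture.
TRIPLES = [
    ("http://book.at/isbn123", "author", "http://fussball.de/G. Müller"),
    ("http://book.at/isbn123", "price", "€15"),
    ("http://book.at/isbn123", "title", "Ein Leben für die Tore"),
    ("http://fussball.de/G. Müller", "name", "Gerd Müller"),
]


def build_adjacency(triples):
    """Return {subject: {predicate: set(objects)}} for a list of triples."""
    adjacency = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        adjacency[s][p].add(o)
    return adjacency


def neighbours(adjacency, subject):
    """All (predicate, object) edges leaving `subject`, sorted for stable output."""
    edges = adjacency.get(subject, {})
    return sorted((p, o) for p, objects in edges.items() for o in objects)


graph = build_adjacency(TRIPLES)
print(neighbours(graph, "http://fussball.de/G. Müller"))
```

Lookups by subject are cheap in this layout; lookups by object or predicate would need additional reverse maps, which is the motivation for the index permutations discussed later.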
Triple Stores
„Normalized“ Table Model of RDF

Subject                                        Predicate              Object

http://book.at/isbn123                         author                 http://fussball.de/G. Müller

http://book.at/isbn123                         price                  €15

http://book.at/isbn123                         Title                  Ein Leben für die Tore


http://fussball.de/G. Müller                   Name                   Gerd Müller




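[Figure: the same four triples drawn as a directed graph. http://book.at/isbn123 has the edges author (to http://fussball.de/G. Müller), price (to €15) and title (to Ein Leben für die Tore); http://fussball.de/G. Müller has the edge name (to Gerd Müller).]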

Triple Stores
Query in an unoptimized RDBMS

   Query: Titles of books from the person with name Gerd Müller?

       SELECT r3.o AS Title
       FROM rdf r1, rdf r2, rdf r3
       WHERE r1.s = r2.o AND
             r2.s = r3.s AND
             r1.o = 'Gerd Müller' AND
             r1.p = 'Name' AND
             r2.p = 'author' AND
             r3.p = 'Title'

   Subject (s)                    Predicate (p)   Object (o)

   http://book.at/isbn123         author          http://fussball.de/G. Müller
   http://book.at/isbn123         price           €15
   http://book.at/isbn123         Title           Ein Leben für die Tore
   http://fussball.de/G. Müller   Name            Gerd Müller
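
The three-way self join can be tried out directly. The following sketch assumes Python's built-in sqlite3 module as a stand-in for "an unoptimized RDBMS"; the table name rdf and the columns s, p, o follow the slide, everything else is illustrative.

```python
import sqlite3

# One generic triple table, as in the normalized table model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rdf (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO rdf VALUES (?, ?, ?)",
    [
        ("http://book.at/isbn123", "author", "http://fussball.de/G. Müller"),
        ("http://book.at/isbn123", "price", "€15"),
        ("http://book.at/isbn123", "Title", "Ein Leben für die Tore"),
        ("http://fussball.de/G. Müller", "Name", "Gerd Müller"),
    ],
)

# r1 finds the person by name, r2 finds books by that person,
# r3 fetches the title of each such book.
QUERY = """
SELECT r3.o AS Title
FROM rdf r1, rdf r2, rdf r3
WHERE r1.s = r2.o AND
      r2.s = r3.s AND
      r1.o = 'Gerd Müller' AND
      r1.p = 'Name' AND
      r2.p = 'author' AND
      r3.p = 'Title'
"""

titles = [row[0] for row in conn.execute(QUERY)]
print(titles)
```

Each additional triple pattern in the question adds one more alias of the same table to the join, which is why even simple graph queries become large self joins in this model.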

Triple Stores
The Sesame Mapping as example

See Hak Soo Kim, Hyun Seok Cha, Jungsun Kim, Jin Hyun Son, Development of the Efficient OWL Document Management System for the Embedded Applications, Springer 2005, http://www.springerlink.com/content/8mfxeh0glq5xj00m/

Triple Stores
Indexing Techniques

   Use specialised indices for graphs
      Bitmap indices in Virtuoso
      http://virtuoso.openlinksw.com/wiki/main/Main/VOSBitmapIndexing

   Index different combinations of the S,P,O Table
      P,S,O
      O,P,S
      O,S,P
      S,O,P
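
To make the idea of indexing several orderings of the S,P,O table concrete, here is a small sketch that stores each ordering as a sorted list of permuted tuples and answers prefix lookups with binary search. It illustrates the principle only and is not how any particular triple store implements its indices; build_index and prefix_lookup are hypothetical names.

```python
from bisect import bisect_left

TRIPLES = [
    ("http://book.at/isbn123", "author", "http://fussball.de/G. Müller"),
    ("http://book.at/isbn123", "price", "€15"),
    ("http://book.at/isbn123", "title", "Ein Leben für die Tore"),
    ("http://fussball.de/G. Müller", "name", "Gerd Müller"),
]

# Column positions (0 = S, 1 = P, 2 = O) for each ordering listed above.
ORDERS = {"PSO": (1, 0, 2), "OPS": (2, 1, 0), "OSP": (2, 0, 1), "SOP": (0, 2, 1)}


def build_index(triples, order):
    """Sorted list of triples with their columns permuted into `order`."""
    return sorted(tuple(triple[i] for i in order) for triple in triples)


def prefix_lookup(index, prefix):
    """Return all entries of a sorted index that start with the tuple `prefix`."""
    start = bisect_left(index, prefix)  # first entry >= prefix
    matches = []
    for entry in index[start:]:
        if entry[: len(prefix)] != prefix:
            break
        matches.append(entry)
    return matches


indexes = {name: build_index(TRIPLES, order) for name, order in ORDERS.items()}

# "Which triples use the predicate author?" is a prefix scan on the PSO index.
print(prefix_lookup(indexes["PSO"], ("author",)))
# "Who has the name Gerd Müller?" is a prefix scan on the OPS index.
print(prefix_lookup(indexes["OPS"], ("Gerd Müller", "name")))
```

The cost is storage: every additional ordering is another full copy of the triple table.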




Triple Stores
A first Analysis

   Normalised view on a graph: one large table
   Generic and flexible, but
      Large self joins for rather simple queries. RDBMS are
      usually not optimized for this
      Large memory overhead in query processing due to
      self joins
      Requires lots of index lookups and/or full table scans
      Large storage overhead
   In general: flexibility vs. performance
   How to improve?

Triple Stores
Further improvements

   Property tables: flattened representation by finding
   sets of properties which are used together
   Subject-Property Matrix Materialized Join Views
   (SPMJVs) from Oracle
   Chong, E. I., Das, S., Eadon, G., and Srinivasan, J. 2005. An efficient SQL-based
   RDF querying scheme. In Proceedings of the 31st International Conference on
   Very Large Data Bases, ACM

   Abadi, D. J., Marcus, A., Madden, S. R., and Hollenbach, K. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd International Conference on Very Large Data Bases (Vienna, Austria, September 23 - 27, 2007). VLDB Endowment, 411-422.

Triple Stores
Property Tables

   ++: Faster querying within a property table due to fewer
   subject-subject self joins
   --: Requires intelligent selection of the properties in the table
      More property columns lead to more null values in the table
      and therefore to larger space overhead
      Fewer property columns lead to more property tables and
      more joins over smaller property tables
   --: Multi-valued properties are hard to manage (e.g. a book has
   several authors)

    Subject     Title                Author            Year
    ID1         “Intro to RDF”       Granitzer         2006
    ID1         “Intro to RDF”       Tochtermann       2006
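
The two drawbacks (null values and multi-valued properties) show up as soon as triples are pivoted into a property table. The sketch below is only an illustration: property_table is a hypothetical helper, and the subject ID2 is invented sample data added to produce a row with missing values.

```python
from itertools import product

TRIPLES = [
    ("ID1", "title", "Intro to RDF"),
    ("ID1", "author", "Granitzer"),
    ("ID1", "author", "Tochtermann"),
    ("ID1", "year", "2006"),
    ("ID2", "title", "Untitled draft"),  # invented: has neither author nor year
]

COLUMNS = ("title", "author", "year")


def property_table(triples, columns):
    """Pivot triples into rows (subject, col1, col2, ...) for the given properties."""
    values = {}
    for s, p, o in triples:
        if p in columns:
            values.setdefault(s, {}).setdefault(p, []).append(o)

    rows = []
    for s in sorted(values):
        # A missing property becomes a single None (NULL) cell.
        per_column = [values[s].get(column, [None]) for column in columns]
        # A multi-valued property forces one row per value combination.
        for combination in product(*per_column):
            rows.append((s, *combination))
    return rows


for row in property_table(TRIPLES, COLUMNS):
    print(row)
```

ID1 is duplicated once per author, and ID2 carries two NULL cells; both effects grow with the number of columns chosen for the table.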

Triple Stores
Vertical Partitioning

   Partition database according to properties: one table per
   property
   Abadi, D. J., Marcus, A., Madden, S. R., and Hollenbach, K. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the
   33rd International Conference on Very Large Data Bases (Vienna, Austria, September 23 - 27, 2007). VLDB Endowment, 411-422.

   Tables are sorted by subject, which allows fast (sort-)merge joins
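
A minimal sketch of the idea, assuming plain Python lists as stand-ins for the per-property tables: each property gets its own two-column (subject, object) table sorted by subject, and two such tables are joined in a single linear pass. The names partition and merge_join and the sample subjects ID2 and ID3 are illustrative and not taken from the cited paper.

```python
from collections import defaultdict

TRIPLES = [
    ("ID1", "title", "Intro to RDF"),
    ("ID1", "author", "Granitzer"),
    ("ID1", "author", "Tochtermann"),
    ("ID2", "title", "Untitled draft"),  # invented: no author
    ("ID3", "author", "Nobody"),         # invented: no title
]


def partition(triples):
    """One (subject, object) table per property, each sorted by subject."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return {p: sorted(rows) for p, rows in tables.items()}


def merge_join(left, right):
    """Join two subject-sorted tables on subject without any index lookups."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        subject_left, subject_right = left[i][0], right[j][0]
        if subject_left < subject_right:
            i += 1
        elif subject_left > subject_right:
            j += 1
        else:
            # Find the run of rows with this subject on both sides.
            i_end = i
            while i_end < len(left) and left[i_end][0] == subject_left:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == subject_left:
                j_end += 1
            for _, left_object in left[i:i_end]:
                for _, right_object in right[j:j_end]:
                    result.append((subject_left, left_object, right_object))
            i, j = i_end, j_end
    return result


tables = partition(TRIPLES)
print(merge_join(tables["title"], tables["author"]))
```

Multi-valued properties need no special handling here: the second author is simply a second row in the author table.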




Triple Stores
Vertical Partitioning

   ++: Use of simple, fast merge joins
   ++: Multi-valued attributes are supported
   ++: No a-priori clustering decision is necessary
   ++: Smaller tables. Only those properties accessed have to be
   read from disk
   --: Inserts may be slower due to access to multiple tables
   --: Queries over multiple properties span multiple tables




Triple Store
Performance of Open Source Solutions

   Portwin & Parvatikar (2006) Scaling Jena in a Commercial Environment:
   The Ingenta MetaStore Project
   LEHIGH dataset (university domain)
      ~200 million triples, 11 million OWL statements, 4.3 million
      documents
      Kowari: 1 billion triples, load 20k triples/s for Wikipedia data set

   Unoptimized
      Simple queries take milliseconds
      With inference, queries take several seconds to minutes
      depending on the complexity
      Optimization for inference: for RDFS entailment, expand
      the graph by making implicit edges explicit
      (more storage but faster access)
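
Making implicit edges explicit can be sketched as a small forward-chaining loop. The example below covers only two RDFS entailment rules (transitivity of rdfs:subClassOf and type inheritance along rdfs:subClassOf), uses invented ex: sample data, and is an illustration rather than the procedure of any particular store.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Invented sample data.
TRIPLES = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:anna", TYPE, "ex:Student"),
}


def materialize(triples):
    """Add inferred triples until nothing new can be derived (a fixpoint)."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_edges = [(s, o) for s, p, o in closure if p == SUBCLASS]
        derived = set()
        for sub, sup in subclass_edges:
            for s, p, o in closure:
                # sub subClassOf sup, sup subClassOf o  =>  sub subClassOf o
                if p == SUBCLASS and s == sup:
                    derived.add((sub, SUBCLASS, o))
                # s type sub, sub subClassOf sup  =>  s type sup
                if p == TYPE and o == sub:
                    derived.add((s, TYPE, sup))
        if not derived <= closure:
            closure |= derived
            changed = True
    return closure


inferred = materialize(TRIPLES)
print(len(TRIPLES), "stored triples,", len(inferred), "after materialization")
```

After materialization, a query for all instances of ex:Agent is a plain lookup with no reasoning at query time, at the price of storing the additional triples.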

Triple Store
Performance of Oracle

    BioMed literature database (UniProt data set)
    80 million triples
    ~5 GB RDF/XML data (~2.5 GB triples; 1.7 GB mapping;
    4.8 GB indices)
    Queries take milliseconds to seconds
    Subject-property matrix materialized views provide
    optimization potential of roughly 30%
 Chong, E. I., Das, S., Eadon, G., and Srinivasan, J. 2005. An efficient SQL-based RDF querying scheme. In
    Proceedings of the 31st International Conference on Very Large Data Bases, ACM




Triple Store
Performance Summary

   http://esw.w3.org/topic/LargeTripleStores
   Problem: the available performance numbers are hard to compare
   Trade-off: generic vs. performance
   Optimization potential is available
   Currently not as fast as specialised RDBMS, but more flexible




SPARQL
SPARQL Protocol and RDF Query Language

   Different languages similar to SQL in RDBMS
       SerQL, RDQL, SPARQL
       SPARQL is a W3C Recommendation (since January 2008)
   But what does querying a graph mean?
   Basically
       Specify a sub-graph with variable nodes
       Find all patterns in the graph matching the sub-graph

   [Figure: query graph in which node ?x has an edge "author" to “Gerd Müller” and an edge "title" to node ?y]

       SELECT ?x ?y
       WHERE { ?x <author> "Gerd Müller" .
               ?x <title> ?y . }
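
To illustrate what "matching a sub-graph with variable nodes" means operationally, here is a naive sketch of basic graph pattern matching in plain Python. It is not how a SPARQL engine is implemented; the data (including book2) and the helper names match_pattern and evaluate are invented for illustration, and the author is stored as a plain literal, as in the query above.

```python
TRIPLES = [
    ("book1", "author", "Gerd Müller"),
    ("book1", "title", "Ein Leben für die Tore"),
    ("book2", "author", "Someone Else"),  # invented
    ("book2", "title", "Another Book"),   # invented
]

# The query graph: terms starting with "?" are variables.
PATTERNS = [
    ("?x", "author", "Gerd Müller"),
    ("?x", "title", "?y"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(patterns, triples):
    """Return every variable binding under which all patterns match."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


solutions = evaluate(PATTERNS, TRIPLES)
print(solutions)
```

The shared variable ?x is what ties the two patterns together: it plays the role of the join condition in the SQL formulation.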

SPARQL
Example

   Data:
    <http://example.org/book/book1>
    <http://purl.org/dc/elements/1.1/title>
    "SPARQL Tutorial" .

   Query:
    SELECT ?title
    WHERE {
    <http://example.org/book/book1>
    <http://purl.org/dc/elements/1.1/title>
    ?title .
    }

   Result:
    title
    "SPARQL Tutorial"



Example




   Data:
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    _:a    foaf:name  "Johnny Lee Outlaw" .
    _:a    foaf:mbox  <mailto:jlow@example.com> .
    _:b    foaf:name  "Peter Goodguy" .
    _:b    foaf:mbox  <mailto:peter@example.org> .
   Query:
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?mbox
    WHERE {
    ?x foaf:name ?name .
    ?x foaf:mbox ?mbox
    }
   Result:
    name                   mbox
    "Johnny Lee Outlaw"    <mailto:jlow@example.com>
    "Peter Goodguy"        <mailto:peter@example.org>



Simple Query Elements




   Declare namespaces: PREFIX
   Determine the return format
       SELECT: table output, similar to SQL results
       CONSTRUCT: constructs an RDF graph as the return value
       ASK: returns only true/false, depending on whether a result exists
       DESCRIBE: returns an RDF graph describing the matched resources
       (their properties and related resources). Used for browsing.
   Specify the selection criteria with the WHERE clause
       Specify a non-recursive sub-pattern with triples and variables
       (prefixed with ? or $)
       Perform grouping and filter operations
   Modifiers: ORDER BY, LIMIT, OFFSET, DISTINCT
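To make the difference between the return formats concrete, the following sketch applies SELECT, ASK and CONSTRUCT to one and the same list of pattern solutions. It is a simplification under stated assumptions: the function names, the solution list (taken from the FOAF example) and the property `ex:hasName` are illustrative only, and DESCRIBE is left out because its output is implementation-defined.

```python
# Solutions of the WHERE clause, one dict of variable bindings per match.
solutions = [
    {"?x": "_:b", "?name": "Peter Goodguy", "?mbox": "mailto:peter@example.org"},
    {"?x": "_:a", "?name": "Johnny Lee Outlaw", "?mbox": "mailto:jlow@example.com"},
]

def select(solutions, variables, order_by=None, offset=0, limit=None):
    """SELECT: project the bindings onto `variables` as table rows.

    ORDER BY is applied first, then OFFSET and LIMIT cut out a slice.
    """
    if order_by is not None:
        solutions = sorted(solutions, key=lambda s: s[order_by])
    end = None if limit is None else offset + limit
    return [tuple(s[v] for v in variables) for s in solutions[offset:end]]

def ask(solutions):
    """ASK: true if at least one solution exists."""
    return len(solutions) > 0

def construct(solutions, template):
    """CONSTRUCT: instantiate the triple template once per solution."""
    return {
        tuple(s.get(term, term) for term in triple)
        for s in solutions
        for triple in template
    }

table = select(solutions, ["?name", "?mbox"], order_by="?name", limit=1)
exists = ask(solutions)
new_graph = construct(solutions, [("?x", "ex:hasName", "?name")])  # ex:hasName is hypothetical
print(table, exists, new_graph)
```

The WHERE clause is evaluated once; the query form only decides how the resulting bindings are turned into output.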
Blank Nodes




   The ID of a blank node is unique only within one query result; it
   indicates the existence of a node, not its absolute identity
   Blank nodes are identified by an automatically generated label
   (e.g. _:a) rather than a global URI
   Consider the results of a query

    Subject   Value             Subject   Value             Subject   Value
    _:a       "zum"         ≡   _:x       "zum"         ≠   _:z       "zum"
    _:b       "Beispiel"        _:y       "Beispiel"        _:z       "Beispiel"

   Blank nodes may be renamed and are structural elements only
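The equivalence of the three result tables can be checked mechanically by relabelling blank nodes in order of first appearance. The sketch below is a simplification: it assumes both tables list their rows in the same order, and the function names are made up for this example.

```python
def canonical(rows):
    """Relabel blank nodes (_:...) by order of first appearance."""
    mapping = {}
    relabelled = []
    for subject, value in rows:
        if subject.startswith("_:"):
            subject = mapping.setdefault(subject, f"_:b{len(mapping)}")
        relabelled.append((subject, value))
    return relabelled

def same_up_to_renaming(left, right):
    """True if the tables differ only in the names of their blank nodes."""
    return canonical(left) == canonical(right)

first = [("_:a", "zum"), ("_:b", "Beispiel")]
second = [("_:x", "zum"), ("_:y", "Beispiel")]
third = [("_:z", "zum"), ("_:z", "Beispiel")]

print(same_up_to_renaming(first, second))
print(same_up_to_renaming(first, third))
```

The third table is different because it claims that both values belong to one and the same node, which is structural information and survives any renaming.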




Complex Queries




   Combination of groups of simple graph patterns in the WHERE
   clause
   OPTIONAL clause: the sub-graph pattern need not match
   Example: querying book titles from Springer
      If an author exists, it is listed; if not, the title is returned
      without an author

       SELECT ?title ?author
       WHERE
       { ?buch ex:publishedFrom <http://springer.com/Verlag> .
         ?buch ex:Title ?title .
         OPTIONAL { ?buch ex:Autor ?author }
       }
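OPTIONAL behaves like a left outer join over variable bindings. The following sketch illustrates that reading; the function names and the sample bindings ("Title 1", "Author A") are assumptions for this example and do not come from a real data set.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(b[k] == v for k, v in a.items() if k in b)

def left_join(required, optional):
    """OPTIONAL: extend each required solution where possible, else keep it."""
    result = []
    for r in required:
        extensions = [{**r, **o} for o in optional if compatible(r, o)]
        result.extend(extensions or [r])
    return result

# Solutions of the mandatory part (publisher and title).
books = [
    {"?buch": "b1", "?title": "Title 1"},
    {"?buch": "b2", "?title": "Title 2"},
]
# Solutions of the OPTIONAL part (author); b2 has no author.
authors = [{"?buch": "b1", "?author": "Author A"}]

rows = left_join(books, authors)
print(rows)
```

The second book stays in the result with ?author left unbound, which is exactly the behaviour described on the slide.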



Complex Queries




   Specifying alternative sub-graph patterns: UNION
   Logical OR, or union of two separate queries

   SELECT ?title ?author
   WHERE
   { ?buch ex:publishedFrom <http://springer.com/Verlag> .
     ?buch ex:Title ?title .
     { ?buch ex:Autor ?author . } UNION
     { ?buch ex:Creator ?author . }
   }

   “Select all books with a title published by Springer which have an
   author or a creator assigned”
   Note: ?author in the different groups are independent of each other
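A UNION can be read as concatenating the solutions of both alternatives and then joining them with the rest of the pattern. The sketch below uses hypothetical bindings and helper names; it also shows that a book reachable through both ex:Autor and ex:Creator appears twice, since UNION does not remove duplicates.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(b[k] == v for k, v in a.items() if k in b)

def join(left, right):
    """Combine every pair of compatible solutions."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]

def union(left, right):
    """UNION: keep every solution of either alternative, duplicates included."""
    return list(left) + list(right)

books = [
    {"?buch": "b1", "?title": "Title 1"},
    {"?buch": "b2", "?title": "Title 2"},
    {"?buch": "b3", "?title": "Title 3"},  # neither author nor creator
]
via_autor = [{"?buch": "b1", "?author": "Author A"}]
via_creator = [
    {"?buch": "b2", "?author": "Author B"},
    {"?buch": "b1", "?author": "Author A"},
]

rows = join(books, union(via_autor, via_creator))
titles = [row["?title"] for row in rows]
print(titles)
```

Adding DISTINCT to the query would collapse the two identical rows for the first book.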


Complex Queries




   Considering special datatypes: FILTER and XML Schema datatypes
   Specify the datatype of a literal
    SELECT ?title ?author
    WHERE
    { ?buch ex:publishedFrom <http://springer.com/Verlag> .
      ?buch ex:Title ?title .
      ?buch ex:publishedIn "1998"^^xsd:integer .
    }
   FILTER specifies boolean expressions for filtering results
   E.g. restrict the value range using FILTER
   (see Chapter 7 in Semantic Web Grundlagen)

   SELECT ?title ?author
   WHERE
   { ?buch ex:publishedFrom <http://springer.com/Verlag> .
     ?buch ex:publishedIn ?year .
     FILTER(?year > 2000)
   }
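A FILTER keeps only those solutions for which the expression evaluates to true; solutions where the expression raises an error (for example an unbound variable or a value of the wrong type) are dropped as well. The sketch below mimics this with Python exceptions. The function name `sparql_filter` and the sample bindings are assumptions for this illustration.

```python
def sparql_filter(solutions, condition):
    """Keep solutions where `condition` is true; errors eliminate the row."""
    kept = []
    for solution in solutions:
        try:
            if condition(solution):
                kept.append(solution)
        except (KeyError, TypeError):
            # Unbound variable or incomparable datatype: treated like false.
            pass
    return kept

solutions = [
    {"?buch": "b1", "?year": 1998},
    {"?buch": "b2", "?year": 2005},
    {"?buch": "b3", "?year": "unknown"},  # not an integer literal
    {"?buch": "b4"},                      # ?year is unbound
]

recent = sparql_filter(solutions, lambda s: s["?year"] > 2000)
print(recent)
```

This is also why the datatype of a literal matters: a year stored as a plain string would not pass a numeric comparison.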
Real World Examples in DBpedia




 PREFIX p: <http://dbpedia.org/property/>
 PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
 SELECT * WHERE {
   ?album p:artist ?band.
   ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
   OPTIONAL {?album p:cover ?cover}.
   OPTIONAL {?album p:name ?name}.
   OPTIONAL {?album p:released ?dateofrelease}.
 } ORDER BY DESC(?name) LIMIT 20 OFFSET 19

 PREFIX p: <http://dbpedia.org/property/>
 PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
 CONSTRUCT {
   ?album p:itIsDone ?dateofrelease .
   ?band p:isBand "true" .
 } WHERE {
   ?album p:artist ?band.
   ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
   OPTIONAL {?album p:cover ?cover}.
   OPTIONAL {?album p:name ?name}.
   OPTIONAL {?album p:released ?dateofrelease}.
 } ORDER BY ?name LIMIT 20 OFFSET 19

 PREFIX p: <http://dbpedia.org/property/>
 PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
 DESCRIBE ?album
 WHERE {
   ?album p:artist <http://dbpedia.org/resource/The_Allman_Brothers_Band>.
   ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
 }
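Queries like these are sent to an endpoint via the SPARQL protocol, where the query text travels as the `query` parameter of an HTTP request. The sketch below only builds such a request URL and does not contact any server. The endpoint address is an assumption (the public DBpedia endpoint is commonly reached at this URL), and the shortened query and the function name are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint address

QUERY = """PREFIX p: <http://dbpedia.org/property/>
SELECT ?album WHERE {
  ?album p:artist <http://dbpedia.org/resource/The_Allman_Brothers_Band> .
} LIMIT 20"""

def build_request_url(endpoint, query):
    """Encode the query text as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})

url = build_request_url(ENDPOINT, QUERY)

# Decoding the URL again yields the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

The URL could then be fetched with any HTTP client; the result format is negotiated through the Accept header of the request.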
Summary




   Similar to SQL
   Allows easier expression of joins without knowing the
   underlying database schema
   Can return not only tables, but also more complex
   output formats such as graphs
   The datatype of a variable is not always clear


   http://www.w3.org/TR/rdf-sparql-query/
   http://thefigtrees.net/lee/sw/sparql-faq
   Hitzler, Krötzsch, Rudolph, Sure, Semantic Web: Grundlagen,
   Chapter 7
Semantic vs. Information Retrieval
Overview




   Central question: What is semantic retrieval?
   Define information retrieval
   Where is semantics missing?
   How can we use Semantic Web technologies to add
   semantics?




Definition of IR




   Salton (1968): “Information retrieval is a field concerned with the
   structure, analysis, organization, storage, searching, and retrieval of
   information.”

   “Information retrieval (IR) is finding material (usually documents) of an
   unstructured nature (usually text) that satisfies an information need
   from within large collections (usually stored on computers).”
    Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval,
   Cambridge University Press, 2008.

   Main focus of IR is how to deal with uncertainty and incomplete
   information
         Representation of documents is ambiguous
         Query formulation is ambiguous and usually incomplete
   “Unstructured” information
   Usually the perfect answer, insofar as one exists, cannot be
   delivered
IR vs. Data Retrieval, from van Rijsbergen 1979




                          Data retrieval       Information retrieval

  Matching                Exact match          Partial (best) match
  Inference               Deduction            Induction
  Model                   Deterministic        Probabilistic
  Classification          Monothetic           Polythetic
  Query language          Artificial           Natural
  Query specification     Complete             Incomplete
  Items wanted            Matching             Relevant
  Error response          Sensitive            Insensitive

   “What is the gross domestic product of Austria?”
   SELECT GDP FROM GDP_table WHERE country_name = 'Austria'
      € 270.8 bn
   However,
      not all information is available in databases
      Queries are hard to formulate for average users as well
      as for non-domain experts
      The more complex the domain and the information need,
      the harder it is to formulate a correct query
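The data-retrieval side of the comparison can be tried out with an in-memory SQLite database. This is a small sketch: the table layout follows the query on the slide, the second row is invented filler data, and the GDP figure is the one quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GDP_table (country_name TEXT PRIMARY KEY, GDP REAL)")
conn.executemany(
    "INSERT INTO GDP_table VALUES (?, ?)",
    [("Austria", 270.8), ("Exampleland", 1.0)],  # Exampleland is made up
)

sql = "SELECT GDP FROM GDP_table WHERE country_name = ?"

# Exact match: the complete, correctly spelled key returns the one answer.
exact = conn.execute(sql, ("Austria",)).fetchone()

# Error-sensitive: a slightly different spelling returns nothing at all.
misspelled = conn.execute(sql, ("austria",)).fetchone()

print(exact, misspelled)
conn.close()
```

The misspelled query illustrates the "Error response: Sensitive" row of the comparison: there is no best match, only a match or none.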




Basic Retrieval Workflow




   [Figure: retrieval workflow with the components Documents D, Document
    Representation Dr, Information Need IN, Query Q, Retrieval Model M and
    Ranking Function R]
   See also Baeza-Yates & Ribeiro-Neto (1999), “Modern Information Retrieval”
The Vector Space Model




   Document Representation Dr: Documents are represented as a
   bag of words (i.e. a multiset of words; word order is ignored)
   Query Q: Query is a set of keywords
   Retrieval Model M:
      Sets of words are converted to vectors d and q
      Different heuristics are used to calculate the importance of a word
   Ranking Function R:
      Cosine similarity: calculate the angle between d and q

                                    d1 := “Boy plays chess”
                                    d2 := “Boy plays bridge”
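The two example documents can be ranked against a query with a few lines of Python. This sketch assumes raw term counts as word weights (real systems typically use weighting heuristics such as TF-IDF), and the query "chess" as well as the function names are chosen for illustration.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag of words: map each lower-cased word to its count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

d1 = vectorize("Boy plays chess")
d2 = vectorize("Boy plays bridge")
q = vectorize("chess")

score_d1 = cosine(d1, q)
score_d2 = cosine(d2, q)
print(score_d1, score_d2, cosine(d1, d2))
```

d1 shares one of its three words with the query and is ranked first; d2 shares none and scores zero, although both documents are very similar to each other.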
An analysis of the vector space model




   Query and documents are represented in terms of their words
   Importance of words depends on their occurrence
   Syntactic matching between documents and queries
      No synonyms are considered (e.g. Money == Cash)
      No homonyms are considered (e.g. Apache Web Server)
      No meronyms are considered (e.g. a tire is part of a car)
   No relationships between terms are considered
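Both the synonym and the homonym problem show up as soon as matching is reduced to shared words. A tiny sketch, with invented document texts and a hypothetical helper `shared_terms`:

```python
def shared_terms(document, query):
    """Purely syntactic matching: the words both texts have in common."""
    return set(document.lower().split()) & set(query.lower().split())

# Synonyms: a document about money is not found by the query "cash".
synonym_miss = shared_terms("Money transfer rules", "cash")

# Homonyms: the query "apache" matches both meanings equally well.
documents = ["Apache web server setup", "Apache helicopter history"]
homonym_hits = [doc for doc in documents if shared_terms(doc, "apache")]

print(synonym_miss, homonym_hits)
```

In the first case a relevant document is missed, in the second an irrelevant one is returned; neither can be fixed without knowledge about the terms.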




So where can we include semantics?




   Increase the semantics of the document representation Dr and the query
   Q
       Add metadata (e.g. tags, Dublin Core etc.)
       Use more sophisticated preprocessing (e.g. language models, word
       sense disambiguation)
       Allow users to express information needs in more detail or estimate
       the context of a user (e.g. specify metadata, profiling)
       Formal representation of Dr and Q using Semantic Web languages
       like OWL
       see Tran, Bloehdorn, Cimiano, Haase (2007), “Expressive Resource Description for Ontology-Based
       Information Retrieval”

   However, even with a perfect formal representation we still need to
   transform natural language queries into this model for the average user
   Requires a special user interface, which is not possible for the generic case
   Natural language understanding is currently unsolved
So where can we include semantics?




   Document representation

       C:myDocument.doc  --ex:containsConcept-->  http://en.wikipedia.org/wiki/ApacheWebServer
       http://en.wikipedia.org/wiki/ApacheWebServer  --rdf:type-->  ex:Concept
       http://en.wikipedia.org/wiki/ApacheWebServer  --ex:term-->
           “Apache”, “Apache Server”, “Apache Web Server”

   Query
   SELECT * WHERE { ?x ex:containsConcept
   <http://en.wikipedia.org/wiki/ApacheWebServer> }
So where can we include semantics?




         Iterative refinement of the information need:
         Keyword Query: “Apache”

         http://en.wikipedia.org/wiki/ApacheWebServer
         http://en.wikipedia.org/wiki/ApacheHelicopter
         http://en.wikipedia.org/wiki/AmericanNatives
         SELECT * WHERE { ?x ex:containsConcept <http://en.wikipedia.org/wiki/ApacheWebServer> }

         C:myDocument.doc   --ex:containsConcept-->  http://en.wikipedia.org/wiki/ApacheWebServer
         C:myDocument2.doc  --ex:containsConcept-->  http://en.wikipedia.org/wiki/ApacheTribes
         http://en.wikipedia.org/wiki/ApacheWebServer  --ex:term-->
             “Apache”, “Apache Server”, “Apache Web Server”
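The refinement step can be sketched as two lookups over the triples shown above: first from the keyword to the candidate concepts, then from the chosen concept to the documents. The triples mirror the figure, except that the term “Apache” for the ApacheTribes concept is an assumption added so that the keyword is ambiguous; the function names are illustrative.

```python
WIKI = "http://en.wikipedia.org/wiki/"

TRIPLES = {
    ("C:myDocument.doc", "ex:containsConcept", WIKI + "ApacheWebServer"),
    ("C:myDocument2.doc", "ex:containsConcept", WIKI + "ApacheTribes"),
    (WIKI + "ApacheWebServer", "ex:term", "Apache"),
    (WIKI + "ApacheWebServer", "ex:term", "Apache Server"),
    (WIKI + "ApacheWebServer", "ex:term", "Apache Web Server"),
    (WIKI + "ApacheTribes", "ex:term", "Apache"),  # assumed, not in the figure
}

def candidate_concepts(keyword):
    """Step 1: all concepts that list the keyword as one of their terms."""
    return sorted(
        s for s, p, o in TRIPLES
        if p == "ex:term" and o.lower() == keyword.lower()
    )

def documents_for(concept):
    """Step 2: all documents annotated with the chosen concept."""
    return sorted(
        s for s, p, o in TRIPLES
        if p == "ex:containsConcept" and o == concept
    )

candidates = candidate_concepts("Apache")  # shown to the user for selection
chosen = WIKI + "ApacheWebServer"          # the user's choice
documents = documents_for(chosen)
print(candidates, documents)
```

The second lookup corresponds to the SPARQL query above; the first lookup is what turns an ambiguous keyword into a precise concept.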

So where can we include semantics?




   Increase the capability of the retrieval model M and ranking
   function R
      Latent Semantic Indexing/Concept Indexing
      Automatically determine the concepts contained in a
      document set
      Include a-priori knowledge (e.g. thesaurus, WordNet)
      Learn ranking functions based on a user's feedback (e.g. via
      machine learning)
      Use formal knowledge in the form of ontologies and reasoning
      capabilities




So where can we include semantics?




           Example
                Vector Space Model

                       - D={Apache=0.8, http=0.5, server=0.3}
                       - Q={Jetty=0.8, java=0.7, web=0.4}
                       - Ranking Value=0.0

                Introduce a “better” retrieval model by using a domain
                ontology:

                       ex:ApacheServer  --ex:isA-->   ex:WebServer
                       ex:JettyServer   --ex:isA-->   ex:WebServer
                       ex:ApacheServer  --ex:term-->  “Apache Web Server”, “Apache”
                       ex:JettyServer   --ex:term-->  “Jetty Server”, “Jetty”

                “Apache” and “Jetty” can be related to each other using the
                domain ontology
                       Ranking Value > 0
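One simple way to obtain a ranking value above zero is to add the shared parent concept as an extra dimension to both vectors. The sketch below is one possible scheme, not the method of the lecture: the damping factor `PARENT_WEIGHT` and the expansion rule are assumptions, while the vectors and the ontology follow the example above.

```python
import math

TERM_TO_CONCEPT = {"apache": "ex:ApacheServer", "jetty": "ex:JettyServer"}
IS_A = {"ex:ApacheServer": "ex:WebServer", "ex:JettyServer": "ex:WebServer"}
PARENT_WEIGHT = 0.5  # assumed damping factor for inherited concepts

def cosine(a, b):
    """Cosine of the angle between two sparse weighted vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def expand(vector):
    """Add the parent concept of every known term as an extra dimension."""
    expanded = dict(vector)
    for term, weight in vector.items():
        concept = TERM_TO_CONCEPT.get(term)
        if concept in IS_A:
            parent = IS_A[concept]
            expanded[parent] = max(expanded.get(parent, 0.0), PARENT_WEIGHT * weight)
    return expanded

d = {"apache": 0.8, "http": 0.5, "server": 0.3}
q = {"jetty": 0.8, "java": 0.7, "web": 0.4}

plain = cosine(d, q)                     # no shared words
semantic = cosine(expand(d), expand(q))  # both share ex:WebServer
print(plain, semantic)
```

The document still ranks lower than one that mentions Jetty directly would, but it is no longer invisible to the query.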

So where can we include semantics?




   Improve the presentation of results
      Clustering of search results
      Display different facets of the result set
      Different representation of results
      Display facts instead of documents
   Support users in refining and defining their information need
   Search as an iterative approach
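
A minimal sketch of the facet idea, assuming each search hit carries a few metadata fields. The result records and the field names `type` and `language` are hypothetical and only serve the illustration. Counting values per field gives the facets to display, and selecting one value narrows the result set, which is one step of the iterative search mentioned above.

```python
from collections import Counter

# Hypothetical search results; the metadata fields are illustrative only.
results = [
    {"title": "Apache HTTP Server setup", "type": "tutorial", "language": "en"},
    {"title": "Jetty embedding guide", "type": "tutorial", "language": "en"},
    {"title": "Web server comparison", "type": "article", "language": "de"},
]


def facet_counts(hits, field):
    """Count how many hits fall under each value of one metadata field."""
    return Counter(hit[field] for hit in hits if field in hit)


def refine(hits, field, value):
    """Keep only the hits whose field has the chosen facet value."""
    return [hit for hit in hits if hit.get(field) == value]


type_facet = facet_counts(results, "type")
tutorials = refine(results, "type", "tutorial")
print(dict(type_facet))
print([hit["title"] for hit in tutorials])
```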




Dimensions of Semantics




   Semantically structured vs. unstructured document and query
   representation
      Semantic expressiveness increases with increased structure
      Queries are hard to formalize. Support for the average user
      is required
      Labour intensive creation of the document representation
   Extension of the retrieval model
      Runtime complexity of reasoning in case of semantically
      structured Dr and Q
      Scalability (also an issue for more complex statistical
      methods)

Dimensions of Semantics
   Semantic enhancement of result presentation
      „Low hanging fruit“
      Does not require a formalized knowledge base
   Formalized knowledge vs. statistical approaches
      Bottom Up vs. Top Down
      Statistical approaches can provide sophisticated retrieval models,
      which do not require formalized, modelled knowledge
      Statistical approaches depend on the fact that all required
      information is within the data set and can be extracted
      Statistical models are usually not shareable between systems
      Knowledge base may not model all facts contained in the data set
      (e.g. WordNet knows nothing about the Apache Web Server)

Example Freebase, www.freebase.com
   Open database of the world's information
   Contribution by the community
   Linked with other free resources like Wikipedia
   Web API
   Own Query Language: Metaweb Query Language
   Regarding semantic search
      Structural document representation
      Keyword queries in combination with intelligent interfaces
      for information need refinement
      Fact based representation
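
To give an impression of the query language, the sketch below builds a Metaweb Query Language query in Python. MQL queries are JSON templates that work by example: known values are filled in, and empty values such as `[]` or `null` mark what the service should return. The artist and the property names follow a commonly cited MQL example and are used here as an assumption. The sketch only builds and serializes the query, it does not call any service.

```python
import json

# Query by example: "type" and "name" constrain the match, while the
# empty list asks for all values of the "album" property.
mql_query = [{
    "type": "/music/artist",
    "name": "The Police",
    "album": [],
}]

# Serialized form, as it would be sent to a query service.
serialized = json.dumps(mql_query, indent=2)
print(serialized)
```

The answer to such a query is a set of facts (here: album names) rather than a ranked list of documents, which is what "fact based representation" refers to.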

Example Cuil, www.cuil.com
   Internet search engine with 120 billion pages
   Not based on popularity of sites, just on content and topics of
   content
   Document representation and query are unstructured
   Extended retrieval model in the background based on
   categorical knowledge and statistical methods (Clustering)
   http://www.news.com.au/technology/story/0,25642,24089734-5014239,00.html
   Enhanced interface
Example Yahoo! Search Monkey
   Use structured data to improve presentation of search results
   http://developer.yahoo.com/searchmonkey/#




Summary


    Triple Store
        Genericity/Flexibility vs. Performance/Space
        Usually RDBMS with a large triple table
        Alternative: special graph indexing structures
    SPARQL
        Simple query for RDF by providing a graph pattern
        Recommended by W3C
    Semantic Search
        Document & query representation
        Retrieval model & Interfaces
        Semantic vs. Statistical
        It is the next step, but not such a big one
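
The first two summary points can be illustrated together. The sketch below is a minimal example using Python's built-in `sqlite3` module, not the schema of any particular triple store. It keeps all statements in one three-column table and answers a two-pattern graph query with a self-join. The table name `triples`, the column names and the `ex:` data are assumptions made for the illustration.

```python
import sqlite3

# One generic table holds every statement as (subject, predicate, object).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:ApacheServer", "ex:isA", "ex:WebServer"),
        ("ex:JettyServer", "ex:isA", "ex:WebServer"),
        ("ex:ApacheServer", "ex:term", "Apache"),
        ("ex:JettyServer", "ex:term", "Jetty"),
    ],
)

# Graph pattern:  ?x ex:isA ex:WebServer .  ?x ex:term ?label
# Each triple pattern becomes one alias of the triple table, and the
# shared variable ?x becomes the join condition.
rows = conn.execute(
    """
    SELECT t1.s, t2.o
    FROM triples AS t1
    JOIN triples AS t2 ON t1.s = t2.s
    WHERE t1.p = 'ex:isA' AND t1.o = 'ex:WebServer'
      AND t2.p = 'ex:term'
    ORDER BY t2.o
    """
).fetchall()
print(rows)
```

The schema never changes when new properties appear, which is the flexibility side of the trade-off. The cost is one additional self-join per triple pattern, which is where performance suffers on large tables.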
Next Week


    Guest Lectures (attendance mandatory) from experts
       Werner Klieber: Semantic Web Services (30 min)
       Aatif Latif: Open Linked Data (30 min)
       Fleur Jeanquartier: Bringing the Semantic Web closer
       to the User (30 min)




That's it for today…


Thanks for your attention


Questions/comments?


mgranitzer@tugraz.at




License


   This work is licensed under the Creative Commons
   Attribution 2.0 Austria License.
   To view a copy of this license, visit
   http://creativecommons.org/licenses/by/2.0/at/.


   Contributors:
      Mathias Lux
      Peter Scheir
      Klaus Tochtermann
      Michael Granitzer
Wissenstechnologie Vi 08 09

  • 1. Wissenstechnologie WS 08/09 Michael Granitzer IWM TU Graz & Know-Center Know Center Lecture 6: T iple Sto es Sparql, Lect e 6 Triple Stores, Spa ql Semantic Retrieval http://kmi.tugraz.at http://kmi tugraz at http://www.know-center.at http://www know center at This work is licensed under the Creative Commons Attribution 2.0 Austria License. To view a copy of this license, visit http://creativecommons.org/licenses/by/2.0/at/.
  • 2. Today Ontology Modelling & SW Frameworks Triple Stores • Basic RDBMS scheme • Property tables & vertical Partitioning • Performance Comparisons SPARQL • Definition • Simplex & Complex Queries • Some examples on Endpoints vs. Information Retrieval vs Semantic Retrieval • Basics of IR • „Semantic“ Retrieval • Practical Examples (Freebase, Cugil etc.) 2 http://kmi.tugraz.at WS 08/09 Wissenstechnologie @ kmi.tugraz.at
  • 3. Ontology Modelling & SW Frameworks Ontology Modelling Triple Stores SPARQL vs. vs OOP Information Retrieval vs. Semantic Retrieval Similar to design in Object Oriented Programming Classes, objects and members Capture th operational properties C t the ti l ti public interface Course { public void enroll() bli id ll() } Ontology Modelling: Capture the structural properties owl:participates owl:Course owl:Student 3 http://kmi.tugraz.at WS 08/09 Wissenstechnologie @ kmi.tugraz.at
  • 4. Ontology Modelling & SW Frameworks Ontology Modelling Triple Stores SPARQL vs. vs RDBMS Information Retrieval vs. Semantic Retrieval Similar in designing a database system Higher expressiveness in OWL Aggrement on the domain not only referential integrity Not focused on special indexing structures or on querying only Ontologies should be application independent Consistency checks Semantic Integration via Ontologies 4 Product File System Employee Text ... Database Database Database http://kmi.tugraz.at WS 08/09 Wissenstechnologie @ kmi.tugraz.at
  • 5. Ontology Modelling & SW Frameworks Ontology Modelling Triple Stores SPARQL Goals Information Retrieval vs. Semantic Retrieval Goals Share common understanding among people or software Enable reuse of knowledge Make domain assumptions explicit Separate domain knowledge from operational knowledge Analyze domain knowledge Main Application Areas Semantic harmonization of heterogeneous data sources Structuring the content of a portal Enhance search and retrieval 5 … http://kmi.tugraz.at WS 08/09 Wissenstechnologie @ kmi.tugraz.at
  • 6. Ontology Modelling & SW Frameworks Ontology Modelling Triple Stores SPARQL Aspects to model Information Retrieval vs. Semantic Retrieval Defining classes in the ontology Arranging classes in a taxonomy g g y Defining slots/properties for classes and their values Define logical constraints on classes/properties Assign instances 6 http://kmi.tugraz.at WS 08/09 Wissenstechnologie @ kmi.tugraz.at
  • 7. Ontology Modelling & SW Frameworks Ontology Modelling Triple Stores SPARQL Three simple rules Information Retrieval vs. Semantic Retrieval 1. There 1 “There is no one correct way to model a domain there domain— are always viable alternatives. The best solution almost always depends on the application that you have in mind and the extensions that you anticipate ” anticipate. 2. “Concepts in the ontology should be close to objects (physical or logical) and relationships in your domain of interest. These are most likely to be nouns (objects) or verbs (relationships) in sentences that describe your domain. domain ” 3. “Ontology development is necessarily an iterative process process” 7 http://kmi.tugraz.at WS 08/09 Wissenstechnologie @ kmi.tugraz.at
  • 8. Ontology Modelling & SW Frameworks Ontology Modelling Triple Stores SPARQL Noy s Noy‘s and McGunnise 7 Steps Information Retrieval vs. Semantic Retrieval 1. 1 Determine the domain and scope of the ontology 2. Consider reusing existing ontologies 3 E 3. Enumerate i t important terms in the ontology t tt i th t l 4. Define the classes and the class hierarchy 5. Define the properties (slots) of classes 6. Define the facets of the slots 7. Create/Import instances 8 http://kmi.tugraz.at WS 08/09 Wissenstechnologie @ kmi.tugraz.at
  • 9. Semantic Web Frameworks: Motivation
      – Protege as modelling GUI
      – For “Semantic Web applications” we also want to:
          – Automatically import/map instances
          – Manage large numbers of triples
          – Combine different schemas
          – Query for specific triples
          – Harmonize different metadata schemas
      – Database requirements for graphs
      – Reasoning
  • 10. Semantic Web Frameworks: Overview
      Three “major” Java-based open-source frameworks
      – Jena
      – Sesame
      – Protege Java API
      Functionality
      – Java API for managing OWL, RDF and RDFS (optionally DAML+OIL)
      – Import/export of different formats
      – Persistence via their own data store, different database backends and file-system backends
      – Querying, graph manipulation and restricted reasoning capabilities
      – Web API
  • 11. Semantic Web Frameworks: Jena Architecture
      [Figure: Jena architecture diagram; recoverable labels: SPARQL, RDF/XML]
      Source: Jena: Implementing the Semantic Web Recommendations, 2003, http://www.hpl.hp.com/techreports/2003/HPL-2003-146.html
  • 12. Semantic Web Frameworks: Main Differences
      – Jena
          – Reference implementation
          – Not directly focused on web access and scalability
      – Protege
          – Modelling GUI
      – Sesame
          – Focused on remote access and scalability
          – Flexible layer architecture for different storage backends
      – Others: Virtuoso, 3Store, Kowari, OpenAnzo
  • 13. Triple Stores: Overview
      – The basic data model is RDF (i.e. OWL, RDFS)
      – RDF forms a directed graph
      – How do we manage large graphs?
          – In memory: adjacency matrix
          – On secondary storage:
              – Special indices
              – Relational database management systems
  • 14. Triple Stores: “Normalized” Table Model of RDF
      Subject                        Predicate   Object
      http://book.at/isbn123         author      http://fussball.de/G. Müller
      http://book.at/isbn123         price       €15
      http://book.at/isbn123         Title       Ein Leben für die Tore
      http://fussball.de/G. Müller   Name        Gerd Müller
      [Figure: the same four triples drawn as a directed graph, with the book and the author as nodes and author, price, title and name as edges]
  • 15. Triple Stores: Query in an unoptimized RDBMS
      Query: What are the titles of the books by the person named Gerd Müller?
          SELECT r3.o AS Title
          FROM rdf r1, rdf r2, rdf r3
          WHERE r1.s = r2.o AND
                r2.s = r3.s AND
                r1.o = 'Gerd Müller' AND
                r1.p = 'Name' AND
                r2.p = 'author' AND
                r3.p = 'Title'
      Table rdf:
      Subject (s)                    Predicate (p)   Object (o)
      http://book.at/isbn123         author          http://fussball.de/G. Müller
      http://book.at/isbn123         price           €15
      http://book.at/isbn123         Title           Ein Leben für die Tore
      http://fussball.de/G. Müller   Name            Gerd Müller
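The three-way self-join on slide 15 can be tried out directly. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for the RDBMS; the table name `rdf` and the columns `s`, `p`, `o` follow the slide, everything else is illustrative.

```python
import sqlite3

# In-memory triple table holding the four example triples from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rdf (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO rdf VALUES (?, ?, ?)",
    [
        ("http://book.at/isbn123", "author", "http://fussball.de/G. Müller"),
        ("http://book.at/isbn123", "price", "€15"),
        ("http://book.at/isbn123", "Title", "Ein Leben für die Tore"),
        ("http://fussball.de/G. Müller", "Name", "Gerd Müller"),
    ],
)

# Each alias r1..r3 is one scan of the same table: one self-join per
# additional triple in the graph pattern.
QUERY = """
SELECT r3.o AS Title
FROM rdf r1, rdf r2, rdf r3
WHERE r1.s = r2.o
  AND r2.s = r3.s
  AND r1.o = 'Gerd Müller'
  AND r1.p = 'Name'
  AND r2.p = 'author'
  AND r3.p = 'Title'
"""
titles = [row[0] for row in conn.execute(QUERY)]
print(titles)
```

Even this small question needs three aliases of the same table, which is the cost that the later slides on property tables and vertical partitioning try to reduce.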
  • 16. Triple Stores: The Sesame Mapping as an Example
      [Figure: Sesame relational mapping]
      See Hak Soo Kim, Hyun Seok Cha, Jungsun Kim, Jin Hyun Son, Development of the Efficient OWL Document Management System for the Embedded Applications, Springer 2005, http://www.springerlink.com/content/8mfxeh0glq5xj00m/
  • 17. Triple Stores: Indexing Techniques
      – Use specialised indices for graphs
          – Bitmap indices in Virtuoso: http://virtuoso.openlinksw.com/wiki/main/Main/VOSBitmapIndexing
      – Index different combinations of the S,P,O table
          – P,S,O
          – O,P,S
          – O,S,P
          – S,O,P
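To illustrate why several orderings of S, P and O are indexed, here is a minimal in-memory sketch. It uses nested dictionaries rather than the B-tree or bitmap indices a real store would use, and the helper `build_index` is a hypothetical name introduced only for this example.

```python
from collections import defaultdict

TRIPLES = [
    ("http://book.at/isbn123", "author", "http://fussball.de/G. Müller"),
    ("http://book.at/isbn123", "price", "€15"),
    ("http://book.at/isbn123", "Title", "Ein Leben für die Tore"),
    ("http://fussball.de/G. Müller", "Name", "Gerd Müller"),
]

def build_index(triples, order):
    """Index triples under a permutation of 's', 'p', 'o', e.g. 'pso'."""
    pos = {"s": 0, "p": 1, "o": 2}
    index = defaultdict(lambda: defaultdict(set))
    for triple in triples:
        first, second, third = (triple[pos[key]] for key in order)
        index[first][second].add(third)
    return index

pso = build_index(TRIPLES, "pso")  # predicate -> subject -> objects
ops = build_index(TRIPLES, "ops")  # object -> predicate -> subjects

# Pattern (?x, author, ?y): the predicate is known, so P,S,O answers it directly.
subjects_with_author = sorted(pso["author"])
# Pattern (?x, Name, "Gerd Müller"): the object is known, so O,P,S answers it directly.
named = sorted(ops["Gerd Müller"]["Name"])
print(subjects_with_author, named)
```

Each permutation serves the triple patterns whose bound positions come first in its ordering, at the price of storing the data once per index.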
  • 18. Triple Stores: A First Analysis
      – Normalised view on a graph: one large table
      – Generic and flexible, but:
          – Large self-joins even for rather simple queries; RDBMS are usually not optimized for this
          – Large memory overhead in query processing due to self-joins
          – Requires many index lookups and/or full table scans
          – Large storage overhead
      – In general: flexibility vs. performance
      – How to improve?
  • 19. Triple Stores: Further Improvements
      – Property tables: a flattened representation obtained by finding sets of properties that are used together
      – Subject-Property Matrix Materialized Join Views (SPMJVs) from Oracle
      References
      – Chong, E. I., Das, S., Eadon, G., and Srinivasan, J. 2005. An efficient SQL-based RDF querying scheme. In Proceedings of the 31st International Conference on Very Large Data Bases, ACM.
      – Abadi, D. J., Marcus, A., Madden, S. R., and Hollenbach, K. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd International Conference on Very Large Data Bases (Vienna, Austria, September 23 - 27, 2007). VLDB Endowment, 411-422.
  • 20. Triple Stores: Property Tables
      – ++: Faster querying within a property table because subject-subject self-joins are reduced
      – --: Requires intelligent selection of the properties in the table
          – More property columns lead to more null values in the table and therefore to larger space overhead
          – Fewer property columns lead to more property tables and more joins over smaller property tables
      – --: Multi-valued properties are hard to manage (e.g. a book has several authors)
      Subject   Title            Author        Year
      ID1       “Intro to RDF”   Granitzer     2006
      ID1       “Intro to RDF”   Tochtermann   2006
  • 21. Triple Stores: Vertical Partitioning
      – Partition the database according to properties: one table per property
      – Tables are sorted by subject, which allows fast merge joins
      Reference: Abadi, D. J., Marcus, A., Madden, S. R., and Hollenbach, K. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd International Conference on Very Large Data Bases (Vienna, Austria, September 23 - 27, 2007). VLDB Endowment, 411-422.
  • 22. Triple Stores: Vertical Partitioning
      – ++: Use of simple, fast merge joins
      – ++: Multi-valued attributes are supported
      – ++: No a-priori clustering decision is necessary
      – ++: Smaller tables; only the properties actually accessed have to be read from disk
      – --: Inserts may be slower because multiple tables are accessed
      – --: Queries over multiple properties span multiple tables
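The two vertical-partitioning slides can be sketched in a few lines. This is a simplified in-memory illustration, not the column-store implementation from Abadi et al.; the functions `partition` and `merge_join` and the second book `ID2` are hypothetical and exist only for this example.

```python
from collections import defaultdict

def partition(triples):
    """Build one (subject, object) table per property, each sorted by subject."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return {p: sorted(rows) for p, rows in tables.items()}

def merge_join(left, right):
    """Join two subject-sorted tables on subject in a single forward pass."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        ls, rs = left[i][0], right[j][0]
        if ls < rs:
            i += 1
        elif ls > rs:
            j += 1
        else:
            # Find the run of rows with this subject on both sides;
            # multi-valued properties simply produce longer runs.
            i_end = i
            while i_end < len(left) and left[i_end][0] == ls:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == ls:
                j_end += 1
            for _, left_value in left[i:i_end]:
                for _, right_value in right[j:j_end]:
                    out.append((ls, left_value, right_value))
            i, j = i_end, j_end
    return out

TRIPLES = [
    ("ID1", "title", "Intro to RDF"),
    ("ID1", "author", "Granitzer"),
    ("ID1", "author", "Tochtermann"),
    ("ID2", "title", "Other Book"),  # hypothetical book without an author
]
tables = partition(TRIPLES)
joined = merge_join(tables["title"], tables["author"])
print(joined)
```

The two authors of `ID1` are just two rows in the `author` table, so the multi-valued case that is awkward for property tables needs no special handling here. A query touching only `title` never reads the `author` table.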
  • 23. Triple Stores: Performance of Open Source Solutions
      – Portwin & Parvatikar (2006), Scaling Jena in a Commercial Environment: The Ingenta MetaStore Project
      – Lehigh dataset (domain: universities)
      – ~200 million triples, 11 million OWL statements, 4.3 million documents
      – Kowari: 1 billion triples; loads 20k triples/s for the Wikipedia data set
      – Unoptimized:
          – Simple queries take milliseconds
          – With inference, queries take several seconds to minutes, depending on their complexity
      – Optimization for inference: for RDFS entailment, expand the graph by making implicit edges explicit (more storage, but faster access)
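The inference optimization mentioned on slide 23, making implicit edges explicit, can be sketched as a small forward-chaining loop. Only two RDFS entailment rules are covered (subclass transitivity and type inheritance), and the `ex:` class and instance names are hypothetical.

```python
def materialize(triples):
    """Expand a triple set with two RDFS rules until nothing new is derived."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        subclass_pairs = [(s, o) for s, p, o in closed if p == "rdfs:subClassOf"]
        for sub, sup in subclass_pairs:
            # rdfs11: rdfs:subClassOf is transitive.
            for other_sub, other_sup in subclass_pairs:
                if sup == other_sub:
                    new.add((sub, "rdfs:subClassOf", other_sup))
            # rdfs9: an instance of a subclass is an instance of the superclass.
            for s, p, o in closed:
                if p == "rdf:type" and o == sub:
                    new.add((s, "rdf:type", sup))
        if not new <= closed:
            closed |= new
            changed = True
    return closed

GRAPH = {
    ("ex:GradStudent", "rdfs:subClassOf", "ex:Student"),
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:anna", "rdf:type", "ex:GradStudent"),
}
expanded = materialize(GRAPH)
print(len(GRAPH), "->", len(expanded))
```

After materialization, a query for all instances of `ex:Person` is a plain lookup with no reasoning at query time; the price is the additional stored triples.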
  • 24. Triple Stores: Performance of Oracle
      – BioMed literature database (UniProt data set)
          – 80 million triples
          – ~5 GB RDF/XML data (~2.5 GB triples; 1.7 GB mapping; 4.8 GB indices)
      – Queries take milliseconds to seconds
      – Subject-property matrix materialized views provide an optimization potential of roughly 30%
      Reference: Chong, E. I., Das, S., Eadon, G., and Srinivasan, J. 2005. An efficient SQL-based RDF querying scheme. In Proceedings of the 31st International Conference on Very Large Data Bases, ACM.
  • 25. Triple Stores: Performance Summary
      – http://esw.w3.org/topic/LargeTripleStores
      – Problem: comparison among the available performance numbers
      – Trade-off: generic vs. performance
      – Optimization potential is available
      – Currently not as fast as specialised RDBMS, but more flexible
  • 26. SPARQL: SPARQL Protocol and RDF Query Language
      – Several query languages play the role that SQL plays for RDBMS: SerQL, RDQL, SPARQL
      – SPARQL is a W3C Recommendation (January 2008)
      – But what does querying a graph mean? Basically:
          – Specify a sub-graph with variable nodes
          – Find all patterns in the graph matching the sub-graph
      – Example (graph pattern: ?x has the author "Gerd Müller" and the title ?y):
          SELECT ?x ?y
          WHERE { ?x <author> "Gerd Müller" .
                  ?x <title> ?y . }
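The idea of querying as sub-graph matching can be made concrete with a naive matcher. This is a sketch of the principle only, not how a SPARQL engine is implemented; the function `match`, the flattened data (the author is given as a literal, as in the slide's query) and the second book are assumptions made for illustration.

```python
def match(pattern, triples, binding=None):
    """Yield every variable binding under which all pattern triples occur in the data."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)

TRIPLES = [
    ("http://book.at/isbn123", "author", "Gerd Müller"),
    ("http://book.at/isbn123", "title", "Ein Leben für die Tore"),
    ("http://book.at/isbn456", "author", "Someone Else"),   # hypothetical
    ("http://book.at/isbn456", "title", "Another Title"),   # hypothetical
]
PATTERN = [
    ("?x", "author", "Gerd Müller"),
    ("?x", "title", "?y"),
]
results = list(match(PATTERN, TRIPLES))
print(results)
```

The shared variable `?x` is what expresses the join: the second triple pattern only matches triples whose subject equals the value already bound by the first.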
  • 27. SPARQL: Example
      Data:
          <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .
      Query:
          SELECT ?title
          WHERE { <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title . }
      Result:
          title
          "SPARQL Tutorial"
  • 28. SPARQL: Example
      Data:
          @prefix foaf: <http://xmlns.com/foaf/0.1/> .
          _:a foaf:name "Johnny Lee Outlaw" .
          _:a foaf:mbox <mailto:jlow@example.com> .
          _:b foaf:name "Peter Goodguy" .
          _:b foaf:mbox <mailto:peter@example.org> .
      Query:
          PREFIX foaf: <http://xmlns.com/foaf/0.1/>
          SELECT ?name ?mbox
          WHERE { ?x foaf:name ?name .
                  ?x foaf:mbox ?mbox }
      Result:
          name                  mbox
          "Johnny Lee Outlaw"   <mailto:jlow@example.com>
          "Peter Goodguy"       <mailto:peter@example.org>
  • 29. SPARQL: Simple Query Elements
      – Determine the namespace: PREFIX
      – Determine the return format:
          – SELECT: table output format similar to SQL results
          – CONSTRUCT: constructs a graph as the return value
          – ASK: returns only true/false, depending on whether a result exists
          – DESCRIBE: returns possible properties/resources for a particular query; used for browsing
      – Specify the selection criteria with the WHERE clause:
          – Specify a non-recursive sub-pattern with triples and placeholders (? or $)
          – Perform grouping and filter operations
      – Modifiers: ORDER BY, LIMIT, OFFSET, DISTINCT
  • 30. SPARQL: Blank Nodes
      – The ID of a blank node is unique within one query result and indicates only the existence of a blank node, not its absolute value
      – Blank nodes are identified by an automatically generated identifier
      – Consider the results of a query:
          Result A                Result B                Result C
          Subject   Value         Subject   Value         Subject   Value
          _:a       “zum”         _:x       “zum”         _:z       “zum”
          _:b       “Beispiel”    _:y       “Beispiel”    _:z       “Beispiel”
          A ≡ B (only the labels differ), but B ≠ C (C uses the same blank node in both rows)
      – Blank nodes may be renamed and are structural elements only
  • 31. SPARQL: Complex Queries
      – Combination of groups of simple graph expressions in the WHERE clause
      – OPTIONAL clause: the sub-graph pattern may not exist
      – Example: query book titles from Springer; if an author exists, it is listed, otherwise the title is returned without an author
          SELECT ?title ?author
          WHERE { ?buch ex:publishedFrom <http://springer.com/Verlag> .
                  ?buch ex:Title ?title .
                  OPTIONAL { ?buch ex:Autor ?author } .
          }
  • 32. SPARQL: Complex Queries
      – Specifying alternative sub-graph patterns: UNION
      – Logical OR, i.e. the union of two separate queries
          SELECT ?title ?author
          WHERE { ?buch ex:publishedFrom <http://springer.com/Verlag> .
                  ?buch ex:Title ?title .
                  { ?buch ex:Autor ?author . } UNION
                  { ?buch ex:Creator ?author . }
          }
      – “Select all books with a title, published by Springer, which have an author or a creator assigned”
      – Note: the ?author variables in the two groups are independent of each other
  • 33. SPARQL: Complex Queries
      – Considering special datatypes: FILTER and XML datatypes
      – Specify the datatype of a literal:
          SELECT ?title ?author
          WHERE { ?buch ex:publishedFrom <http://springer.com/Verlag> .
                  ?buch ex:Title ?title .
                  ?buch ex:publishedIn "1998"^^xsd:integer
          }
      – FILTER specifies boolean expressions for filtering results
      – E.g. specify a value range using FILTER (see Chapter 7 in Semantic Web Grundlagen):
          SELECT ?title ?author
          WHERE { ?buch ex:publishedFrom <http://springer.com/Verlag> .
                  ?buch ex:publishedIn ?year .
                  FILTER(?year > 2000)
          }
  • 34. SPARQL: Real World Examples in DBpedia
      Query 1 (SELECT):
          PREFIX p: <http://dbpedia.org/property/>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
          SELECT * WHERE {
            ?album p:artist ?band.
            ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
            OPTIONAL {?album p:cover ?cover}.
            OPTIONAL {?album p:name ?name}.
            OPTIONAL {?album p:released ?dateofrelease}.
          } ORDER BY DESC(?name) LIMIT 20 OFFSET 19
      Query 2 (CONSTRUCT):
          PREFIX p: <http://dbpedia.org/property/>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
          CONSTRUCT {
            ?album p:itIsDone ?dateofrelease .
            ?band p:isBand "true" .
          } WHERE {
            ?album p:artist ?band.
            ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
            OPTIONAL {?album p:cover ?cover}.
            OPTIONAL {?album p:name ?name}.
            OPTIONAL {?album p:released ?dateofrelease}.
          } ORDER BY ?name LIMIT 20 OFFSET 19
      Query 3 (DESCRIBE):
          PREFIX p: <http://dbpedia.org/property/>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
          DESCRIBE ?album WHERE {
            ?album p:artist <http://dbpedia.org/resource/The_Allman_Brothers_Band>.
            ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
          }
  • 35. SPARQL: Summary
      – Similar to SQL
      – Allows easier expression of joins without knowing the underlying database schema
      – Can return not only tables, but also more complex output formats such as graphs
      – The datatype of a variable is not always clear
      References
      – http://www.w3.org/TR/rdf-sparql-query/
      – http://thefigtrees.net/lee/sw/sparql-faq
      – Hitzler, Krötzsch, Rudolph, Sure, Semantic Web – Grundlagen, Chapter 7
  • 36. Semantic vs. Information Retrieval: Overview
      – Central question: What is semantic retrieval?
      – Define information retrieval
      – Where are semantics missing?
      – How can we use Semantic Web technology to increase semantics?
  • 37. Semantic vs. Information Retrieval: Definition of IR
      – Salton (1968): “Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.”
      – “Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).” Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
      – The main focus of IR is how to deal with uncertainty and incomplete information
          – The representation of documents is ambiguous
          – Query formulation is ambiguous and usually incomplete
          – “Unstructured” information
          – Usually the perfect answer, insofar as one exists, cannot be delivered
  • 38. Semantic vs. Information Retrieval: IR vs. Data Retrieval (from van Rijsbergen 1979)
                              Data retrieval    Information retrieval
      Matching                Exact match       Partial (best) match
      Inference               Deduction         Induction
      Model                   Deterministic     Probabilistic
      Classification          Monothetic        Polythetic
      Query language          Artificial        Natural
      Query specification     Complete          Incomplete
      Items wanted            Matching          Relevant
      Error response          Sensitive         Insensitive
  • 39. Semantic vs. Information Retrieval: IR vs. Data Retrieval (from van Rijsbergen 1979)
      “What is the gross domestic product of Austria?”
  • 40. Semantic vs. Information Retrieval: IR vs. Data Retrieval (from van Rijsbergen 1979)
      “What is the gross domestic product of Austria?”
          SELECT GDP FROM GDP_table WHERE country_name = 'Austria'
          Result: € 270.8 bn
      However:
      – Not all information is available in databases
      – Queries are hard to formulate for average users as well as for non-domain experts
      – The more complex the domain and the information need, the harder it is to formulate a correct query
  • 41. Semantic vs. Information Retrieval: Basic Retrieval Workflow
      [Figure: retrieval workflow. Documents D are mapped to a document representation Dr; an information need IN is mapped to a query Q; the retrieval model M and the ranking function R relate Dr and Q.]
      See also Baeza-Yates & Ribeiro-Neto (1999), “Modern Information Retrieval”
  • 42. Semantic vs. Information Retrieval: The Vector Space Model
      – Document representation Dr: documents are represented as a bag of words (i.e. an unordered collection of words)
      – Query Q: the query is a set of keywords
      – Retrieval model M:
          – Sets of words are converted to vectors d and q
          – Different heuristics are used to calculate the importance of a word
      – Ranking function R:
          – Cosine similarity: calculate the angle between d and q
      – Example documents: d1 := “Boy plays chess”, d2 := “Boy plays bridge”
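A minimal sketch of the vector space model for the two example documents follows. Raw term counts stand in for the weighting heuristics mentioned on the slide (a real system would more likely use something like TF-IDF), and the one-word query "chess" is an assumption added for illustration.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words vector: lower-cased term -> raw term count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

d1 = vectorize("Boy plays chess")
d2 = vectorize("Boy plays bridge")
q = vectorize("chess")  # hypothetical keyword query

sim_d1_d2 = cosine(d1, d2)  # two of three terms shared
score_d1 = cosine(q, d1)    # the query term occurs in d1
score_d2 = cosine(q, d2)    # the query term does not occur in d2
print(sim_d1_d2, score_d1, score_d2)
```

The ranking depends purely on shared word forms: d2 scores zero for "chess" even though both documents are about a boy playing a game.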
  • 43. Semantic vs. Information Retrieval: An Analysis of the Vector Space Model
      – Queries and documents are represented in terms of their words
      – The importance of words depends on their occurrence
      – Syntactic matching between documents and queries
          – No synonyms are considered (e.g. Money == Cash)
          – No homonyms are considered (e.g. Apache Web Server)
          – No meronyms are considered (e.g. a tire is part of a car)
          – No relationships between terms are considered
  • 44. Semantic vs. Information Retrieval: So where can we include semantics?
      – Increase the semantics of the document representation Dr and the query Q
          – Add metadata (e.g. tags, Dublin Core etc.)
          – Use more sophisticated preprocessing (e.g. language models, word sense disambiguation)
          – Allow users to express information needs in more detail, or estimate the context of a user (e.g. specify metadata, profiling)
          – Formal representation of Dr and Q using Semantic Web languages like OWL; see Tran, Bloehdorn, Cimiano, Haase (2007), “Expressive Resource Description for Ontology-Based Information Retrieval”
      – However, even with a perfect formal representation we still need to transform natural language queries into this model for the average user
          – Requires a special user interface, which is not possible for the generic case
          – Natural language understanding is currently unsolved
  • 45. Semantic vs. Information Retrieval: So where can we include semantics?
      Document representation (figure, given here as triples):
          C:myDocument.doc   ex:containsConcept   http://en.wikipedia.org/wiki/ApacheWebServer
          http://en.wikipedia.org/wiki/ApacheWebServer   rdf:type   ex:Concept
          http://en.wikipedia.org/wiki/ApacheWebServer   ex:term    “Apache”, “Apache Server”, “Apache Web Server”
      Query:
          SELECT * WHERE { ?x ex:containsConcept <http://en.wikipedia.org/wiki/ApacheWebServer> }
  • 46. Semantic vs. Information Retrieval: So where can we include semantics?
      Iterative refinement of the information need:
      – Keyword query “Apache” yields the candidate concepts:
          – http://en.wikipedia.org/wiki/ApacheWebServer
          – http://en.wikipedia.org/wiki/ApacheHelicopter
          – http://en.wikipedia.org/wiki/AmericanNatives
      – Refined query:
          SELECT * WHERE { ?x ex:containsConcept <http://en.wikipedia.org/wiki/ApacheWebServer> }
      – Document representation (figure, given here as triples):
          C:myDocument.doc    ex:containsConcept   http://en.wikipedia.org/wiki/ApacheWebServer
          C:myDocument2.doc   ex:containsConcept   http://en.wikipedia.org/wiki/ApacheTribes
          http://en.wikipedia.org/wiki/ApacheWebServer   ex:term   “Apache”, “Apache Server”, “Apache Web Server”
  • 47. Semantic vs. Information Retrieval: So where can we include semantics?
      – Increase the capability of the retrieval model M and the ranking function R
          – Latent Semantic Indexing/Concept Indexing: automatically determine the concepts contained in a document set
          – Include a-priori knowledge (e.g. thesaurus, WordNet)
          – Learn ranking functions based on user feedback (e.g. via machine learning)
          – Use formal knowledge in the form of ontologies and reasoning capabilities
  • 48. Semantic vs. Information Retrieval: So where can we include semantics?
      – Example: vector space model
          – D = {Apache=0.8, http=0.5, server=0.3}
          – Q = {Jetty=0.8, java=0.7, web=0.4}
          – Ranking value = 0.0
      – Introduce a “better” retrieval model by using a domain ontology:
          – “Apache” and “Jetty” can be related to each other using the domain ontology
          – Ranking value > 0
      [Figure: ex:ApacheServer and ex:JettyServer are both connected by ex:isA to ex:WebServer; ex:ApacheServer has the ex:term values “Apache Web Server” and “Apache”, ex:JettyServer has the ex:term values “Jetty Server” and “Jetty”]
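The effect shown on slide 48 can be reproduced with a small sketch. The expansion scheme is an assumption chosen for illustration: each term that the ontology maps to a parent concept adds that concept to the vector with a damped weight of 0.5. The slide does not prescribe a particular method, and term names are lower-cased here.

```python
import math

# ex:isA edges from the slide's figure, flattened to term -> parent concept.
ONTOLOGY_PARENT = {"apache": "webserver", "jetty": "webserver"}

def expand(vector, parent, weight=0.5):
    """Add the parent concept of each known term with a damped weight (assumed scheme)."""
    expanded = dict(vector)
    for term, value in vector.items():
        concept = parent.get(term)
        if concept is not None:
            expanded[concept] = max(expanded.get(concept, 0.0), weight * value)
    return expanded

def cosine(a, b):
    """Cosine similarity of two sparse term vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

D = {"apache": 0.8, "http": 0.5, "server": 0.3}
Q = {"jetty": 0.8, "java": 0.7, "web": 0.4}

plain = cosine(D, Q)
semantic = cosine(expand(D, ONTOLOGY_PARENT), expand(Q, ONTOLOGY_PARENT))
print(plain, semantic)
```

Without the ontology the two vectors share no dimension and the ranking value is 0.0; after expansion both carry the `webserver` concept, so the ranking value becomes positive.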
  • 49. Semantic vs. Information Retrieval: So where can we include semantics?
      – Improve the presentation of results
          – Clustering of search results
          – Display different facets of the result set
          – Different representations of results
          – Display facts instead of documents
      – Supports users in refining their information need
      – Search as an iterative approach
  • 50. Semantic vs. Information Retrieval: So where can we include semantics? [Figure only]
  • 51. Semantic vs. Information Retrieval: Dimensions of Semantics
      – Semantically structured vs. unstructured document and query representation
          – Semantic expressiveness increases with increased structure
          – Queries are hard to formalize; support for the average user is required
          – Labour-intensive creation of the document representation
      – Extension of the retrieval model
          – Runtime complexity of reasoning in the case of semantically structured Dr and Q
          – Scalability (also an issue for more complex statistical methods)
  • 52. Semantic vs. Information Retrieval: Dimensions of Semantics
      – Semantic enhancement of result presentation
          – “Low-hanging fruit”
          – Does not require a formalized knowledge base
      – Formalized knowledge vs. statistical approaches
          – Bottom-up vs. top-down
          – Statistical approaches can provide sophisticated retrieval models which do not require formalized, modelled knowledge
          – Statistical approaches depend on all required information being contained in the data set and being extractable
          – Statistical models are usually not shareable between systems
          – A knowledge base may not model all facts contained in the data set (e.g. WordNet knows nothing about the Apache Web Server)
  • 53. Semantic vs. Information Retrieval: Example Freebase, www.freebase.com
      – Open database of the world's information
      – Contributions by the community
      – Linked with other free resources like Wikipedia
      – Web API
      – Own query language: Metaweb Query Language
      – Regarding semantic search:
          – Structural document representation
          – Keyword queries in combination with intelligent interfaces for refining the information need
          – Fact-based representation
  • 54. Semantic vs. Information Retrieval: Example Freebase [Figure only]
  • 55. Semantic vs. Information Retrieval: Example Cuil, www.cuil.com
      – Internet search engine with 120 billion pages
      – Not based on the popularity of sites, just on the content and topics of the content
      – Document representation and query are unstructured
      – Extended retrieval model in the background, based on categorical knowledge and statistical methods (clustering)
          – http://www.news.com.au/technology/story/0,25642,24089734-5014239,00.html
      – Enhanced interface
  • 56. Semantic vs. Information Retrieval: Example Yahoo! Search Monkey
      – Use structured data to improve the presentation of search results
      – http://developer.yahoo.com/searchmonkey/#
  • 57. Summary
      – Triple Store
          – Generic/flexibility vs. performance/space
          – Usually an RDBMS with a large triple table
          – Alternative: special graph indexing structures
      – SPARQL
          – Simple querying of RDF by providing a graph pattern
          – Recommended by the W3C
      – Semantic Search
          – Document & query representation
          – Retrieval model & interfaces
          – Semantic vs. statistical
          – It is the next step, but not such a big one
  • 58. Next Week
      Guest lectures from experts (attendance mandatory)
      – Werner Klieber: Semantic Web Services (30')
      – Aatif Latif: Open Linked Data (30')
      – Fleur Jeanquartier: Bringing the Semantic Web closer to the User (30')
  • 59. That's it for today… Thanks for your attention. Questions/comments? mgranitzer@tugraz.at
  • 60. License
      This work is licensed under the Creative Commons Attribution 2.0 Austria License. To view a copy of this license, visit http://creativecommons.org/licenses/by/2.0/at/.
      Contributors: Mathias Lux, Peter Scheir, Klaus Tochtermann, Michael Granitzer