PoolParty for
Semantic Search and
Vocabulary Management for CKAN




                 Mag. Thomas Schandl
                Semantic Web Company
Agenda



• Live demo: PoolParty Semantic Search

• Scenarios for semantic search

• The role of thesauri in semantic search

• PoolParty demo using Open Data and CKAN as an example

OGD/CKAN Challenges


• Where to search?
 Distributed national and international
 data holdings


• Which search terms to use?
 Inconsistent metadata and tagging,
 different languages and terminologies


• Various other catalogue systems

        © Semantic Web Company – http://www.semantic-web.at/
Some thoughts on the Semantic Web

“In the Semantic Web, it is not the Semantic which is new, it is the Web which is new”

Dr. Chris Welty, IBM Watson Research Center

Some thoughts on the Semantic Web


“A little Semantics Goes a Long Way”

Prof. Jim Hendler, Rensselaer Polytechnic Institute

PoolParty Overview

• Main application areas:
   – SKOS Thesaurus Management
   – Linked Data (publishing & consuming)
   – Semantic Search & Semantic Indexing


• Connecting CKAN and PoolParty in the LOD2 project

Semantic Search Demo
http://bit.ly/semantic_search




                                                          http://www.flickr.com/photos/techburst/2796421248/
                         Semantic search has many faces

Further Semantic Search Scenarios

Semantic search has many faces
http://www.flickr.com/photos/techburst/2796421248/

Situations in which semantic search can help

• "I want to see facts from different sources describing this entity."
• "I can't remember how to spell the search term."
• "I want to know more about this entity in a certain context."
• "I want to search in different languages simultaneously."
• "I want to gain background knowledge about a certain document."
• "I can't remember exactly what I was looking for."
• "I want the software to understand what I mean by 'Jaguar'."
• "I forgot some of the names for the entity I'm looking for."

Find information faster – Auto-Complete

"I can't remember how to spell the search term."




To provide powerful auto-complete in enterprise search scenarios as well,
you need to establish an enterprise vocabulary.
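
A minimal sketch of vocabulary-backed auto-complete, assuming the enterprise vocabulary is available as a plain mapping from labels to concept identifiers. The vocabulary entries, the identifiers and the function name `suggest` are hypothetical and only illustrate the idea; this is not PoolParty's API.

```python
from bisect import bisect_left

# Hypothetical enterprise vocabulary: label -> concept id (illustrative only).
VOCABULARY = {
    "selective non catalytic reduction": "concept:sncr",
    "SNCR": "concept:sncr",
    "clean energy": "concept:clean-energy",
    "climate change": "concept:climate-change",
}

# Sort lowercase labels once so prefix lookups can use binary search.
_SORTED = sorted((label.lower(), label) for label in VOCABULARY)
_KEYS = [key for key, _ in _SORTED]


def suggest(prefix, limit=5):
    """Return up to `limit` vocabulary labels that start with `prefix`."""
    p = prefix.lower()
    if not p:
        return []
    start = bisect_left(_KEYS, p)  # first label that is >= the prefix
    out = []
    for key, label in _SORTED[start:]:
        if not key.startswith(p):
            break  # sorted order: no later label can match
        out.append(label)
        if len(out) == limit:
            break
    return out


print(suggest("cl"))
```

Because suggestions come only from the controlled vocabulary, a user who is unsure of the spelling is steered towards labels that are known to exist.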


Reveal hidden information – Status quo

"I forgot some of the names for the entity I'm looking for."

Search: SNCR
Search: SNCR OR „Selective non-

Reveal hidden information with query expansion

Search: SNCR OR "selective non catalytic reduction"

Labels of the concept:
• preferred label: SNCR
• alternative label: selective non catalytic reduction

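
Query expansion as shown on this slide can be sketched as follows. The thesaurus is assumed to be a simple in-memory list of concepts with one preferred label and a list of alternative labels; the data structure and the name `expand_query` are hypothetical and chosen for illustration.

```python
# Hypothetical thesaurus entry modelled on the slide: one preferred label
# and any number of alternative labels per concept.
THESAURUS = [
    {"pref": "SNCR", "alt": ["selective non catalytic reduction"]},
]


def expand_query(term, thesaurus=THESAURUS):
    """Return an OR query over all labels of the concept matching `term`."""
    needle = term.strip().lower()
    for concept in thesaurus:
        labels = [concept["pref"], *concept["alt"]]
        if needle in (label.lower() for label in labels):
            # Quote multi-word labels so they are searched as phrases.
            parts = [f'"{l}"' if " " in l else l for l in labels]
            return " OR ".join(parts)
    return term  # unknown terms pass through unchanged


print(expand_query("sncr"))
```

The user types only one of the names; the other labels of the same concept are added automatically.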
Multi-lingual search based on a thesaurus

"I want to search in different languages simultaneously."

Search: clean energy OR energía limpia

Labels of the concept:
• preferred label @en: clean energy
• preferred label @es: energía limpia

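
A minimal sketch of the multilingual case, assuming each concept carries one preferred label per language tag. The dictionary layout and the name `multilingual_query` are hypothetical; the two labels are the ones from the slide.

```python
# Hypothetical concept with language-tagged preferred labels, as on the slide.
CONCEPTS = [
    {"prefLabel": {"en": "clean energy", "es": "energía limpia"}},
]


def multilingual_query(term, concepts=CONCEPTS):
    """Expand `term` to the preferred labels of its concept in all languages."""
    needle = term.strip().lower()
    for concept in concepts:
        labels = concept["prefLabel"]
        if needle in (value.lower() for value in labels.values()):
            # Order by language tag so the output is deterministic.
            return " OR ".join(labels[lang] for lang in sorted(labels))
    return term  # no matching concept: search for the term as typed


print(multilingual_query("energía limpia"))
```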
Reveal hidden information and relations

"I want to gain background knowledge about a certain document."

Find documents or images related to any other text.
http://poolparty.punkt.at/demozone

Find more specific information with faceted search

• Zero-result queries won't happen anymore
• Facets support structured queries
• Facets help to drill down search results and adapt dynamically

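
The faceting idea can be sketched in a few lines. The dataset records, field names and helper names below are hypothetical; the point is only that facet values are counted over the current result set.

```python
from collections import Counter

# Hypothetical dataset records; field names and values are illustrative only.
DATASETS = [
    {"title": "Energy statistics", "country": "Austria", "format": "CSV"},
    {"title": "Energy balance", "country": "UK", "format": "CSV"},
    {"title": "Wind farms", "country": "Austria", "format": "RDF"},
]


def filter_by(records, **selected):
    """Keep only records whose fields equal all selected facet values."""
    return [r for r in records
            if all(r.get(field) == value for field, value in selected.items())]


def facet_counts(records, field):
    """Count how many of the given records carry each value of `field`."""
    return Counter(r[field] for r in records)


hits = filter_by(DATASETS, country="Austria")
print(facet_counts(hits, "format"))
```

Since the counts are recomputed from the current hits, every facet value offered to the user matches at least one result, which is why a drill-down step cannot end in an empty result list.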
Complex queries with faceted search over linked data

"Show me all airlines whose parent company is Lufthansa"
http://dbpedia.neofonie.de/

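
A query like the one on this slide combines two conditions over linked data: the resource is an airline, and its parent company is Lufthansa. The sketch below evaluates that combination over a toy list of triples. The subject names are placeholders, not real DBpedia data, and the predicate names are simplified assumptions.

```python
# Toy (subject, predicate, object) triples; placeholder names, not DBpedia data.
TRIPLES = [
    ("Airline_A", "type", "Airline"),
    ("Airline_A", "parentCompany", "Lufthansa"),
    ("Airline_B", "type", "Airline"),
    ("Airline_B", "parentCompany", "Other_Group"),
    ("Hotel_C", "parentCompany", "Lufthansa"),
]


def subjects(triples, predicate, obj):
    """Return all subjects that have the given predicate/object pair."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def airlines_with_parent(triples, parent):
    """Intersect the two facets: type is Airline AND parentCompany is `parent`."""
    airlines = subjects(triples, "type", "Airline")
    owned = subjects(triples, "parentCompany", parent)
    return sorted(airlines & owned)


print(airlines_with_parent(TRIPLES, "Lufthansa"))
```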
Find linked information – Status quo

"I want to see facts from different sources describing this entity."

My Energy-Dossier about

The user has to manually put together energy-related information about a country.

360° views: Find linked information

Energy-related information about countries is "mashed" automatically by using "linked data"
http://www.reegle.info/countries

The role of thesauri in semantic search
http://www.flickr.com/photos/techburst/2796421248/

SKOS – Open Standard for Thesauri

• SKOS = Simple Knowledge
  Organisation System(s)

• Goal …
  – Simple, flexible, extensible, machine-understandable
    representation for…
     •   Thesauri
     •   Classification Schemes
     •   Taxonomies
     •   Subject Headings
     •   Other types of ‘controlled vocabulary’…




The role of thesauri in semantic
search




The role of thesauri in semantic
search (contd.)



                  Thesaurus as the central point
                  to control:

                  •   labels & query expansion
                  •   facets
                  •   refine search mechanisms
                  •   metadata integration




Content annotation: Traditional approach

Apple is in the process of launching an application to allow iPhone, iPad
and iPod Touch users to purchase Apple merchandise straight from their devices.

Tags: merchandise, Apple, application, iPod touch, iPad, iPhone

http://www.punkt.at/

Semantic Web approach: Concepts, NOT simply text

Apple is in the process of launching an application to allow iPhone, iPad
and iPod Touch users to purchase Apple merchandise straight from their devices.

Concept URIs shown in the diagram:
• http://my.com/Apple
• http://my.com/smartphone
• http://my.com/iPhone
• http://my.com/iPhone3G

Labels shown in the diagram: Apple, Apple Inc., iPhone, iPhone 3G, iPhone 3GS

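
A minimal sketch of concept-based annotation, assuming a lookup table from labels to concept URIs. The URIs are the example URIs from the slide; the exact label-to-URI mapping and the function name `annotate` are assumptions made for illustration and do not describe how PoolParty extracts concepts.

```python
import re

# Label -> concept URI. The URIs are the slide's example URIs; the mapping
# itself is an assumption made for this sketch.
LABELS = {
    "Apple Inc.": "http://my.com/Apple",
    "Apple": "http://my.com/Apple",
    "iPhone": "http://my.com/iPhone",
}


def annotate(text, labels=LABELS):
    """Return the sorted concept URIs whose labels occur in `text`."""
    found = set()
    for label, uri in labels.items():
        # Match whole labels only, never a label embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(label) + r"(?!\w)"
        if re.search(pattern, text):
            found.add(uri)
    return sorted(found)


TEXT = ("Apple is in the process of launching an application to allow iPhone, "
        "iPad and iPod Touch users to purchase Apple merchandise straight "
        "from their devices.")
print(annotate(TEXT))
```

Several labels map to the same URI, so the document is annotated with the concept rather than with whichever spelling happened to appear in the text.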
PoolParty Tag Suggestions

• Support for different formats (html, doc, pdf, ppt, …)

• Thesaurus-based extraction

• Can be integrated with CMS, CRM etc.




Interplay of CKAN and PoolParty

Service for tag suggestions from the thesaurus

• CKAN UK
• CKAN Norway
• CKAN Netherlands
• CKAN Austria
• Other data sources

Interplay of CKAN and PoolParty

Indexing of the metadata

• CKAN UK
• CKAN Norway
• CKAN Netherlands
• CKAN Austria
• Other data sources

PoolParty System Architecture

Components shown in the diagram:
• Sources: CKAN UK, CKAN Austria
• Collector
• RDF Cartridge
• Semantic Indexer
• Index
• Search Services
• Search Application

Connecting CKAN and PoolParty in the LOD2 Project

• Where to search?
 Central search across distributed systems
• Which search terms?
 Harmonised metadata through multilingual,
 semantic tags
• Further features
  – Categorisation
  – Autocomplete
  – Recommender for similar data sources

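
The two answers above, central search and harmonised tags, can be sketched together. The sketch assumes that dataset metadata from several catalogues has already been collected as plain records, and that a multilingual thesaurus is available as a tag-to-concept mapping. All catalogue names, records, concept identifiers and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical tag -> concept mapping, standing in for a multilingual thesaurus.
TAG_TO_CONCEPT = {
    "energy": "concept:energy",
    "energie": "concept:energy",
    "transport": "concept:transport",
    "verkehr": "concept:transport",
}

# Hypothetical metadata records collected from two catalogues.
CATALOGS = {
    "catalog-uk": [{"id": "uk-1", "tags": ["Energy"]}],
    "catalog-at": [{"id": "at-1", "tags": ["Energie", "Verkehr"]}],
}


def build_index(catalogs, tag_to_concept=TAG_TO_CONCEPT):
    """Map each concept to the (catalogue, dataset id) pairs tagged with it."""
    index = defaultdict(set)
    for source, records in catalogs.items():
        for record in records:
            for tag in record["tags"]:
                concept = tag_to_concept.get(tag.lower())
                if concept:  # tags unknown to the thesaurus are skipped
                    index[concept].add((source, record["id"]))
    return index


def search(index, tag, tag_to_concept=TAG_TO_CONCEPT):
    """Look up datasets from all catalogues by any label of a concept."""
    concept = tag_to_concept.get(tag.lower())
    return sorted(index.get(concept, set()))


INDEX = build_index(CATALOGS)
print(search(INDEX, "energy"))
```

A search for the English tag also finds the dataset tagged in German, because both tags resolve to the same concept in the central index.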
PoolParty Demo

• Homepage: http://poolparty.punkt.at/PoolParty/

• Documentation: https://grips.punkt.at/display/POOLDOKU/

Latest Update:
Version 2.9.2, May 2011

Thank you for your attention!


                      Mag. Thomas Schandl
                      t.schandl@semantic-web.at

                      Semantic Web Company
                      GmbH
                      Lerchenfelder Gürtel 43
                      A-1160 Wien

                      Tel. +43 1 402 12 35
                      office@semantic-web.at





LOD2 CKAN WS Vienna: PoolParty for Semantic Search and Vocabulary Management for CKAN, Thomas Schandl (SWC)
