(13) Semantic Web Technologies - Linked Data & Semantic Search
 

Usage Rights

CC Attribution-NonCommercial-NoDerivs License

    Presentation Transcript

    • Semantic Web Technologies
      Lecture, Dr. Harald Sack, Hasso-Plattner-Institut für IT Systems Engineering, University of Potsdam
      Winter Semester 2012/13
      Lecture Blog: http://semweb2013.blogspot.com/
      This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0)
      Tuesday, 22 January 2013
    • Semantic Web Technologies: Content
      1. Introduction
      2. Semantic Web - Basic Architecture
         Languages of the Semantic Web - Part 1
      3. Knowledge Representation and Logics
         Languages of the Semantic Web - Part 2
      4. Applications in the 'Web of Data'
      Semantic Web Technologies, Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam, Tuesday, 22 January 2013
    • Ontological Engineering
    • Linked Data Applications & Semantic Search
    • Semantic Web Technologies: Content
      4. Applications in the Web of Data
         4.1. Ontological Engineering
         4.2. Linked Data Engineering
         4.3. Semantic Search
    • How do we get Data from the Web...?
      4.2 Linked Data Engineering
         4.2.1 APIs vs. Linked Data
         4.2.2 Linked Data Principles
         4.2.3 Linked Data @ Work
      Image: The Tower of Babel, Pieter Brueghel, 1563
    • How to get Data from the Web?
      • Data can only be found on the Web if it is available on some website
      [Diagram: Browser, Web server, and Database, connected via HTTP/HTML and JDBC]
    • How to get Data from the Web?
      • There are a number of different (proprietary) Web APIs and data exchange formats, with mashups built on top of them
      [Diagram: a Mashup on top of Web API 1 to Web API 4, each in front of its own database (Database 1 to Database 4)]
    • In the Web today...
      • Data is locked up in small data islands
      • Other applications usually cannot access this data...
      [Diagram: many isolated databases]
    • Problems ahead....
      http://www.w3.org/2009/Talks/0204-ted-tbl/#(22)
    • But there is a solution:
      • ...open up proprietary data islands
      • ...publish all data that are of public interest
      • ...in a way that
        • other applications can access, utilize, and process this data, and
        • all applications can access additional (meta)data for the available data
      [Diagram: Database 1, Database 2, Database 3]
    • But there is a solution:
      • Apply semantic technologies:
        • to publish structured data on the web
        • to draw connections from one data source to data from other data sources
      [Diagram: Database 1 to Database 4, each published as RDF Data and connected to the others by RDF Links]
    • 4.2 Linked Data Engineering
         4.2.1 APIs vs. Linked Data
         4.2.2 Linked Data Principles
         4.2.3 Linked Data @ Work
    • Linked Data and the 'Web of Data'
      ■ The term refers to an idea originally from Tim Berners-Lee (Tim Berners-Lee, Linked Data, 2006, http://www.w3.org/DesignIssues/LinkedData.html)
        □ Set of best practices for publishing and linking structured data on the web
        □ Basic assumption: the value of data on the web increases when it is connected to other data sources
      The Web of data is about a data and naming model on the Web (M. Hausenblas, Quick Linked Data Introduction, http://www.slideshare.net/mediasemanticweb/quick-linked-data-introduction)
    • Linked Data Principles
      (1) Use URIs as names for things.
      (2) Use HTTP URIs, so that people can look up those names.
      (3) When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
      (4) Include links to other URIs, so that they can discover more things.
    • Linked Data Principles
      (1) Use URIs as names for things.
      • URIs identify not only documents but also arbitrary real-world objects and abstract concepts
        http://semweb2013.blogspot.com
        http://dbpedia.org/resource/Albert_Einstein
        http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d
    • Linked Data Principles
      (2) Use HTTP URIs, so that people can look up those names.
      • HTTP URIs (URLs) as globally unique names enable dereferencing of associated information on the Web
        • via HTTP Content Negotiation
        • 303 URIs: HTTP response code 303 'See Other' (redirect)
        • Hash URIs, e.g. http://example.com/Harald#me
    • Linked Data for Humans and Computers
      ■ A URI should deliver information for humans as well as for computers, i.e.
        (Thing) URI
          Accept: text/html → HTML page
          Accept: application/rdf+xml → RDF data
    • Linked Data for Humans and Computers
      ■ The server delivers different HTTP responses depending on the HTTP Accept header (Content Negotiation)
      http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
    • Linked Data for Humans and Computers
      ■ A URI should deliver information for humans as well as for computers, i.e.
        (Thing) http://dbpedia.org/resource/Ernest_Hemingway
          Accept: text/html → http://dbpedia.org/page/Ernest_Hemingway (HTML page)
          Accept: application/rdf+xml → http://dbpedia.org/data/Ernest_Hemingway.rdf (RDF data)
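The content negotiation described on these slides can be sketched as a small server-side decision function. This is only an illustration under stated assumptions: the function name `negotiate`, the simplified `Accept` parsing (q-values are ignored) and the fallback to the HTML page are hypothetical; only the `/resource/`, `/page/` and `/data/` URL layout is taken from the DBpedia example above.

```python
# Hypothetical sketch of content negotiation for a "thing" URI using a
# 303 "See Other" redirect. Not DBpedia's actual implementation.

RDF_TYPES = ("application/rdf+xml",)
HTML_TYPES = ("text/html",)
RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def negotiate(thing_uri: str, accept_header: str) -> tuple[int, str]:
    """Return (HTTP status code, Location header) for a thing URI."""
    if not thing_uri.startswith(RESOURCE_PREFIX):
        raise ValueError("not a thing URI of this server")
    name = thing_uri[len(RESOURCE_PREFIX):]

    # Media types in the order the client listed them; q-values are ignored
    # here, which a real server would have to honour.
    requested = [part.split(";")[0].strip() for part in accept_header.split(",")]

    for media_type in requested:
        if media_type in RDF_TYPES:
            return 303, f"http://dbpedia.org/data/{name}.rdf"
        if media_type in HTML_TYPES:
            return 303, f"http://dbpedia.org/page/{name}"
    # Assumption: fall back to the human-readable page if nothing matches.
    return 303, f"http://dbpedia.org/page/{name}"
```

A browser sending `Accept: text/html` would be redirected to the HTML page, while an RDF client sending `Accept: application/rdf+xml` would be redirected to the RDF document for the same thing.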
    • Linked Data Principles
      (3) When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
      • RDF as universal data model for publishing structured data on the Web
      • Make all URIs in the RDF graph dereferenceable
      • Avoid RDF constructs that cause problems in a Linked Data context
        • RDF Reification
        • RDF Collections and Containers
        • unnamed Blank Nodes
    • Linked Data Principles
      (4) Include links to other URIs, so that they can discover more things.
      • Set RDF links between data from different data sources, to find information related by content
        • Relationship Links: links to external LOD entities related to the original entity
        • Identity Links: links to external LOD entities referring to the same object or concept
        • Vocabulary Links: links to definitions of the original entity
    • The application of the Linked Data Principles leads to a 'Web of Data'
    • Development of the 'Web of Data': May 2007
    • Development of the 'Web of Data': Nov 2007
    • Development of the 'Web of Data'
    • Development of the 'Web of Data': July 2009
    • Development of the 'Web of Data': September 2010
    • Development of the 'Web of Data': September 2011 (300 Datasets, 31B RDF Triples, 504M Links)
    • Semantic Mashups
      □ Semantic Mashups are applications that use linked RDF data from various data sources
      □ In contrast to the interfaces and exchange formats of ordinary Web APIs, Linked Data offers the following benefits:
        □ a flexible and standardized data format (RDF)
        □ a standardized access mechanism (HTTP)
        □ the possibility to set links (RDF links) between different data sources
          » enables navigation
          » is supported by search engines (crawlers)
          » enables expressive search facilities over the crawled data and beyond
      (S. Auer, J. Lehmann, Ch. Bizer: Semantische Mashups auf Basis vernetzter Daten, in T. Pellegrini, A. Blumauer (eds.): Social Semantic Web, Springer, 2009.)
    • Linked Data Sources in the Web
      □ Native publication
        □ D2R-Server, OpenLink Virtuoso, Pubby, etc.
      □ Implementation of wrappers around existing applications / APIs
        □ SIOC Exporter for Wordpress, Drupal, phpBB, ...
        □ RDF Book Mashup (Amazon API, Google Base API, ...)
      □ Linking Open Data Project
        □ Semantic Web Education and Outreach W3C working group
        □ Catalogue of all known sources of linked data with an open source license
          » DBpedia, Flickr, OpenCyc, FOAF, SIOC, GeoNames, ...
    • Browser for Linked Data
      ■ Differences to arbitrary RDF browsers
        □ The RDF data to be visualized does not necessarily reside in a local repository, but is distributed across the Web
        □ This requires dynamic reloading of RDF resources
      ■ Tabulator (Tim Berners-Lee, MIT) (T. Berners-Lee et al.: Tabulator: Exploring and analyzing linked data on the semantic web, in Proc. 3rd Int. Semantic Web User Interaction Workshop, 2006, http://swui.semanticweb.org/swui06/papers/Berners-Lee/Berners-Lee.pdf)
      ■ OpenLink RDF Data Explorer, http://ode.openlinksw.com/
        □ enables visualization as graph, timeline, map, etc.
      ■ Zitgist Browser, http://browser.zitgist.com/
      ■ DISCO Browser, http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/disco/
    • Search Engines for Linked Data
      ■ Crawler-based: follow links in datasets to create an index that can be queried
      ■ Swoogle, http://swoogle.umbc.edu/
        □ keyword-based full text search (Apache Lucene), uses only limited semantic annotation
      ■ Semantic Web Search Engine (SWSE), http://swse.deri.org/
        □ additionally uses rdf:type properties as search filter
      ■ Sindice, http://www.sindice.com/
      ■ Falcons, http://iws.seu.edu.cn/services/falcons/
        □ with data browser for result analysis
      ■ Sig.ma - Semantic Information Mashup (based on Sindice), http://sig.ma/
    • http://dbpedia.neofonie.com
    • Linked Open Data
      ■ Public Linked Data resources on the Web, licensed as "Creative Commons CC-BY"
      ■ 5-Star Criteria for Linked Open Data
        ★ Available on the web (whatever format) but with an open licence, to be Open Data
        ★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
        ★★★ As (2), plus non-proprietary format (e.g. CSV instead of Excel)
        ★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
        ★★★★★ All the above, plus: link your data to other people's data to provide context
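Because each star level presupposes all lower levels, the 5-star scheme can be read as a cumulative checklist. The following minimal sketch illustrates that reading; the `Dataset` class and its field names are hypothetical and only mirror the five criteria listed above.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustration only)."""
    on_web_with_open_licence: bool   # 1 star
    machine_readable: bool           # 2 stars, e.g. Excel instead of a scan
    non_proprietary_format: bool     # 3 stars, e.g. CSV instead of Excel
    uses_w3c_standards: bool         # 4 stars, RDF / SPARQL, URIs for things
    linked_to_other_data: bool       # 5 stars


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: a level only counts if all lower ones hold."""
    criteria = [
        ds.on_web_with_open_licence,
        ds.machine_readable,
        ds.non_proprietary_format,
        ds.uses_w3c_standards,
        ds.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars
```

For example, an openly licensed CSV file would reach three stars, while RDF data without an open licence would reach none, since the first criterion is not met.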
    • Linked Open Data
    • 4.2 Linked Data Engineering
         4.2.1 APIs vs. Linked Data
         4.2.2 Linked Data Principles
         4.2.3 Linked Data @ Work
    • Linked Data
      □ ordered by categories
    • Linked Data: User Generated Content, Media
    • Linked Data: Publications
    • Linked Data: Government, Geographic
    • Linked Data: Cross-Domain, Life Sciences
    • Linking Open Data
      ■ Some statistics (as of 09/2011): distribution of RDF Triples by domain
    • Linking Open Data
      ■ Some statistics (as of 09/2011): distribution of Links by domain
    • Linked Data Ontologies
      □ Ontologies hold the Linked Data Cloud together
    • Linked Data Ontologies
      □ e.g. OWL
        □ owl:sameAs connects identical individuals
        □ owl:equivalentClass connects equivalent classes
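Since owl:sameAs states that two URIs denote the same individual, a consumer of Linked Data can group URIs into identity clusters. The sketch below does this with a simple union-find structure over (subject, object) pairs; the function name and the input format are assumptions made for illustration, not part of any RDF library.

```python
def same_as_clusters(same_as_links):
    """Group URIs connected by owl:sameAs links into sets of identical individuals.

    `same_as_links` is an iterable of (uri_a, uri_b) pairs, each standing for
    one triple `uri_a owl:sameAs uri_b`. owl:sameAs is treated as symmetric
    and transitive, so chains of links end up in the same cluster.
    """
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in same_as_links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())
```

Given links from a DBpedia URI to two other (hypothetical) datasets' URIs for the same person, all three URIs end up in one cluster, so facts attached to any of them can be merged.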
    • Linked Data Ontologies
      □ e.g. UMBEL (version 1.0, Feb. 2011)
        □ "Upper Mapping and Binding Exchange Layer"
        □ Subset of OpenCyc as RDF triples based on SKOS and OWL 2
        □ Upper ontology with 28,000 concepts (skos:Concept)
        □ 46,000 mappings into DBpedia, GeoNames, and others (owl:equivalentClass, rdfs:subClassOf)
        □ Links to more than 2 million Wikipedia pages
    • Linked Data Ontologies
      □ e.g. SKOS
        □ "Simple Knowledge Organization System"
        □ based on RDF and RDFS
        □ applied for definitions and mappings of vocabularies and ontologies
        □ skos:Concept (classes)
        □ skos:narrower
        □ skos:broader
        □ skos:related
        □ skos:exactMatch, skos:narrowMatch, skos:broadMatch, skos:relatedMatch
    • Linked Data (Research) Applications
      □ WhoKnows? http://apps.facebook.com/whoknows_/
      □ RISQ! http://141.89.225.43/whoknowsmovies/game.html
        □ for data cleansing
        □ for relevance ranking of facts
        □ for entity summarization
    • 4.2 Linked Data Engineering
         4.2.1 APIs vs. Linked Data
         4.2.2 Linked Data Principles
         4.2.3 Linked Data @ Work
    • Semantic Web Technologies: Content
      4. Applications in the Web of Data
         4.1. Ontological Engineering
         4.2. Linked Data Engineering
         4.3. Semantic Search
    • Semantic Search
      Image: Albrecht Dürer: Melancholia I, 1514
    • 4.3 Semantic Search
         4.3.1 Information Retrieval
         4.3.3 Semantic Analysis and Retrieval
         4.3.4 Exploratory Search
    • The 'Google Dilemma'
    • Classical Information Retrieval
      [Diagram labels: information requests, query formulation, set of queries; files of records, indexing, set of documents; indexing language; similarity between queries and documents]
      (after Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)
    • Classical Information Retrieval (simplified version)
      [Diagram: the search term(s) / keywords, here "search", form the search query, which is matched against a search index built from the set of documents. The sample document shown is a German dictionary entry on the verb "suchen" (to search).]
    • Evaluation of Information Retrieval Systems
      $R$: relevant documents; $P$: retrieved documents; $R \cap P$: relevant documents that have been retrieved
      $\mathrm{Recall} = \frac{|R \cap P|}{|R|}$
      $\mathrm{Precision} = \frac{|R \cap P|}{|P|}$
      $F_\alpha = \frac{(1+\alpha) \cdot \mathrm{Recall} \cdot \mathrm{Precision}}{\alpha \cdot \mathrm{Precision} + \mathrm{Recall}}$
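As a worked illustration of these measures, the following sketch computes recall, precision and the weighted F-measure from two sets of document IDs. It assumes the common weighted form F = (1+α)·Precision·Recall / (α·Precision + Recall), which reduces to the harmonic mean of precision and recall for α = 1; the function name and the handling of empty sets are choices made here.

```python
def evaluate(relevant: set, retrieved: set, alpha: float = 1.0):
    """Return (precision, recall, f_measure) for one query.

    relevant  -- R, the set of documents that are relevant to the query
    retrieved -- P, the set of documents the system returned
    alpha     -- weight of recall relative to precision (1.0 = balanced)
    """
    hits = len(relevant & retrieved)  # |R ∩ P|
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0  # avoid division by zero
    f_measure = (1 + alpha) * precision * recall / (alpha * precision + recall)
    return precision, recall, f_measure
```

With four relevant documents, of which two appear among three retrieved ones, precision is 2/3, recall is 1/2, and the balanced F-measure is 4/7.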
    • Search Engines in the Web
      • The World Wide Web is a distributed hypermedia system with
        • multimedia documents that are
        • linked via hyperlinks
    • Web-Crawler (Web Robot)
      [Diagram: the web crawler takes URLs from a URL list (http://www.xxxx.de/1234..., http://www.xxxx.de/2234..., ...) and sends HTTP requests to the WWW server; the WWW server delivers the requested HTML documents to the web crawler; links (<a href="..." .../>) found in the HTML documents are added to the URL list]
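The crawler loop from the diagram can be sketched in a few lines. To keep the example self-contained and runnable offline, pages are served from a small in-memory dictionary (`FAKE_WEB`, entirely made up) instead of real HTTP requests; the `fetch` callback is where a real crawler would perform the request. Relative URLs, robots.txt and politeness delays are deliberately left out.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl; `fetch(url)` returns HTML text or None."""
    url_list = deque(seed_urls)   # URLs still to be requested
    seen = set(seed_urls)         # every URL ever put on the list
    documents = {}                # url -> HTML of the fetched pages
    while url_list and len(documents) < max_pages:
        url = url_list.popleft()
        html = fetch(url)
        if html is None:
            continue              # page could not be delivered
        documents[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:
                seen.add(link)
                url_list.append(link)
    return documents


# Made-up miniature "web" used instead of real HTTP requests.
FAKE_WEB = {
    "http://example.org/1": '<a href="http://example.org/2">two</a>',
    "http://example.org/2": '<a href="http://example.org/1">one</a>'
                            '<a href="http://example.org/3">three</a>'
                            '<a href="http://example.org/missing">gone</a>',
    "http://example.org/3": "<p>no links here</p>",
}
```

Calling `crawl(["http://example.org/1"], FAKE_WEB.get)` visits the three existing pages once each; the `seen` set is what prevents the crawler from looping between pages 1 and 2.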
    • Search Engines in the WWW: Preprocessing and Indexing
      [Diagram labels: Web Crawler; Document Preprocessing; Data Analysis and creation of index data structures]
      Steps shown: Data Normalization, Tokenization, Language Identification, Word Stemming, POS-Tagging, Descriptor Generation
    • Search Engines in the WWW: Efficient Index Data Structures
      Inverted file (terms): Aachen, Altavista, Ananas, …, Zustand, Zypern
      Location list for the term "Ananas":
        DocID   Pos            Frequency   Weight
        D123    1;13;77;132    4           9.4
        D456    22;38          2           6.7
        …       …              …           …
        D998    15             1           1.2
      Index file entry for D123, http://producers.ananas.org/index.htm:
        Frequency   URL   <H1>   …   <H6>   <title>   …   text
        4           1     1          0      1         …   1
      Document D123: <html> <head><title="Ananas around the World"> </head> <body> … </body> </html>
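A minimal sketch of such an inverted file follows: for every term it stores, per document, the word positions, from which the frequency can be read off. The tokenizer (lower-casing plus a simple regular expression) and the sample documents are assumptions for illustration; the weight column of the slide is not computed, since the slide does not say how it is derived.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Very simple normalization and tokenization: lower-case word tokens."""
    return re.findall(r"\w+", text.lower())


def build_inverted_index(documents):
    """Map each term to {doc_id: [positions]}, positions counted from 1."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, term in enumerate(tokenize(text), start=1):
            index[term].setdefault(doc_id, []).append(position)
    return index


def lookup(index, term):
    """Return {doc_id: (positions, frequency)} for a single query term."""
    postings = index.get(term.lower(), {})
    return {doc_id: (positions, len(positions))
            for doc_id, positions in postings.items()}
```

Looking up "Ananas" then returns, for each matching document, its position list and frequency, which corresponds to the DocID, Pos and Frequency columns of the location list above.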
    • Search Engines in the WWW: Relevance Ranking
      • Link Popularity (Google PageRank)
      [Diagram: link graph of four pages A, B, C, D; at the start every page has PageRank 1.0; resulting PageRank: A = 1.49, B = 0.78, C = 1.57, D = 0.15]
      Iteration of the PageRank computation:
        Nr.   PR(A)   PR(B)    PR(C)   PR(D)
        1     1.0     1.0      1.0     1.0
        2     1.0     0.575    2.275   0.15
        3     2.083   0.575    1.191   0.15
        …     …       …        …       …
        n     1.49    0.7833   1.577   0.15
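The iteration in the table can be reproduced with the non-normalized PageRank formula PR(p) = (1 - d) + d · Σ PR(q)/out(q) over all pages q linking to p, with damping factor d = 0.85. Note the assumptions: the link structure `LINKS` below is not given in the transcript; it was chosen here because it is consistent with the tabulated values, and all names are illustrative.

```python
def pagerank_step(ranks, out_links, damping=0.85):
    """One synchronous update of all PageRank values."""
    new_ranks = {}
    for page in ranks:
        # Sum the rank shares of every page that links to `page`.
        incoming = sum(ranks[source] / len(targets)
                       for source, targets in out_links.items()
                       if page in targets)
        new_ranks[page] = (1 - damping) + damping * incoming
    return new_ranks


def pagerank(out_links, iterations=100, damping=0.85):
    """Start with rank 1.0 for every page and iterate a fixed number of times."""
    ranks = {page: 1.0 for page in out_links}
    for _ in range(iterations):
        ranks = pagerank_step(ranks, out_links, damping)
    return ranks


# Assumed link graph (inferred, not stated in the source):
# A links to B and C, B links to C, C links to A, D links to C.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
```

Under this assumed graph, one step from the all-ones start yields the second table row (1.0, 0.575, 2.275, 0.15), and the values settle near the final row after repeated iteration. Page D keeps the minimum value 0.15 because no page links to it.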
    • The Web is big. Really big. You just won't believe how vastly, hugely, mind-bogglingly big it is. (...according to Douglas Adams)
    • Language has its fallacies...
    • ...in particular, if we don't know the language
    • 4.3 Semantic Search
         4.3.1 Information Retrieval
         4.3.3 Semantic Analysis and Retrieval
         4.3.4 Exploratory Search
    • Semantic Search: Definition (first try)
      • Annotation of (text-based) metadata with semantic entities
      • Entity-based Information Retrieval
      • Make use of semantic relations, e.g. content-based similarities of relationships
      • Interoperable metadata via semantic annotations
        • for content-based description
        • for structural / technical description (Multimedia Ontologies)
      Overall goal: quantitative and qualitative improvement of Information Retrieval
    • Semantic Metadata: Multimedia Ontologies
      • MPEG-7 has been re-engineered to become an OWL-DL ontology (2007: Arndt et al., COMM model)
      • Localize a region → draw a bounding box
      • Annotate the content → interpret the content → tag 'Astronaut'
    • Semantic Metadata: Multimedia Ontologies
      Example: Tagging with an MPEG-7 Ontology
      [Diagram: RDF graph annotating a region of the image "Man on the Moon". Labels: mpeg7:image, mpeg7:spatial_decomposition, Reg1, rdf:type, mpeg7:StillRegion, mpeg7:SpatialMask, mpeg7:polygon, mpeg7:Coords, mpeg7:depicts, dbpedia:Astronaut]
    • Named Entity Recognition
      "locating and classifying atomic elements...into predefined categories such as names, persons, organizations, locations, expressions of time, quantities, monetary values, etc." (C.J. Rijsbergen, Information Retrieval (1979))
      [Diagram: the entity Neil Armstrong "is a" Astronaut and "is a" Person (classes); Astronaut subClassOf Science Occupation subClassOf Employment]
    • Named Entity Recognition
      [Diagram: Neil Armstrong "is a" Astronaut, "is a" Person; Astronaut subClassOf Science Occupation subClassOf Employment]
    • Named Entity Recognition
      Text: "Armstrong was the first man on the Moon."
      Entity Mapping: the text is mapped to the entity Neil Armstrong
      [Diagram: Neil Armstrong "is a" Astronaut, "is a" Person; Astronaut subClassOf Science Occupation subClassOf Employment]
    • Named Entity Recognition
      [Diagram: the entity carries rdfs:label "Neil Armstrong", rdf:type dbpedia-owl:Astronaut, and rdf:type foaf:Person; Astronaut subClassOf Science Occupation subClassOf Employment]
    • Named Entity Recognition
      Text: "Armstrong was the first man on the Moon."
      Entity Mapping: http://dbpedia.org/resource/Neil_Armstrong
      How do I find the right entity?
    • Named Entity Recognition: How do I find the right entity?
      Text: "Armstrong was the first man on the Moon."
      In natural language text
      • nouns correspond to semantic concepts / entities
      • verbs correspond to semantic relations
      Identify nouns in natural language text:
      • determination of language
      • Part-of-Speech Tagger
      • Word Stemming
      • e.g. with http://gate.ac.uk/
    • Named Entity Recognition: How do I find the right entity?
      Text: "Armstrong was the first man on the Moon."
      Determine possible Entity Mapping Candidates: Anton Armstrong; Armstrong, Ontario; Armstrong Tools; Ian Armstrong; Armstrong, Florida; Armstrong (car); Edward Armstrong; Armstrong (moon crater); Armstrong County, Texas; Gary Armstrong; The Armstrongs; George Armstrong; Armstrong Tunnel; Louis Armstrong; The Armstrong Twins; Craig Armstrong; + 400 more...
    • Context Dimensions for Audiovisual Media
      Spatial Context, Temporal Context, Provenance Context, User Context, Structural Context
      Context provides information for
      • Disambiguation
      • Reliability
      • Trustworthiness
    • Named Entity Recognition: How do I find the right entity?
      • We have to examine the context to understand the semantics
      Text: "Armstrong was the first man on the Moon."
      Determine Named Entities (nouns) from text: Armstrong, man, moon
      Create all possible sets of Mapping Candidates
    • Named Entity Recognition: How do I find the right entity?
      Text: "Armstrong was the first man on the Moon."
      Create all possible sets of Mapping Candidates
      Armstrong: George Armstrong Custer; Neil Armstrong; The Armstrong Twins; Armstrong, Florida; Craig Armstrong; Armstrong, Ontario; Armstrong (moon crater); Sir Thomas Armstrong; Armstrong Gun; Armstrong's Theorem; Louis Armstrong; Louis Armstrong International Airport; Armstrong County, Texas; Joe Armstrong; Ian Armstrong; Armstrong Tunnel; Armstrong Automobile
      Man: Human; Bill Man; Bob Man; David Man; Homer Man; Louise Man; Man (album); Half Man; Man ärgere Dich nicht; Man Computer; Peter van Man; Daniel Man
      Moon: Moon; The Moon (Opera); Moon Nickel Company; Brunner Moon; Alfred Moon; Bernard Moon; Chava Moon; Peter Moon; Henry Moon; Julian Moon; Ludwig Moon; Robert Moon; Violet Moon; Moon Technologies
    • Named Entity Recognition
      (1) Co-occurrence Analysis
      (2) Semantic Analysis
      (3) Machine Learning
      Example combination for Armstrong / man / moon: Armstrong, Florida? Man (album)? Moon Technologies?
      ‣ For all possible combinations do:
        ‣ Determine the probability of the co-occurrence of a term combination in an arbitrary text document corpus, e.g. Wikipedia
      ‣ Select the entity combination with the maximum probability of co-occurrence
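The co-occurrence procedure above can be sketched as a brute-force search over all candidate combinations. Everything concrete here is an assumption for illustration: the corpus is a tiny made-up list of documents that are already annotated with entity names, and the co-occurrence probability is estimated as the fraction of documents that mention all entities of a combination.

```python
from itertools import product


def cooccurrence_probability(entities, corpus):
    """Fraction of corpus documents that mention every entity in `entities`."""
    if not corpus:
        return 0.0
    hits = sum(1 for document in corpus
               if all(entity in document for entity in entities))
    return hits / len(corpus)


def disambiguate(candidates, corpus):
    """Pick, for each term, the candidate from the best-scoring combination.

    candidates -- dict: term found in the text -> list of candidate entities
    corpus     -- list of sets, each holding the entities one document mentions
    """
    terms = list(candidates)
    combinations = product(*(candidates[term] for term in terms))
    best = max(combinations,
               key=lambda combo: cooccurrence_probability(combo, corpus))
    return dict(zip(terms, best))
```

The number of combinations grows multiplicatively with the candidate lists (hundreds of "Armstrong" candidates times the "man" and "moon" candidates), so a practical system would prune candidates before scoring; this sketch ignores that.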
    • Named Entity Recognition
      (1) Co-occurrence Analysis
      (2) Semantic Analysis
      (3) Machine Learning
      Armstrong: George Armstrong Custer; Neil Armstrong; Louis Armstrong; Craig Armstrong; Armstrong, Florida; Armstrong, Ontario; Armstrong (moon crater); Armstrong Gun; Sir Thomas Armstrong
      Man: Human; Bob Man; David Man; Homer Man; Louise Man; Man (album); Half Man; Dead Man Walking; Man Machine
      Moon: moon (planet); The Moon (Opera); Moon Nickel Company; Brunner Moon; Alfred Moon; Bernard Moon; Chava Moon; Peter Moon; Henry Moon; Julian Moon; Ludwig Moon
    • How to use semantic data in Retrieval?
      Image: The Tower of Babel, Pieter Brueghel, 1563
    • Semantic metadata enable an improvement of traditional keyword-based retrieval by
      (1) Query String Refinement: enables more precise or more complete search results
      (2) Cross Referencing: complements search results with additional associated or similar information
      (3) Exploratory Search: enables visualization and navigation of the search space
      (4) Reasoning: complements search results with implicitly given information
    • 4.3 Semantic Search
         4.3.1 Information Retrieval
         4.3.3 Semantic Analysis and Retrieval
         4.3.4 Exploratory Search
    • Searching is not always just searching
    • I'm looking for the book "Brave New World" by Aldous Huxley in the first German edition...
      [Image: library catalogue card: Aldous HUXLEY, Brave New World (Hamburg usw., Albatros Verlag, 1933), 257 S., The Albatros Continental Library, 47]
    • I really liked "Brave New World" by Aldous Huxley, but how should I find what to read next...?
    • Exploratory Search
      • What if the user does not know which query string to use?
      • What if the user is looking for complex answers?
      • What if the user does not know the domain he/she is looking for?
      • What if the user wants to know all(!) about a specific topic?
      • ...'Browsing' instead of 'Searching'
      • ...to find something by chance, i.e. Serendipity
      • ...to get an overview
      • ...enable content-based navigation
    • Enable Exploratory Search based on Linked Open Data: http://dbpedia.org/page/Brave_New_World Gather knowledge about dbpedia:Brave_New_World and decide which interesting fact to follow...
    • [Graph: dbpedia:Brave_New_World dbpedia-owl:author dbpedia:Aldous_Huxley; the slide shows further dbpedia-owl:author edges whose endpoints are not labelled]
    • [Graph: dbpedia:Brave_New_World dbpedia-owl:author dbpedia:Aldous_Huxley; dbpedia:ontology/influences edges connect dbpedia:Aldous_Huxley with dbpedia:H._G._Wells, dbpedia:George_Orwell, and dbpedia:Michel_Houellebecq]
    • [Graph: dbpedia:H._G._Wells dbpedia-owl:notableWork dbpedia:The_Time_Machine; dbpedia:George_Orwell dbpedia-owl:notableWork dbpedia:Nineteen_Eighty-Four; dbpedia:Michel_Houellebecq dbpedia-owl:notableWork dbpedia:Les_Particules_élémentaires]
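The walk shown on the preceding graph slides (book, then author, then related authors, then their notable works) can be sketched over an in-memory triple list. This is only an illustration under stated assumptions: the triples are transcribed by hand, the direction of the dbpedia:ontology/influences edges is assumed because the slides do not make it recoverable, and `neighbours` and `suggest_next_reads` are hypothetical helper names.

```python
# Triples transcribed from the graph slides. The direction of the
# dbpedia:ontology/influences edges is an assumption, so the traversal
# below deliberately ignores edge direction.
TRIPLES = [
    ("dbpedia:Brave_New_World", "dbpedia-owl:author", "dbpedia:Aldous_Huxley"),
    ("dbpedia:H._G._Wells", "dbpedia:ontology/influences", "dbpedia:Aldous_Huxley"),
    ("dbpedia:George_Orwell", "dbpedia:ontology/influences", "dbpedia:Aldous_Huxley"),
    ("dbpedia:Aldous_Huxley", "dbpedia:ontology/influences", "dbpedia:Michel_Houellebecq"),
    ("dbpedia:H._G._Wells", "dbpedia-owl:notableWork", "dbpedia:The_Time_Machine"),
    ("dbpedia:George_Orwell", "dbpedia-owl:notableWork", "dbpedia:Nineteen_Eighty-Four"),
    ("dbpedia:Michel_Houellebecq", "dbpedia-owl:notableWork", "dbpedia:Les_Particules_élémentaires"),
]


def neighbours(node, predicate):
    """Return nodes linked to `node` by `predicate`, in either direction."""
    linked = []
    for subject, pred, obj in TRIPLES:
        if pred != predicate:
            continue
        if subject == node:
            linked.append(obj)
        elif obj == node:
            linked.append(subject)
    return linked


def suggest_next_reads(book):
    """Follow author -> influences -> notableWork starting from `book`."""
    suggestions = []
    for author in neighbours(book, "dbpedia-owl:author"):
        for related in neighbours(author, "dbpedia:ontology/influences"):
            for work in neighbours(related, "dbpedia-owl:notableWork"):
                if work != book and work not in suggestions:
                    suggestions.append(work)
    return suggestions


if __name__ == "__main__":
    for work in suggest_next_reads("dbpedia:Brave_New_World"):
        print(work)
```

Against a live Linked Open Data source the same three hops would typically be expressed as a single query; the list-based version only makes the navigation steps explicit.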
    • ... and now please surprise me ... SERENDIPITY [Graph: dbpedia:Aldous_Huxley, dbpedia:Patrick_Stewart, and dbpedia:Tim_Berners-Lee are each rdf:type Yago:EnglishExpatriatesInTheUnitedStates; a dbpedia-owl:author edge leads to dbpedia:Aldous_Huxley; dbpedia:Star_Trek:_The_Next_Generation dbpedia-owl:starring dbpedia:Patrick_Stewart; dbpedia:World_Wide_Web dbpprop:inventor dbpedia:Tim_Berners-Lee]
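The serendipity step on this slide, jumping to unrelated resources through a shared rdf:type, can be sketched as follows. The triples are transcribed by hand from the slide, the edge directions for dbpedia-owl:starring and dbpprop:inventor are assumptions, and `serendipitous_hops` is a hypothetical helper name.

```python
# Triples transcribed from the slide; edge directions for
# dbpedia-owl:starring and dbpprop:inventor are assumed.
TYPE_TRIPLES = [
    ("dbpedia:Aldous_Huxley", "rdf:type", "Yago:EnglishExpatriatesInTheUnitedStates"),
    ("dbpedia:Patrick_Stewart", "rdf:type", "Yago:EnglishExpatriatesInTheUnitedStates"),
    ("dbpedia:Tim_Berners-Lee", "rdf:type", "Yago:EnglishExpatriatesInTheUnitedStates"),
    ("dbpedia:Star_Trek:_The_Next_Generation", "dbpedia-owl:starring", "dbpedia:Patrick_Stewart"),
    ("dbpedia:World_Wide_Web", "dbpprop:inventor", "dbpedia:Tim_Berners-Lee"),
]


def serendipitous_hops(start):
    """Map each resource sharing an rdf:type with `start` to resources pointing at it."""
    classes = {o for s, p, o in TYPE_TRIPLES if s == start and p == "rdf:type"}
    siblings = sorted(
        {s for s, p, o in TYPE_TRIPLES if p == "rdf:type" and o in classes and s != start}
    )
    hops = {}
    for sibling in siblings:
        # Collect everything that links to the sibling via a non-type edge.
        hops[sibling] = sorted(
            s for s, p, o in TYPE_TRIPLES if o == sibling and p != "rdf:type"
        )
    return hops


if __name__ == "__main__":
    for person, things in serendipitous_hops("dbpedia:Aldous_Huxley").items():
        print(person, "->", things)
```

Starting from dbpedia:Aldous_Huxley, the shared class leads to two people and, one hop further, to a television series and the World Wide Web, which is the kind of unplanned find the slide calls serendipity.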
    • Exploratory Search [Graph: dbpedia:Michael_Collins, dbpedia:Neil_Armstrong, and dbpedia:Buzz_Collins each dbpedia-owl:mission dbpedia:Apollo_11; dbpedia:Apollo_11 and dbpedia:Apollo_13 each dcterms:subject category:Apollo_program; dbpedia:Apollo_13 and dbpedia:Space_Shuttle_Challenger each rdf:type yago:Space_accidents_and_incidents]
    • Exploratory Search and Serendipity • Find something that you were not deliberately looking for ... (resources shown: dbpedia:Buzz_Collins, dbpedia:Cookie_Monster, dbpedia:Strictly_Come_Dancing)
    • Exploratory Search with yovisto http://mediaglobe.yovisto.com:8080/ Waitelonis, Sack: Augmenting Video Search with Linked Open Data, in Proc. I-Semantics, Graz 2009.
    • http://mediaglobe.yovisto.com:8080/mggui/#start
    • 4.3 Semantic Search 4.3.1 Information Retrieval 4.3.3 Semantic Analysis and Retrieval 4.3.4 Exploratory Search
    • Semantic Web Technologies Content 4. Applications in the Web of Data 4.1. Ontological Engineering 4.2. Linked Data Engineering 4.3. Semantic Search
    • 4. Semantic Web Applications 4.2 Linked Data Engineering 4.3 Semantic Search Literature • T. Heath, Ch. Bizer: Linked Data - Evolving the Web into a Global Data Space, Morgan & Claypool, 2011.
    • 4. Semantic Web Applications 4.2 Linked Data Engineering 4.3 Semantic Search • Blog http://semweb2013.blogspot.com/ • Website http://www.hpi.uni-potsdam.de/studium/lehrangebot/itse/veranstaltung/semantic_web_technologien-3.html • bibsonomy - Bookmarks http://www.bibsonomy.org/user/lysander07/swt1213_13