Things, not Strings

Bernhard Haslhofer, Researcher at University of Vienna
ADV Conference (ADV Tagung): Search Strategies for Today and Tomorrow
November 12, 2014
Dr. Bernhard Haslhofer
Data Scientist
AIT - Austrian Institute of Technology
bernhard.haslhofer@ait.ac.at
Things, not Strings
http://googleblog.blogspot.co.at/2012/05/introducing-knowledge-graph-things-not.html
Knowledge Graph?
Benefits
Finding the right “things”
Summaries
Relationships
“People also search for”
How it works
Information Retrieval Basics
(Web) content → Analysis → Representation (index)
Search term “David Alaba” → Analysis → Representation
Index + query representation → Retrieval function → Results
Inverted index
Dictionary   Postings
alaba        d1 d2 d3
austria      d1 d4 d5
david        d1 d6 d7
rapid        d4
wien         d1 d2
stadion      d4 d5 d7
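The dictionary/postings structure can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the document texts in `docs` are hypothetical and were chosen only so that the resulting postings match the table on the slide, and `analyze`, `build_index`, and `search` are names made up for this sketch.

```python
from collections import defaultdict

def analyze(text):
    """Analysis step: lowercase and split on whitespace (a stand-in for real analysis)."""
    return text.lower().split()

def build_index(docs):
    """Map each dictionary term to its postings (sorted list of document ids)."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def search(index, query):
    """Retrieval function: ids of documents that contain every query term (boolean AND)."""
    term_postings = [set(index.get(term, ())) for term in analyze(query)]
    if not term_postings:
        return []
    return sorted(set.intersection(*term_postings))

# Hypothetical document contents, assumed only to reproduce the slide's postings.
docs = {
    "d1": "David Alaba Austria Wien",
    "d2": "Alaba Wien",
    "d3": "Alaba",
    "d4": "Austria Rapid Stadion",
    "d5": "Austria Stadion",
    "d6": "David",
    "d7": "David Stadion",
}

index = build_index(docs)
result = search(index, "David Alaba")
```

The query passes through the same `analyze` step as the documents, so “David Alaba” becomes the terms `david` and `alaba`, and the result is the intersection of their postings.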
Semantic index
Dictionary and postings as in the inverted index above, now shown next to the Knowledge Graph
Strings: the dictionary terms (alaba, austria, david, rapid, wien, stadion) and their postings
Things: the Knowledge Graph
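One way to picture the difference between the two sides is the following sketch. Everything in it is an assumption for illustration: the `example.org` URIs, the labels, and the assignment of documents to things are hypothetical and do not come from the slides. Only the postings for the term `austria` are taken from the index above.

```python
# Strings: postings for a term, as in the inverted index.
STRING_INDEX = {
    "austria": ["d1", "d4", "d5"],
}

# Things: hypothetical entries, each with a type and the strings that may refer to it.
THINGS = {
    "http://example.org/thing/austria": {
        "type": "Country",
        "labels": {"austria"},
    },
    "http://example.org/thing/fk-austria-wien": {
        "type": "SportsTeam",
        "labels": {"austria", "austria wien"},
    },
}

# Hypothetical outcome of linking documents to things.
THING_POSTINGS = {
    "http://example.org/thing/austria": ["d1"],
    "http://example.org/thing/fk-austria-wien": ["d4", "d5"],
}

def string_search(term):
    """Plain lookup: every document containing the string, whatever it means."""
    return STRING_INDEX.get(term.lower(), [])

def semantic_search(query):
    """Group the matching documents by the thing the query string may refer to."""
    q = query.lower().strip()
    candidates = sorted(uri for uri, thing in THINGS.items() if q in thing["labels"])
    return {uri: THING_POSTINGS.get(uri, []) for uri in candidates}

by_string = string_search("Austria")
by_thing = semantic_search("Austria")
```

The string lookup returns d1, d4, and d5 as one undifferentiated list, while the lookup by thing separates the documents about the country from those about the football club.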
Knowledge Graph Construction
Properties
• Things are uniquely identifiable (URIs)
• Things have
  • a type (“Person”, “Place”, “Event”, …)
  • properties (“Name”, “Lat/Lng”, “Date”, …)
  • relationships to other relevant (!!!) things
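These properties map naturally onto a small data structure. The sketch below is one possible representation, not a description of any particular Knowledge Graph: the `Thing` class, the `memberOf` predicate, and the `example.org` URI are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Thing:
    uri: str                                         # unique identifier
    type: str                                        # e.g. "Person", "Place", "Event"
    properties: dict = field(default_factory=dict)   # e.g. {"name": ...}
    relations: list = field(default_factory=list)    # (predicate, target URI) pairs

    def relate(self, predicate, other):
        """Record a relationship from this thing to another thing."""
        self.relations.append((predicate, other.uri))

alaba = Thing(
    "http://dbpedia.org/resource/David_Alaba",
    "Person",
    {"name": "David Alaba"},
)
# Hypothetical URI for the team, used only in this sketch.
team = Thing(
    "http://example.org/thing/fc-bayern-muenchen",
    "SportsTeam",
    {"name": "FC Bayern München"},
)
alaba.relate("memberOf", team)
```

Relationships store the target's URI rather than the object itself, so a thing can point to entries that live in another data source.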
Aggregation of (open) data
Extraction of things
<div itemscope itemtype="http://schema.org/SportsTeam">
  <span itemprop="name">FC Bayern München</span>
  <div itemprop="member" itemscope
       itemtype="http://schema.org/OrganizationRole">
    <div itemprop="member" itemscope
         itemtype="http://schema.org/Person">
      <span itemprop="name">David Alaba</span>
    </div>
    <span itemprop="startDate">2010</span>
    <span itemprop="namedPosition">Linker Verteidiger</span>
  </div>
</div>
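A rough sketch of how such markup could be turned into things, using only Python's standard `html.parser`. It is a simplification under stated assumptions: the `MicrodataExtractor` class is made up for this example, it only reads `span` elements, and it expects well-formed markup without void elements. The `HTML` string is a well-formed copy of the snippet above.

```python
from html.parser import HTMLParser

HTML = """
<div itemscope itemtype="http://schema.org/SportsTeam">
  <span itemprop="name">FC Bayern München</span>
  <div itemprop="member" itemscope itemtype="http://schema.org/OrganizationRole">
    <div itemprop="member" itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">David Alaba</span>
    </div>
    <span itemprop="startDate">2010</span>
    <span itemprop="namedPosition">Linker Verteidiger</span>
  </div>
</div>
"""

class MicrodataExtractor(HTMLParser):
    """Collect (itemtype of enclosing scope, itemprop, text) for span elements."""

    def __init__(self):
        super().__init__()
        self.stack = []           # one entry per open element: its itemtype or None
        self.current_prop = None  # itemprop of the span currently being read
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.stack.append(attrs.get("itemtype") if "itemscope" in attrs else None)
        if tag == "span" and "itemprop" in attrs:
            self.current_prop = attrs["itemprop"]

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_prop = None
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if self.current_prop and text:
            # Nearest enclosing element that declared an itemscope with a type.
            scope = next((t for t in reversed(self.stack) if t), None)
            self.triples.append((scope, self.current_prop, text))

extractor = MicrodataExtractor()
extractor.feed(HTML)
triples = extractor.triples
```

Each triple pairs a property value with the schema.org type of the item it belongs to, which is the information needed to create or update a thing.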
Interactive input
Knowledge Graph Linking
(Figure: documents d2 and d6 linked to the Knowledge Graph)
Steps / Problems
• Named Entity Detection: “…Euro qualifier against Russia: how Marcel Koller is dealing with the loss of David Alaba…”
• Named Entity Disambiguation: “…Exciting derby lets Austria breathe a sigh of relief…”
  (Austria = football club or country?)
• Named Entity Linkage/Resolution:
  • David Alaba = http://dbpedia.org/resource/David_Alaba
  • Austria = http://www.freebase.com/m/03mp37
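The three steps can be sketched with a toy dictionary-based approach. This is a deliberately naive illustration, not how the tools on the next slide work: the `GAZETTEER`, its cue words, and the `example.org` URIs are hypothetical. Only the DBpedia URI for David Alaba comes from the slide.

```python
# Hypothetical gazetteer: surface form -> candidate things with context cue words.
GAZETTEER = {
    "david alaba": [
        {"uri": "http://dbpedia.org/resource/David_Alaba", "cues": set()},
    ],
    "austria": [
        {"uri": "http://example.org/thing/austria-football-club",
         "cues": {"derby", "match", "club"}},
        {"uri": "http://example.org/thing/austria-country",
         "cues": {"country", "government", "capital"}},
    ],
}

def detect(text):
    """Detection: known surface forms that occur as substrings of the text."""
    lowered = text.lower()
    return [name for name in GAZETTEER if name in lowered]

def disambiguate(name, text):
    """Disambiguation: candidate whose cue words overlap most with the text."""
    words = set(text.lower().split())
    return max(GAZETTEER[name], key=lambda c: len(c["cues"] & words))

def link(text):
    """Linkage: map each detected mention to the URI of the chosen candidate."""
    return {name: disambiguate(name, text)["uri"] for name in detect(text)}

derby = link("Exciting derby lets Austria breathe a sigh of relief")
qualifier = link("Marcel Koller is dealing with the loss of David Alaba")
```

In the first sentence the word “derby” tips the choice towards the football club. Without any cue word the function falls back to the first candidate, which shows why disambiguation is listed as a problem and not only as a step.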
Tools
• AlchemyAPI (http://www.alchemyapi.com/):
  • identifies a wide range of entity types (people, places, events, etc.) in documents
  • supports DBpedia, Freebase
• DBpedia Spotlight (https://github.com/dbpedia-spotlight):
  • annotates DBpedia entities in documents
• …
Conclusion
• Current and future search strategies are based on full-text search + Knowledge Graph
  • Google Knowledge Graph
  • Microsoft Bing Satori Knowledge Base
  • …
• Identifying, extracting, and linking “things” is becoming increasingly important
• The availability of open, structured data is essential for building Knowledge Graphs
Outlook
• Knowledge Base/Graph
  • is a prerequisite for question-answering systems (e.g., IBM Watson)
  • forms the basis for natural-language search
  • enables anticipation of future search queries
“OK Bernhard…”
http://bernhardhaslhofer.info
http://slideshare.net/bhaslhofer
bernhard.haslhofer@ait.ac.at
@bhaslhofer