From text to entities: Information Extraction in the Era of Knowledge Graphs

GraphRM

Meetup of 23/07/2018. In recent years there has been a proliferation of free and commercial "knowledge graphs" (KGs), which represent real-world entities together with their semantic relationships in graph form. They are becoming a powerful asset both for tech giants, with Google Knowledge Graph, IBM’s Watson QA system and Facebook’s Open Graph, and for startups developing AI products such as semantic search, data analytics and recommender systems. While KGs provide structured access to a large amount of knowledge, the vast majority of the information available on the Web is still inaccessible because it is encoded only as natural-language text. The talk presents an overview of publicly available KGs and the main techniques used to bridge unstructured text with them, enabling a wide variety of knowledge-based applications. Speaker: Matteo Cannaviccio

From Information to Knowledge:
an overview of
publicly available Knowledge Graphs
Matteo Cannaviccio
mcannaviccio@gmail.com
GraphRM
LUISS Enlabs
23/07/2018
Matteo Cannaviccio
Ph.D. (2018) @ Roma Tre University
supervisor: Paolo Merialdo
Knowledge Graphs Augmentation
Natural Language Understanding
Who is the wife of James Bond?
Who is James Bond in Dr. No and Goldfinger?

From text to entities: Information Extraction in the Era of Knowledge Graphs

  • 1. From Information to Knowledge: an overview of publicly available Knowledge Graphs. Matteo Cannaviccio, mcannaviccio@gmail.com. GraphRM, LUISS Enlabs, 23/07/2018
  • 2. Matteo Cannaviccio Ph.D. (2018) @ Roma Tre University supervisor: Paolo Merialdo Knowledge Graphs Augmentation Natural Language Understanding
  • 3. Who is the wife of James Bond?
  • 5. Who is James Bond in Dr. No and Goldfinger?
  • 7. Who was the president when Sean Connery was born?
  • 9. Knowledge Graph: a representation of the real world
  • 10. Knowledge Graph: a more detailed representation (Diagram: entity nodes mid13, mid14, mid17, mid23 and mid26 are connected by <type> edges to the types City, Person and Movie; by <label> edges to the literals “London”, “Sean Connery”, “Σον Κόνερι”, “Big Tam”, “Terence Young”, “Goldfinger”, “Dr. No”, “Agente 007 - Licenza di uccidere” and “Agente 007 - Missione Goldfinger”; and by a <birthDate> edge to the value 25-08-1930. Legend: type, entity, relation, literal/value.)
  • 11. KGs enable many interesting applications ● Structured Search & Exploration ○ e.g. Google Knowledge Graph, Amazon Product Graph ● Graph Mining & Network Analysis ○ e.g. Facebook Entity Graph ● Big Data Integration ○ e.g. IBM Watson ● Many other AI applications
  • 13. Entity Exploration. ~40% of all web queries are entity queries [Pound et al., WWW 2010]. Knowing more about the entity of interest: ● finding entities related to the entity of interest ● properties of entities ● going beyond the immediate neighborhood of the entity ● Knowledge carousel
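To make "going beyond the immediate neighborhood" concrete, here is a minimal sketch of a k-hop exploration over a toy graph. The edges, the relation names `starring` and `director`, and the function `neighborhood` are illustrative assumptions, not taken from any real KG API.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Illustrative edges only; relation names are assumptions for this sketch.
EDGES: List[Triple] = [
    ("Dr._No", "starring", "Sean_Connery"),
    ("Dr._No", "director", "Terence_Young"),
    ("Goldfinger", "starring", "Sean_Connery"),
]


def neighborhood(edges: List[Triple], start: str, max_hops: int) -> Dict[str, int]:
    """Return every entity within max_hops of start, with its hop distance.

    Edges are treated as undirected, which suits exploration of related entities.
    """
    adjacency: Dict[str, Set[str]] = {}
    for subject, _relation, obj in edges:
        adjacency.setdefault(subject, set()).add(obj)
        adjacency.setdefault(obj, set()).add(subject)

    distance: Dict[str, int] = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    del distance[start]  # the start entity is not its own neighbor
    return distance
```

With one hop, `Sean_Connery` reaches only the two movies; with two hops the director of `Dr._No` also appears, which is the kind of result an entity carousel could surface.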
  • 16. Insights from Amazon Product Graph: Product Graph + Knowledge Graph (credits to Xin Luna Dong)
  • 17. Insights from Amazon Product Graph (Diagram: the Knowledge Graph of slide 10, with entities mid13, mid14, mid17, mid23 and mid26, the types City and Person, <label> literals such as “London”, “Sean Connery”, “Σον Κόνερι”, “Big Tam”, “Terence Young”, “Goldfinger” and “Dr. No”, and the <birthDate> 25-08-1930, is linked through <product> edges to product nodes mid70, mid72, mid73 and mid76. Products carry <ASIN> values B0035QUXW, F0067XILC7 and F0067XILG8 and the <type> values Digital Movie, DVD and Blu-ray. Credits to Xin Luna Dong.)
  • 18. Knowledge Graphs in Personal Assistants
  • 19. Integrating user information with KG Integrates information about the user with entities of the Knowledge Graph:
  • 20. Integrating users information with KG (Facebook Entity Graph)
  • 21. Network Analysis: Cerved Knowledge Graph
  • 22. KGs + Unstructured Data: IBM Watson. On 14-15 February 2011, IBM Watson won the first-place prize of $1 million on the Jeopardy! quiz show. It answered 66 questions correctly and 9 incorrectly, using: ● Knowledge Graphs: YAGO, DBpedia, Freebase ● 200M pages of unstructured content
  • 23. KGs enable many other AI applications ● Extract features for general learning tasks ● Scale with large (and noisy) training sets ○ e.g. distant supervision (Figure labels: AI system performance; Hardware; Software; Structured Data)
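Distant supervision can be sketched in a few lines: any sentence that mentions both entities of a known KG fact is labeled with that fact's relation. This is a minimal illustration under stated assumptions. The sentences, the `KG_FACTS` dictionary and the function `label_sentences` are hypothetical, and plain substring matching stands in for real entity linking.

```python
from typing import Dict, List, Tuple

# One known fact, keyed by (subject, object); the sentences below are made up.
KG_FACTS: Dict[Tuple[str, str], str] = {
    ("Steve Jobs", "San Francisco"): "birthPlace",
}

SENTENCES: List[str] = [
    "Steve Jobs was born in San Francisco.",
    "Steve Jobs gave a talk in San Francisco.",  # matches, but is a noisy label
    "Reed College is located in Portland.",      # no known entity pair
]

Example = Tuple[str, str, str, str]


def label_sentences(sentences: List[str],
                    facts: Dict[Tuple[str, str], str]) -> List[Example]:
    """Label each sentence containing both entities of a fact with its relation."""
    examples: List[Example] = []
    for sentence in sentences:
        for (subject, obj), relation in facts.items():
            if subject in sentence and obj in sentence:
                examples.append((sentence, subject, obj, relation))
    return examples
```

The second sentence is labeled `birthPlace` even though it does not express that relation, which is why the resulting training set is large but noisy.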
  • 25. Nodes: Entities. Entities/Objects/Instances ● represent real-world objects, concrete or abstract: <Sebastián_Piñera>, <Apocalypse_Now>, <Croatia_national_football_team>, <Samsung_S10>, <Detroit>, <Abbey_Road>, ...
  • 26. Nodes: Types ● Used to group entities based on shared characteristics ● Can be organized in a hierarchy
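A type hierarchy lets a KG infer that an entity belongs to every ancestor of its direct type. The sketch below walks a toy `subClassOf` chain. The hierarchy, the type assignments and the function `all_types` are assumptions made for illustration, and each type is limited to a single parent for brevity.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy: child type -> parent type (<subClassOf>).
SUBCLASS_OF: Dict[str, str] = {
    "Actor": "Person",
    "Person": "Agent",
    "City": "Place",
}

# Hypothetical direct type assignments: entity -> type (<type>).
DIRECT_TYPE: Dict[str, str] = {
    "Sean_Connery": "Actor",
    "London": "City",
}


def all_types(entity: str) -> Set[str]:
    """Return the direct type of an entity plus every ancestor type."""
    types: Set[str] = set()
    current = DIRECT_TYPE.get(entity)
    # Stop at the root, or if a cycle would revisit a type already seen.
    while current is not None and current not in types:
        types.add(current)
        current = SUBCLASS_OF.get(current)
    return types
```

Here `all_types("Sean_Connery")` yields `Actor`, `Person` and `Agent`, so a query for all persons would also return actors.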
  • 27. Nodes: Values/Literals They can be seen as nodes containing primitive values, without a proper identifier ● date “1955-02-24 (xsd:date)”, “1861-03-17 (xsd:date)”, ... ● string “New York”, “New York City”, “NYC”, “Big Apple”, ... ● numeric “301338.0 (xsd:double)”, “27 (xsd:integer)”, ...
  • 28. Edge labels: Relations/Predicates A possible association: ● between entities <spouse>, <birthPlace>, <ceo>, <country>, ... ● between entities and types <type> ● between types <subClassOf> ● between entities and values/literals <foundingDate>, <areaTotal>, <birthDate>, <label>, ...
  • 29. Edges: Facts. A fact is an instance of a relation, described by a triple (s, r, o), where s is the subject (an entity), r is the relation, and o is the object (an entity, a type, or a literal value). For example:
      <Steve_Jobs> <birthPlace> <San_Francisco>
      <San_Francisco> <type> <City>
      <Italy> <populationTotal> “69674003”
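The (s, r, o) model maps directly onto a list of tuples, and the basic query is pattern matching with wildcards. The following is a minimal sketch, not how any particular triple store is implemented. The `type` fact is included for illustration and the function name `match` is an assumption.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Facts in the spirit of the slide; literals keep their quotes to tell them
# apart from entity and type identifiers.
FACTS: List[Triple] = [
    ("Steve_Jobs", "birthPlace", "San_Francisco"),
    ("San_Francisco", "type", "City"),
    ("Italy", "populationTotal", '"69674003"'),
]


def match(facts: List[Triple],
          s: Optional[str] = None,
          r: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the facts matching a pattern; None acts as a wildcard."""
    return [
        fact for fact in facts
        if (s is None or fact[0] == s)
        and (r is None or fact[1] == r)
        and (o is None or fact[2] == o)
    ]
```

For instance, `match(FACTS, r="birthPlace")` answers "who was born where?", while `match(FACTS, s="San_Francisco")` lists everything known about one entity.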
  • 30. RDF (Resource Description Framework): a standard model for data interchange on the Web ● framework for describing resources, recommended by W3C ● resources are identified with URIs ● properties are defined in an ontology ● only binary predicates
      <http://rdf.freebase.com/ns/m.0sl2rl8> <http://rdf.freebase.com/ns/music.recording.artist> <http://rdf.freebase.com/ns/m.07c0j>
      <http://rdf.freebase.com/ns/m.0wmkpxr> <http://rdf.freebase.com/ns/music.recording.artist> <http://rdf.freebase.com/ns/m.07c0j>
      <http://rdf.freebase.com/ns/m.0n36pr7> <http://rdf.freebase.com/ns/type.object.type> <http://rdf.freebase.com/ns/award.award_honor>
      <http://dbpedia.org/resource/Siemens> rdf:type <http://dbpedia.org/resource/Company>
      <http://dbpedia.org/resource/Siemens> rdfs:label "Siemens"@de
      <http://dbpedia.org/resource/Siemens> <http://dbpedia.org/ontology/location> <http://dbpedia.org/resource/Munich>
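When triples like these are serialized with full URIs and a terminating period, as in the N-Triples format, each line can be split into subject, predicate and object. The sketch below handles only a simplified subset (URIs and plain, language-tagged or datatyped literals, no blank nodes or comments). The function name `parse_ntriple` is an assumption, and a production system would use a proper RDF library instead.

```python
import re
from typing import Optional, Tuple

# Object term: a URI in angle brackets, or a quoted literal with an optional
# language tag (@de) or datatype (^^<uri>).
_TERM = r'(<[^>]*>|"(?:[^"\\]|\\.)*"(?:@[A-Za-z-]+|\^\^<[^>]*>)?)'
_LINE = re.compile(r'^\s*(<[^>]*>)\s+(<[^>]*>)\s+' + _TERM + r'\s*\.\s*$')


def parse_ntriple(line: str) -> Optional[Tuple[str, str, str]]:
    """Split one simplified N-Triples line into (subject, predicate, object).

    Returns None when the line does not fit the supported subset.
    """
    found = _LINE.match(line)
    if found is None:
        return None
    subject, predicate, obj = found.groups()
    return subject, predicate, obj


EXAMPLE = (
    "<http://dbpedia.org/resource/Siemens> "
    "<http://dbpedia.org/ontology/location> "
    "<http://dbpedia.org/resource/Munich> ."
)
```

Calling `parse_ntriple(EXAMPLE)` returns the three URIs with their angle brackets intact, so literals and resources remain distinguishable downstream.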
  • 31. KGs & Linked Open Data (http://lod-cloud.net/). Four KGs compared (logos not preserved): ● 47 M entities, 425 M facts, 1600 relations ● 5 M entities, 120 M facts, 100 relations ● 52 M entities, 637 M facts, 4000 relations ● 5 M entities, 500 M facts, 2800 relations
  • 32. How are they built?
  • 34. Wikipedia as a valuable resource (Diagram: four linked pages. id1: wiki/Hans_Zimmer, id2: wiki/Germany, id3: wiki/Frankfurt, id4: wiki/Composer)
  • 35. KGs derived from Wikipedia
  • 36. Creating RDF triples from Wikipedia pages
  • 37. Creating RDF triples from info-boxes! Wiki markup infobox:
      {{Infobox person
      | birth_name = Steven Paul Jobs
      | birth_date = {{Birth date|1955|2|24}}
      | birth_place = [[San Francisco]], California, U.S.
      | death_date = {{Death date and age|2011|10|5|1955|2|24|mf=y}}
      | death_place = [[Palo Alto, California]], U.S.
      | alma_mater = [[Reed College]]
      | occupation = {{plainlist|
      * Co-founder, Chairman, and CEO of [[Apple Inc.]]
      * Primary investor and CEO of [[Pixar]]
      * Founder and CEO of [[NeXT]]
      }}
      | spouse = {{marriage|[[Laurene Powell]]<br>|1991|2011|}}
      }}
  • 38. Creating RDF triples from info-boxes! Wiki markup infobox:
      {{Infobox person
      | birth_name = Steven Paul Jobs
      | birth_date = {{Birth date|1955|2|24}}
      | birth_place = [[San Francisco]], California, U.S.
      | death_date = {{Death date and age|2011|10|5|1955|2|24|mf=y}}
      | death_place = [[Palo Alto, California]], U.S.
      | alma_mater = [[Reed College]]
      | occupation = {{plainlist|
      * Co-founder, Chairman, and CEO of [[Apple Inc.]]
      * Primary investor and CEO of [[Pixar]]
      * Founder and CEO of [[NeXT]]
      }}
      | spouse = {{marriage|[[Laurene Powell]]<br>|1991|2011|}}
      }}
    KG facts:
      <dbr:Steve_Jobs> <rdf:type> <dbo:Person>
      <dbr:Steve_Jobs> <dbo:bornOn> “24-02-1955”
      <dbr:Steve_Jobs> <dbo:bornIn> <dbr:San_Francisco>
      <dbr:Steve_Jobs> <dbo:deadIn> <dbr:Palo_Alto,_California>
      <dbr:Steve_Jobs> <dbo:studyAt> <dbr:Reed_College>
      <dbr:Steve_Jobs> <dbo:occupation> <dbr:Apple_Inc.>
      <dbr:Steve_Jobs> <dbo:occupation> <dbr:Pixar>
      <dbr:Steve_Jobs> <dbo:occupation> <dbr:NeXT>
      <dbr:Steve_Jobs> <dbo:spouse> <dbr:Laurene_Powell>
      <dbr:Steve_Jobs> <rdfs:label> “Steven Paul Jobs”
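The step from infobox markup to KG facts can be approximated with two regular expressions: one to read `| key = value` fields and one to pull the first `[[wiki link]]` out of the value. This is only a sketch of the idea, not the DBpedia extraction framework. The `KEY_TO_PROPERTY` mapping reuses the illustrative property names from the slide, and the function `infobox_to_triples` is hypothetical.

```python
import re
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Assumed mapping from infobox keys to the property names shown on the slide.
KEY_TO_PROPERTY: Dict[str, str] = {
    "birth_place": "dbo:bornIn",
    "death_place": "dbo:deadIn",
    "alma_mater": "dbo:studyAt",
}

_FIELD = re.compile(r"^\|\s*(\w+)\s*=\s*(.*)$")        # "| key = value"
_LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")  # "[[Target]]" or "[[Target|text]]"

MARKUP = """{{Infobox person
| birth_name = Steven Paul Jobs
| birth_place = [[San Francisco]], California, U.S.
| death_place = [[Palo Alto, California]], U.S.
| alma_mater = [[Reed College]]
}}"""


def infobox_to_triples(subject: str, markup: str) -> List[Triple]:
    """Emit one triple per mapped infobox field whose value contains a wiki link."""
    triples: List[Triple] = []
    for line in markup.splitlines():
        field = _FIELD.match(line.strip())
        if field is None:
            continue
        key, value = field.groups()
        prop = KEY_TO_PROPERTY.get(key)
        link = _LINK.search(value)
        if prop is None or link is None:
            continue  # unmapped key, or a value without a linked entity
        target = link.group(1).strip().replace(" ", "_")
        triples.append((subject, prop, f"dbr:{target}"))
    return triples
```

Fields such as `birth_date` or `occupation` use nested templates and lists, which is where a real extractor needs per-template parsing and hand-written mappings rather than two regexes.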
  • 40. DBpedia extractors. Besides infoboxes, the DBpedia extraction framework harvests data from: ● Title ● Abstract ● Geo-coordinates ● Categories ● Images ● Links (other languages, other Wiki pages, Web) ● Redirects ● Disambiguations
  • 41. KGs derived from Wikipedia
      YAGO: ● Combines Wikipedia and WordNet ○ Accuracy estimated at 95% ● Open source extraction framework ○ written in Java ● Data access: ○ Data dumps (RDF, TSV)
      DBpedia: ● Extracts structured components: ○ infoboxes, categories, and more ○ crowd-sourced community effort ● Open source extraction framework ○ written in Scala and Java ● Data access: ○ Web ○ SPARQL Endpoint ○ Data dumps (RDF)
  • 43. Collaborative Approach: Freebase ● Launched in 2007 by Metaweb ● Harvests data from many LOD sources: ○ IMDb, MusicBrainz ● In May 2016: ○ ~52 M entities ○ ~600 M facts ● Acquired by Google in July 2010 ○ core of the Google KG ○ contents being migrated to Wikidata ● Data access: ○ Data dump (RDF)
  • 44. The Beatles in Freebase
  • 45. Collaborative Approach
      Freebase: ● Launched in 2007 by Metaweb ● Harvests data from many LOD sources: ○ IMDb, MusicBrainz ● In May 2016: ○ ~52 M entities ○ ~600 M facts ● Acquired by Google in July 2010 ○ core of the Google KG ○ contents being migrated to Wikidata ● Data access: ○ Data dump (RDF)
      Wikidata: ● Launched in 2012 by Wikimedia ○ “Wikipedia knowledge graph” ● In April 2018: ○ ~47 M entities ○ ~425 M statements ● Data access: ○ Web ○ MediaWiki API ○ Data dump (RDF, JSON, XML) ○ SPARQL Endpoint
  • 46. The Beatles in Wikidata
  • 47. The Beatles in Wikidata
  • 48. Wikidata Statements: Properties + Qualifiers [Q1299] [P740] [Q24826]
  • 49. Wikidata Statements: Properties + Qualifiers [Q1299] [P166] [P585] = “1967” [P1686] = [Q169226] [Q1027891]
  • 50. Wikidata to RDF [Erxleben et al., Int. Semantic Web Conf. 2014] ● Statements get their own objects in the graph ● Some simple statements are also stored directly ● Each Wikidata property becomes many RDF properties ● Complex values get their own objects too (not shown) (Diagram labels: item [Q1299]; statement node [Q1299dfb4478ff]; edges wdt:[P166], ps:[P166], pv:[P166]; qualifiers pq:[P585] = 1967 and pq:[P1686] = [Q169226]; value [Q1027891])
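The reification idea can be sketched as a function that turns one statement with qualifiers into several triples: a direct "truthy" triple plus a statement node carrying the value and the qualifiers. The prefixes `wd:`, `wdt:`, `p:`, `ps:`, `pq:` and `wds:` follow the conventions of the Wikidata RDF export. The statement identifier, the `Statement` class and the function `reify` are hypothetical, and reading [Q1027891] as the main value of the P166 statement is an assumption about the slide's diagram.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Statement:
    subject: str                      # item id, e.g. "Q1299"
    prop: str                         # property id, e.g. "P166"
    value: str                        # item id of the main value
    qualifiers: Dict[str, str] = field(default_factory=dict)


def reify(stmt: Statement, stmt_id: str) -> List[Triple]:
    """Expand one qualified statement into direct and reified RDF triples."""
    node = f"wds:{stmt_id}"           # the statement gets its own node
    triples: List[Triple] = [
        # Simple form, stored directly for easy querying.
        (f"wd:{stmt.subject}", f"wdt:{stmt.prop}", f"wd:{stmt.value}"),
        # Reified form: item -> statement node -> value.
        (f"wd:{stmt.subject}", f"p:{stmt.prop}", node),
        (node, f"ps:{stmt.prop}", f"wd:{stmt.value}"),
    ]
    # Qualifiers attach to the statement node, not to the item.
    for q_prop, q_value in stmt.qualifiers.items():
        triples.append((node, f"pq:{q_prop}", q_value))
    return triples


AWARD = Statement(
    subject="Q1299",
    prop="P166",
    value="Q1027891",
    qualifiers={"P585": '"1967"', "P1686": "wd:Q169226"},
)
```

Calling `reify(AWARD, "Q1299-example")` produces five triples, which shows how a single Wikidata property fans out into several RDF properties (`wdt:`, `p:`, `ps:`, `pq:`).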
  • 52. “Which 19th century paintings show the moon?”
  • 53. “What are the world’s largest cities with a female mayor?”
  • 54. “Where were people who travelled to space born?” (colour-coded by gender)
  • 55. References
      ● “YAGO: a multilingual knowledge base from Wikipedia, Wordnet, and Geonames”, Thomas Rebele, Fabian M. Suchanek, Johannes Hoffart, Joanna “Asia” Biega, Erdal Kuzey, Gerhard Weikum. International Semantic Web Conference (ISWC), 2016.
      ● “DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia”, Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, Christian Bizer. Semantic Web Journal 6 (2): 167–195, 2015.
      ● “Wikidata: a free collaborative knowledgebase”, Denny Vrandečić, Markus Krötzsch. Communications of the ACM 57 (10): 78–85, 2014.
      ● “Introducing Wikidata to the Linked Data Web”, Fredo Erxleben, Michael Günther, Markus Krötzsch, Julian Mendez, Denny Vrandečić. Proceedings of the 13th International Semantic Web Conference (ISWC), 2014.
      ● “Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO”, M. Färber, F. Bartscherer, C. Menne, A. Rettinger. Semantic Web Journal, pp. 1–53, 2017.
      ● “Reifying RDF: What Works Well With Wikidata?”, D. Hernández, A. Hogan, M. Krötzsch. In Thorsten Liebig & Achille Fokoue (eds.), SSWS@ISWC, CEUR-WS.org, pp. 32–47, 2015.
      ● “From Freebase to Wikidata: The Great Migration”, Thomas Pellissier Tanon, Denny Vrandečić, Sebastian Schaffert, Thomas Steiner, Lydia Pintscher. In Proceedings of the World Wide Web (WWW) Conference, 2016.