SlideShare a Scribd company logo
DBpedia (in) ALIGNED
From DBpedia to DBpedia+
Dimitris Kontokostas
AKSW Group, Leipzig University
DBpedia Association
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2007
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2008
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2009
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2010
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2011
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia @ 2014
February 9th 2015 / 3rd DBpedia Meeting in Dublin
RDF Stats (2014 release)
3B facts (only 580M facts in English)
● DBpedia En: 4.58M Things / 4.22M typed
● 125 Localized versions: 38.3M Things
● 50M links to other datasets
Many more stats @:
dbpedia.org/Datasets2014/DatasetStatistics
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Dev Stats
DBpedia Information Extraction Framework
● Java/Scala based framework
○ Old PHP-based framework
● 5.1K Commits
● 52K lines of code (100K/1M AT)
● 71 total contributors
Many more stats @:
www.openhub.net/p/dbpedia
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Aligning Problem
Lot’s of code & a lot more data
● Wikipedia evolves over time
○ Infobox Templates change, merge, deleted
○ New formatting templates
○ Structural differences per language edition
● Code should adapt to all the changes
○ hard at this (data) scale
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Unit-testing to the rescue?
● Software & Data testing
● Straightforward for software (since 70’s)
● Preliminary for (RDF) data
○ RDFUnit, SPIN, OWL, PelletICV, ShEx,...
■ W3C Data Shapes WG
Data testing++
● Generation: manual, (Semi)automatic, ...
● Linking: data & software tests
February 9th 2015 / 3rd DBpedia Meeting in Dublin
RDFUnit
http://rdfunit.aksw.org
February 9th 2015 / 3rd DBpedia Meeting in Dublin
UT feedback loop
Data verification and feedback at different
data extraction stages
● Three main points of failure in DBpedia:
○ Code
○ Infobox mappings
○ Wikipedia (!!!)
February 9th 2015 / 3rd DBpedia Meeting in Dublin
DBpedia+
Workflow
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Additional feedback
We are looking into:
● Reporting
● Statistics
● Inter-Wikipedia cross-checking
● ML techniques
February 9th 2015 / 3rd DBpedia Meeting in Dublin
Thank you & Questions?
ALIGNED
Aligned, Quality-centric Software and Data
Engineering

More Related Content

What's hot

Managing and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WDManaging and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WD
Fariz Darari
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RES
Christophe Guéret
 
FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018
Fast Reports
 
DBpedia Viewer - LDOW 2014
DBpedia Viewer - LDOW 2014DBpedia Viewer - LDOW 2014
DBpedia Viewer - LDOW 2014
Dimitris Kontokostas
 
Session 03 acquiring data
Session 03 acquiring dataSession 03 acquiring data
Session 03 acquiring data
bodaceacat
 
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Sergio Fernández
 
Scalable Web Data Management using RDF
Scalable Web Data Management using RDF  Scalable Web Data Management using RDF
Scalable Web Data Management using RDF
Navid Sedighpour
 
Brett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experiment
WARCnet
 
Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1GraylinKim
 
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of DatadipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
eXascale Infolab
 
2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK
Magnus Manske
 
RDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used TermsRDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used Terms
DigitalLibraryServices
 
Eighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status UpdateEighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status Update
openCypher
 
Data quality in Real Estate
Data quality in Real EstateData quality in Real Estate
Data quality in Real Estate
Dimitris Kontokostas
 
Steam Learn: An introduction to Redis
Steam Learn: An introduction to RedisSteam Learn: An introduction to Redis
Steam Learn: An introduction to Redis
inovia
 
Dirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz ProjectDirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz Project
mbruemmer
 
Wikidata
WikidataWikidata
Wikidata
Anja Jentzsch
 

What's hot (19)

Managing and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WDManaging and Consuming Completeness Information for Wikidata Using COOL-WD
Managing and Consuming Completeness Information for Wikidata Using COOL-WD
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RES
 
FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018FastReport VCL6 Nuremberg 2018
FastReport VCL6 Nuremberg 2018
 
DBpedia Viewer - LDOW 2014
DBpedia Viewer - LDOW 2014DBpedia Viewer - LDOW 2014
DBpedia Viewer - LDOW 2014
 
Wikidata & dbpedia
Wikidata & dbpediaWikidata & dbpedia
Wikidata & dbpedia
 
Session 03 acquiring data
Session 03 acquiring dataSession 03 acquiring data
Session 03 acquiring data
 
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016Geospatial Querying in Apache Marmotta -  Apache Big Data North America 2016
Geospatial Querying in Apache Marmotta - Apache Big Data North America 2016
 
Scalable Web Data Management using RDF
Scalable Web Data Management using RDF  Scalable Web Data Management using RDF
Scalable Web Data Management using RDF
 
Brett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4j
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experiment
 
Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1Open Legislation Spring 2011 Talk 1
Open Legislation Spring 2011 Talk 1
 
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of DatadipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
dipLODocus[RDF]: Short and Long-Tail RDF Analytics for Massive Webs of Data
 
2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK
 
RDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used TermsRDM Jargon Busting Session: Demystifying Commonly Used Terms
RDM Jargon Busting Session: Demystifying Commonly Used Terms
 
Eighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status UpdateEighth openCypher Implementers Group Meeting: Status Update
Eighth openCypher Implementers Group Meeting: Status Update
 
Data quality in Real Estate
Data quality in Real EstateData quality in Real Estate
Data quality in Real Estate
 
Steam Learn: An introduction to Redis
Steam Learn: An introduction to RedisSteam Learn: An introduction to Redis
Steam Learn: An introduction to Redis
 
Dirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz ProjectDirk Goldhahn: Introduction to the German Wortschatz Project
Dirk Goldhahn: Introduction to the German Wortschatz Project
 
Wikidata
WikidataWikidata
Wikidata
 

Viewers also liked

DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)
Dimitris Kontokostas
 
8th DBpedia meeting / California 2016
8th DBpedia meeting /  California 20168th DBpedia meeting /  California 2016
8th DBpedia meeting / California 2016
Dimitris Kontokostas
 
DBpedia ♥ Commons
DBpedia ♥ CommonsDBpedia ♥ Commons
DBpedia ♥ Commons
Dimitris Kontokostas
 
DBpedia past, present & future
DBpedia past, present & futureDBpedia past, present & future
DBpedia past, present & future
Dimitris Kontokostas
 
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
Dimitris Kontokostas
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
Dimitris Kontokostas
 
NLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsNLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology Constraints
Dimitris Kontokostas
 
Semantically enhanced quality assurance in the jurion business use case
Semantically enhanced quality assurance in the jurion  business use caseSemantically enhanced quality assurance in the jurion  business use case
Semantically enhanced quality assurance in the jurion business use case
Dimitris Kontokostas
 
Assessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset QualityAssessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset Quality
andimou
 
Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015
SemanticCode
 
D bpedia association meeting dublin wkg
D bpedia association meeting dublin wkgD bpedia association meeting dublin wkg
D bpedia association meeting dublin wkg
Wolters Kluwer Germany
 
DBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterDBpedia as Gaeilge Chapter
DBpedia as Gaeilge Chapter
Bianca Pereira
 
Linking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia MeetupLinking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia Meetup
Sujan Perera
 
20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_final20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_final
Gerard Kuys
 
Using DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating EntitiesUsing DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating Entities
Julien PLU
 
20150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v220150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v2
Gerard Kuys
 
Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015
Net7
 
DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016
Sebastian Hellmann
 
DBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloudDBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloud
Fumihiro Kato
 
Enriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpediaEnriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpedia
Antoine Isaac
 

Viewers also liked (20)

DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)
 
8th DBpedia meeting / California 2016
8th DBpedia meeting /  California 20168th DBpedia meeting /  California 2016
8th DBpedia meeting / California 2016
 
DBpedia ♥ Commons
DBpedia ♥ CommonsDBpedia ♥ Commons
DBpedia ♥ Commons
 
DBpedia past, present & future
DBpedia past, present & futureDBpedia past, present & future
DBpedia past, present & future
 
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
 
NLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsNLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology Constraints
 
Semantically enhanced quality assurance in the jurion business use case
Semantically enhanced quality assurance in the jurion  business use caseSemantically enhanced quality assurance in the jurion  business use case
Semantically enhanced quality assurance in the jurion business use case
 
Assessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset QualityAssessing and Refining Mappings to RDF to Improve Dataset Quality
Assessing and Refining Mappings to RDF to Improve Dataset Quality
 
Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015Missingbot DBpedia Meeting Dublin 2015
Missingbot DBpedia Meeting Dublin 2015
 
D bpedia association meeting dublin wkg
D bpedia association meeting dublin wkgD bpedia association meeting dublin wkg
D bpedia association meeting dublin wkg
 
DBpedia as Gaeilge Chapter
DBpedia as Gaeilge ChapterDBpedia as Gaeilge Chapter
DBpedia as Gaeilge Chapter
 
Linking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia MeetupLinking Implicit entities - DBpedia Meetup
Linking Implicit entities - DBpedia Meetup
 
20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_final20140130 metadata vocabularies_and_cultural_heritage_final
20140130 metadata vocabularies_and_cultural_heritage_final
 
Using DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating EntitiesUsing DBpedia for Spotting and Disambiguating Entities
Using DBpedia for Spotting and Disambiguating Entities
 
20150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v220150209 improving the_d_bpedia_ontology_v2
20150209 improving the_d_bpedia_ontology_v2
 
Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015Pundit at 3rd DBpedia Community Meeting 2015
Pundit at 3rd DBpedia Community Meeting 2015
 
DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016
 
DBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloudDBpedia in the Japanese LOD cloud
DBpedia in the Japanese LOD cloud
 
Enriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpediaEnriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpedia
 

Similar to DBpedia+ / DBpedia meeting in Dublin

KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
Sebastian Hellmann
 
DBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New ReleaseDBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New Release
Volha Bryl
 
Linked open data, its realization
Linked open data, its realizationLinked open data, its realization
Linked open data, its realization
Seonho Kim
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Anja Jentzsch
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the Cloud
Marin Dimitrov
 
Sebastian Hellmann
Sebastian HellmannSebastian Hellmann
Sebastian Hellmann
Connected Data World
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?
Oscar Corcho
 
Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)
robin fay
 
Linked Data
Linked DataLinked Data
Linked Data
Anja Jentzsch
 
Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010
Juan Sequeda
 
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Evangelos Kalampokis
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow Tutorial
SSSW
 
Lider Reference Model ld4lt session March, 3rd, 2015
Lider Reference Model ld4lt session  March, 3rd, 2015Lider Reference Model ld4lt session  March, 3rd, 2015
Lider Reference Model ld4lt session March, 3rd, 2015
Sebastian Hellmann
 
Advanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAdvanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptx
Anshika865276
 
4-Managing CrossRef DOIs
4-Managing CrossRef DOIs4-Managing CrossRef DOIs
4-Managing CrossRef DOIs
Academy of Science of South Africa
 
Integrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI EnvironmentIntegrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI Environment
Cloudera, Inc.
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflows
SSSW
 
(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LOD(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LOD
Diego Valerio Camarda
 
RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4
Marin Dimitrov
 
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsLeveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Joachim Neubert
 

Similar to DBpedia+ / DBpedia meeting in Dublin (20)

KEDL DBpedia 2019
KEDL DBpedia  2019KEDL DBpedia  2019
KEDL DBpedia 2019
 
DBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New ReleaseDBpedia 2014: Highlights and Issues of the New Release
DBpedia 2014: Highlights and Issues of the New Release
 
Linked open data, its realization
Linked open data, its realizationLinked open data, its realization
Linked open data, its realization
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the Cloud
 
Sebastian Hellmann
Sebastian HellmannSebastian Hellmann
Sebastian Hellmann
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?
 
Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)Linked data presentation for libraries (COMO)
Linked data presentation for libraries (COMO)
 
Linked Data
Linked DataLinked Data
Linked Data
 
Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010Welcome to Consuming Linked Data tutorial WWW2010
Welcome to Consuming Linked Data tutorial WWW2010
 
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
Creating and Utilizing Linked Open Statistical Data for the Development of Ad...
 
SSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow TutorialSSSW2015 Data Workflow Tutorial
SSSW2015 Data Workflow Tutorial
 
Lider Reference Model ld4lt session March, 3rd, 2015
Lider Reference Model ld4lt session  March, 3rd, 2015Lider Reference Model ld4lt session  March, 3rd, 2015
Lider Reference Model ld4lt session March, 3rd, 2015
 
Advanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAdvanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptx
 
4-Managing CrossRef DOIs
4-Managing CrossRef DOIs4-Managing CrossRef DOIs
4-Managing CrossRef DOIs
 
Integrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI EnvironmentIntegrating Hadoop in Your Existing DW and BI Environment
Integrating Hadoop in Your Existing DW and BI Environment
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflows
 
(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LOD(Enterprise) Linked Data Platform a new standard to manage LOD
(Enterprise) Linked Data Platform a new standard to manage LOD
 
RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4
 
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for EconomicsLeveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
Leveraging SKOS to trace the overhaul of the STW Thesaurus for Economics
 

Recently uploaded

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 

Recently uploaded (20)

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 

DBpedia+ / DBpedia meeting in Dublin

  • 1. DBpedia (in) ALIGNED From DBpedia to DBpedia+ Dimitris Kontokostas AKSW Group, Leipzig University DBpedia Association
  • 2. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2007
  • 3. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2008
  • 4. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2009
  • 5. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2010
  • 6. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2011
  • 7. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia @ 2014
  • 8. February 9th 2015 / 3rd DBpedia Meeting in Dublin RDF Stats (2014 release) 3B facts (only 580M facts in English) ● DBpedia En: 4.58M Things / 4.22M typed ● 125 Localized versions: 38.3M Things ● 50M links to other datasets Many more stats @: dbpedia.org/Datasets2014/DatasetStatistics
  • 9. February 9th 2015 / 3rd DBpedia Meeting in Dublin Dev Stats DBpedia Information Extraction Framework ● Java/Scala based framework ○ Old PHP-based framework ● 5.1K Commits ● 52K lines of code (100K/1M AT) ● 71 total contributors Many more stats @: www.openhub.net/p/dbpedia
  • 10. February 9th 2015 / 3rd DBpedia Meeting in Dublin Aligning Problem Lot’s of code & a lot more data ● Wikipedia evolves over time ○ Infobox Templates change, merge, deleted ○ New formatting templates ○ Structural differences per language edition ● Code should adapt to all the changes ○ hard at this (data) scale
  • 11. February 9th 2015 / 3rd DBpedia Meeting in Dublin Unit-testing to the rescue? ● Software & Data testing ● Straightforward for software (since 70’s) ● Preliminary for (RDF) data ○ RDFUnit, SPIN, OWL, PelletICV, ShEx,... ■ W3C Data Shapes WG Data testing++ ● Generation: manual, (Semi)automatic, ... ● Linking: data & software tests
  • 12. February 9th 2015 / 3rd DBpedia Meeting in Dublin RDFUnit http://rdfunit.aksw.org
  • 13. February 9th 2015 / 3rd DBpedia Meeting in Dublin UT feedback loop Data verification and feedback at different data extraction stages ● Three main points of failure in DBpedia: ○ Code ○ Infobox mappings ○ Wikipedia (!!!)
  • 14. February 9th 2015 / 3rd DBpedia Meeting in Dublin DBpedia+ Workflow
  • 15. February 9th 2015 / 3rd DBpedia Meeting in Dublin Additional feedback We are looking into: ● Reporting ● Statistics ● Inter-Wikipedia cross-checking ● ML techniques
  • 16. February 9th 2015 / 3rd DBpedia Meeting in Dublin Thank you & Questions? ALIGNED Aligned, Quality-centric Software and Data Engineering