Towards Purposeful Reuse of Semantic Datasets via
Goal-Driven Data Summarization
Panos Alexopoulos, Jose Manuel Gomez Perez
6th International Conference on Advances in Semantic Processing
Porto, Portugal, October 3rd, 2013
Introduction

The Linked Data Use Challenge

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/


Motivating Scenario

●Assume that some entity (individual or organization) wants to reuse
public semantic datasets from the Web to:

● Enrich its own data with them.
● Use the data to provide added-value services to its users/clients.
●These organizations can be:
● Technology providers (e.g. iSOCO)
● Information providers (e.g. publishers, media, etc.)
● Knowledge-driven and knowledge-intensive organisations

Data Enrichment

Why Data Reuse?

●The problem with semantic data is the high amount of time and effort
required to construct and maintain it.

●The reuse of existing public semantic data can (partially) alleviate
this problem:
● Their volume and diversity are increasing at high rates.
● Their maintenance and evolution are the responsibility of their
publishers, which reduces the effort and cost of this task on
the organization's side.


Example

●A news organization wants to create and maintain a knowledge base
about European Football.

●This knowledge changes quickly, so the organization needs to
constantly monitor these changes and update the data.
●Much of this information is already available as public semantic data
(e.g. DBpedia).
●Thus it could be better for the organization to reuse this public data
instead of creating it from scratch.


Barriers to Data Reuse
● It is difficult for knowledge engineers to decide whether a given dataset is
actually suitable for their needs.

● Semantic datasets typically cover diverse domains
● They do not follow a unified way of organizing the knowledge
● They differ in a number of features including size, coverage, granularity and
descriptiveness
● This makes the following tasks difficult:
● Assessing whether a dataset satisfies particular requirements
● Comparing different datasets to select which one is more suitable for a
given purpose.
Data Reuse

Our Approach

●We propose giving data consumers the ability to derive
semantic data summaries.

●Existing summarization approaches treat the summarization task in
an application- and user-independent way.
●By contrast, we are interested in facilitating the generation of
requirements-oriented and goal-driven summaries that may be
significantly more helpful to users.

Goal-Driven Semantic Data Summarization

Problem Description

●Key question: “Given an application scenario where semantic data is
required, how suitable is a given existing dataset for the purposes of
this scenario?”
●To answer this, users normally need to be able to:
1. Explicitly express the requirements that a dataset needs to
satisfy for a given task or goal.
2. Automatically measure/assess the extent to which a dataset
satisfies each of these requirements and compile a summary
report.


Approach

●To implement these two capabilities we follow a checklist-based
approach.

●Checklists are essentially lists of action items, arranged in a
systematic manner, that allow users to record the completion of each
item.
●They are widely applied across multiple industries, like healthcare or
aviation, to ensure reliable and consistent execution of complex
operations.
●In our case we apply checklists to define and execute custom
dataset summarization tasks in the form of lists of goal-specific
requirements and associated summarization processes.

Summarization Task Representation

●To represent custom summarization tasks according to the checklist
paradigm we have adopted the Minim model.

●This defines the following information:
● The Goals the dataset summarization task is designed to serve

● The Requirements against which the summarization task
evaluates the dataset.
● The Data Analysis Operations that the summarization task
employs in order to assess the satisfaction of its requirements
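
The three kinds of information above can be sketched as a small data structure. This is only an illustration under assumptions: the class names, the triple-based dataset and the `ex:` identifiers are hypothetical and are not the vocabulary of the Minim model itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Dataset = Set[Triple]


@dataclass
class Requirement:
    """One checklist item: a description plus the operation that tests it."""
    description: str
    operation: Callable[[Dataset], bool]   # the data analysis operation


@dataclass
class SummarizationTask:
    """A goal together with the requirements that serve it."""
    goal: str
    requirements: List[Requirement] = field(default_factory=list)

    def run(self, dataset: Dataset) -> List[Tuple[str, bool]]:
        # Evaluate every requirement and record whether it is satisfied.
        return [(r.description, r.operation(dataset)) for r in self.requirements]


def has_predicate(predicate: str) -> Callable[[Dataset], bool]:
    """Build an operation that checks whether a relation occurs in the dataset."""
    return lambda ds: any(p == predicate for _, p, _ in ds)


# Hypothetical toy dataset, made up for this sketch.
toy_dataset: Dataset = {
    ("ex:TeamA", "rdfs:label", "Team A"),
    ("ex:PlayerX", "ex:playsFor", "ex:TeamA"),
}

task = SummarizationTask(
    goal="Decide if a dataset is appropriate for a Semantic Annotation scenario",
    requirements=[
        Requirement("Entities carry labels", has_predicate("rdfs:label")),
        Requirement("Teams are related to coaches", has_predicate("ex:coachedBy")),
    ],
)

summary = task.run(toy_dataset)
for description, satisfied in summary:
    print("yes" if satisfied else "no", "-", description)
```

Keeping the operation as a plain callable means the same requirement can be re-run unchanged against several candidate datasets.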


Example Goals

●Decide if a dataset is appropriate for a Semantic Annotation scenario.
●Decide if a dataset is appropriate for a Question Answering scenario.
●Determine which of two or more similar datasets best represents a
given corpus.

●Detect arising inconsistencies or other quality problems.
●…


Example Requirements

●Evaluate the dataset’s coverage of a particular domain/topic:
Aims to measure the extent to which a dataset describes a given
domain or topic.
●Evaluate the dataset’s labeling adequacy and richness: Aims to
measure the extent to which the dataset’s elements (concepts,
instances, relations etc.) are accompanied by representative and
comprehensible labels, in one or more languages.
●Evaluate Connectivity: This requirement checks the existence of
paths between concepts or entities, i.e. whether it is possible to go
from a given concept to another on the graph and in what ways.
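
As an illustration of the connectivity requirement, the sketch below searches for a path between two nodes with a breadth-first search. It assumes the dataset is available as a list of (subject, predicate, object) triples and that edges may be followed in both directions; the function name and the `ex:` identifiers are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def find_path(triples: Iterable[Triple], start: str, goal: str) -> Optional[List[str]]:
    """Return a shortest path of nodes from start to goal, or None if none exists."""
    # Build an undirected adjacency map from the triples.
    neighbours: Dict[str, Set[str]] = {}
    for s, _, o in triples:
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)

    previous: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the path.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in neighbours.get(node, ()):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical toy graph, made up for this sketch.
graph = [
    ("ex:PlayerX", "ex:playsFor", "ex:TeamA"),
    ("ex:TeamA", "ex:competesIn", "ex:LeagueL"),
    ("ex:StadiumS", "ex:locatedIn", "ex:CityC"),
]

print(find_path(graph, "ex:PlayerX", "ex:LeagueL"))
print(find_path(graph, "ex:PlayerX", "ex:StadiumS"))
```

Returning the path rather than a bare boolean also answers the "in what ways" part of the requirement.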


Example Data Operations

●Check the existence of a particular element (concept, relation,
attribute, instance, axiom) in the dataset.

●Check the dataset’s consistency (e.g. by running a reasoner).
●Measure the number of ambiguous entities in the dataset.

●Measure the number of labeled entities.
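
Two of the operations above can be sketched over a plain list of triples. The sketch makes two assumptions that the slides do not state: labels are attached through a property written here as `rdfs:label`, and an entity counts as ambiguous when it shares its label (ignoring case) with at least one other entity. The data is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
LABEL = "rdfs:label"  # assumed labeling property


def all_entities(triples: List[Triple]) -> Set[str]:
    """Every subject that appears in the dataset."""
    return {s for s, _, _ in triples}


def labeled_entities(triples: List[Triple]) -> Set[str]:
    """Subjects that have at least one label."""
    return {s for s, p, _ in triples if p == LABEL}


def ambiguous_entities(triples: List[Triple]) -> Set[str]:
    """Subjects whose label is also used by another subject (assumed definition)."""
    by_label: Dict[str, Set[str]] = defaultdict(set)
    for s, p, o in triples:
        if p == LABEL:
            by_label[o.lower()].add(s)
    return {e for group in by_label.values() if len(group) > 1 for e in group}


# Hypothetical toy data.
data = [
    ("ex:Valencia_CF", LABEL, "Valencia"),
    ("ex:Valencia_City", LABEL, "Valencia"),
    ("ex:Sevilla_FC", LABEL, "Sevilla FC"),
    ("ex:Unlabeled_Team", "rdf:type", "ex:Team"),
]

print(len(labeled_entities(data)), "of", len(all_entities(data)), "entities are labeled")
print(len(ambiguous_entities(data)), "entities are ambiguous")
```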


Application Example

●We applied our framework to assess the suitability of public datasets
for reuse in semantically annotating texts that describe
football matches from the Spanish League.
●For that, we wanted the reused dataset to:
● Contain information about all the current teams of the Spanish
football league.
● Have at least one associated label for each of its entities.
● Relate teams to the players that currently play in them.
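
The three requirements of this example could be checked along the following lines. This is a sketch under assumptions, not the task definition used in the talk: the property and class names (`rdf:type`, `rdfs:label`, `ex:playsFor`, `ex:FootballTeam`) are placeholders, the required team list is a small illustrative subset, and the dataset is made up.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Illustrative subset only; a real task would list every current team.
REQUIRED_TEAMS = {"ex:Real_Madrid", "ex:FC_Barcelona", "ex:Valencia_CF"}


def teams_in(ds: Set[Triple]) -> Set[str]:
    return {s for s, p, o in ds if p == "rdf:type" and o == "ex:FootballTeam"}


def missing_teams(ds: Set[Triple]) -> Set[str]:
    """Required teams that the dataset does not describe."""
    return REQUIRED_TEAMS - teams_in(ds)


def unlabeled_entities(ds: Set[Triple]) -> Set[str]:
    """Subjects without any label."""
    subjects = {s for s, _, _ in ds}
    labeled = {s for s, p, _ in ds if p == "rdfs:label"}
    return subjects - labeled


def teams_without_players(ds: Set[Triple]) -> Set[str]:
    """Teams that no player is related to."""
    with_players = {o for _, p, o in ds if p == "ex:playsFor"}
    return teams_in(ds) - with_players


# Hypothetical toy dataset.
toy = {
    ("ex:Real_Madrid", "rdf:type", "ex:FootballTeam"),
    ("ex:Real_Madrid", "rdfs:label", "Real Madrid"),
    ("ex:FC_Barcelona", "rdf:type", "ex:FootballTeam"),
    ("ex:FC_Barcelona", "rdfs:label", "FC Barcelona"),
    ("ex:Player_1", "rdfs:label", "Player One"),
    ("ex:Player_1", "ex:playsFor", "ex:Real_Madrid"),
}

# A requirement is satisfied when its set of offending items is empty.
checklist = {
    "All required teams present": missing_teams(toy),
    "Every entity has a label": unlabeled_entities(toy),
    "Teams related to current players": teams_without_players(toy),
}
for requirement, offenders in checklist.items():
    print("yes" if not offenders else "no", "-", requirement, sorted(offenders))
```

Each check returns the offending items instead of a boolean, so the checklist can say which teams or entities caused a failure.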


Defined Summarization Task


Resulting Summary

● We executed this task against DBpedia and Freebase, automatically producing a
summary report.

● The system provides a yes/no answer as to whether each dataset satisfies each
requirement, along with additional information on why this may or may not be the case.
● This is important because:

● A requirement might not be satisfied because of a high threshold
● A requirement might seem to be satisfied, yet that might not actually be true.
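
The point about thresholds can be illustrated with a report entry that keeps the measured value and an explanation next to the yes/no verdict. The class, the function and the numbers below are hypothetical and do not reproduce the report from the talk.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class ReportEntry:
    requirement: str
    measured: float    # fraction in [0, 1]
    threshold: float   # minimum fraction needed to pass
    detail: str        # why the verdict came out this way

    @property
    def satisfied(self) -> bool:
        return self.measured >= self.threshold

    def render(self) -> str:
        verdict = "yes" if self.satisfied else "no"
        return (f"{verdict}: {self.requirement} "
                f"(measured {self.measured:.2f}, required {self.threshold:.2f}); "
                f"{self.detail}")


def team_coverage(found: Set[str], required: Set[str], threshold: float) -> ReportEntry:
    """Measure which share of the required teams the dataset contains."""
    missing = sorted(required - found)
    detail = "missing: " + ", ".join(missing) if missing else "no required team is missing"
    measured = len(required & found) / len(required)
    return ReportEntry("Coverage of required teams", measured, threshold, detail)


# Made-up identifiers for illustration.
required = {"ex:Team_1", "ex:Team_2", "ex:Team_3", "ex:Team_4"}
found = {"ex:Team_1", "ex:Team_2", "ex:Team_3", "ex:Other_Team"}

strict = team_coverage(found, required, threshold=1.0)
lenient = team_coverage(found, required, threshold=0.7)
print(strict.render())
print(lenient.render())
```

The same measurement fails under the strict threshold and passes under the lenient one, which is why the verdict alone is not enough to judge a dataset.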

Ongoing Work

Summary Generation Tool
● We are currently developing a summarization tool that enables the definition,
manipulation and execution of summarization tasks, as well as the
dashboard-like visualization of their output.

Thank you!

Questions?

Dr. Panos Alexopoulos
Semantic Applications Research Manager
palexopoulos@isoco.com
(t) +34 913 349 797


 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 

Towards Purposeful Reuse of Semantic Datasets Through Goal-Driven Summarization

  • 1. Towards Purposeful Reuse of Semantic Datasets via Goal-Driven Data Summarization
      Panos Alexopoulos, Jose Manuel Gomez Perez
      6th International Conference on Advances in Semantic Processing
      Porto, Portugal, October 3rd, 2013
  • 2. Introduction: The Linked Data Use Challenge
      Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • 3. Introduction: Motivating Scenario
      ● Assume that some entity (an individual or an organization) wants to reuse public semantic datasets from the Web to:
          ● Enrich its own data with them.
          ● Use the data to provide added-value services to its users/clients.
      ● Such organizations can be:
          ● Technology providers (e.g. iSOCO)
          ● Information providers (e.g. publishers, media, etc.)
          ● Knowledge-driven and knowledge-intensive organisations
  • 4. Data Enrichment: Why Data Reuse?
      ● The problem with semantic data is the large amount of time and effort required to construct and maintain it.
      ● Reusing existing public semantic data can (partially) alleviate this problem:
          ● Its volume and diversity are increasing at high rates.
          ● Its maintenance and evolution are the responsibility of its publishers, which reduces the effort and cost of this task on the organization's side.
  • 5. Data Enrichment: Example
      ● A news organization wants to create and maintain a knowledge base about European football.
      ● This knowledge changes quickly, so the organization needs to constantly monitor the changes and update the data.
      ● Much of this information is already available as public semantic data (e.g. DBpedia).
      ● Thus it could be better for the organization to reuse this public data instead of creating it from scratch.
  • 6. Data Enrichment: Barriers to Data Reuse
      ● It is difficult for knowledge engineers to decide whether a given dataset is actually suitable for their needs:
          ● Semantic datasets typically cover diverse domains.
          ● They do not follow a unified way of organizing knowledge.
          ● They differ in a number of features, including size, coverage, granularity and descriptiveness.
      ● This makes the following tasks difficult:
          ● Assessing whether a dataset satisfies particular requirements.
          ● Comparing different datasets to select the one most suitable for a given purpose.
  • 7. Data Reuse: Our Approach
      ● We suggest giving data consumers the ability to derive semantic data summaries.
      ● Existing summarization approaches treat the summarization task in an application- and user-independent way.
      ● By contrast, we are interested in facilitating the generation of requirements-oriented and goal-driven summaries that may be significantly more helpful to users.
  • 8. Goal-Driven Semantic Data Summarization: Problem Description
      ● Key question: “Given an application scenario where semantic data is required, how suitable is a given existing dataset for the purposes of this scenario?”
      ● To answer this, users normally need to be able to:
          1. Explicitly express the requirements that a dataset needs to satisfy for a given task or goal.
          2. Automatically measure/assess the extent to which a dataset satisfies each of these requirements and compile a summary report.
  • 9. Goal-Driven Semantic Data Summarization: Approach
      ● To implement these two capabilities we follow a checklist-based approach.
      ● Checklists are essentially lists of action items, arranged systematically, that allow users to record the completion of each item.
      ● They are widely applied across multiple industries, such as healthcare and aviation, to ensure reliable and consistent execution of complex operations.
      ● In our case we apply checklists to define and execute custom dataset summarization tasks in the form of lists of goal-specific requirements and associated summarization processes.
  • 10. Goal-Driven Semantic Data Summarization: Summarization Task Representation
      ● To represent custom summarization tasks according to the checklist paradigm we have adopted the Minim model.
      ● This defines the following information:
          ● The Goals that the dataset summarization task is designed to serve.
          ● The Requirements against which the summarization task evaluates the dataset.
          ● The Data Analysis Operations that the summarization task employs to assess whether its requirements are satisfied.
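
The three parts of a summarization task listed above (goal, requirements, data analysis operations) can be sketched as a small data structure. This is an illustration only and does not reproduce the Minim model's actual vocabulary: the class names `Requirement` and `SummarizationTask`, the `ex:` identifiers, and the toy dataset are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
Dataset = Set[Triple]


@dataclass
class Requirement:
    """One checklist item: a description plus the operation that tests it."""
    description: str
    operation: Callable[[Dataset], bool]  # the data analysis operation


@dataclass
class SummarizationTask:
    """A goal together with the requirements that serve it."""
    goal: str
    requirements: List[Requirement] = field(default_factory=list)

    def run(self, dataset: Dataset) -> List[Tuple[str, bool]]:
        # Evaluate every requirement and record whether it is satisfied.
        return [(r.description, r.operation(dataset)) for r in self.requirements]


# Toy data; the identifiers are made up for this example.
toy = {
    ("ex:RealMadrid", "rdf:type", "ex:FootballTeam"),
    ("ex:RealMadrid", "rdfs:label", "Real Madrid"),
}

task = SummarizationTask(
    goal="Decide if the dataset suits a semantic annotation scenario",
    requirements=[
        Requirement(
            "Contains at least one football team",
            lambda ds: any(p == "rdf:type" and o == "ex:FootballTeam" for _, p, o in ds),
        ),
        Requirement(
            "Relates teams to players",
            lambda ds: any(p == "ex:hasPlayer" for _, p, _ in ds),
        ),
    ],
)

for description, satisfied in task.run(toy):
    print(f"{description}: {'yes' if satisfied else 'no'}")
```

Keeping each requirement's operation as a plain callable means the same checklist can be run unchanged against several candidate datasets.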
  • 11. Goal-Driven Semantic Data Summarization: Example Goals
      ● Decide if a dataset is appropriate for a Semantic Annotation scenario.
      ● Decide if a dataset is appropriate for a Question Answering scenario.
      ● Determine which of two or more similar datasets best represents a given corpus.
      ● Detect arising inconsistencies or other quality problems.
      ● …
  • 12. Goal-Driven Semantic Data Summarization: Example Requirements
      ● Evaluate the dataset’s coverage of a particular domain/topic: measures the extent to which a dataset describes a given domain or topic.
      ● Evaluate the dataset’s labeling adequacy and richness: measures the extent to which the dataset’s elements (concepts, instances, relations etc.) are accompanied by representative and comprehensible labels, in one or more languages.
      ● Evaluate connectivity: checks the existence of paths between concepts or entities, i.e. whether it is possible to go from a given concept to another on the graph, and in what ways.
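
The connectivity requirement above can be illustrated with a breadth-first search over a set of triples. This is a minimal sketch under stated assumptions, not the authors' implementation: edges are treated as directed from subject to object, only one shortest path is returned, and the function name `find_path` and the `ex:` identifiers are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def find_path(triples: Set[Triple], start: str, goal: str) -> Optional[List[str]]:
    """Return one shortest directed path of nodes from start to goal, or None."""
    # Build an adjacency map from each subject to the objects it points at.
    neighbours: Dict[str, List[str]] = {}
    for s, _, o in triples:
        neighbours.setdefault(s, []).append(o)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbours.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# Toy graph; the identifiers are made up for this example.
triples = {
    ("ex:Messi", "ex:playsFor", "ex:Barcelona"),
    ("ex:Barcelona", "ex:competesIn", "ex:LaLiga"),
}

print(find_path(triples, "ex:Messi", "ex:LaLiga"))
print(find_path(triples, "ex:LaLiga", "ex:Messi"))
```

Returning the path rather than a bare boolean addresses the "in what ways" part of the requirement; listing all paths would need a different traversal.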
  • 13. Goal-Driven Semantic Data Summarization: Example Data Operations
      ● Check the existence of a particular element (concept, relation, attribute, instance, axiom) in the dataset.
      ● Check the dataset’s consistency (e.g. by running a reasoner).
      ● Measure the number of ambiguous entities in the dataset.
      ● Measure the number of labeled entities.
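
Three of the operations above (existence check, ambiguity count, label count) can be sketched over plain triples. The definitions here are assumptions made for illustration: an entity is any triple subject, an entity is "ambiguous" when it shares a case-insensitive label with another entity, and the function names and `ex:` identifiers are hypothetical. The consistency check is omitted because it requires a reasoner.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
LABEL = "rdfs:label"


def entities(triples: Set[Triple]) -> Set[str]:
    """All subjects in the dataset (assumed definition of 'entity')."""
    return {s for s, _, _ in triples}


def labeled_entities(triples: Set[Triple]) -> Set[str]:
    """Entities that carry at least one label."""
    return {s for s, p, _ in triples if p == LABEL}


def ambiguous_entities(triples: Set[Triple]) -> Set[str]:
    """Entities whose label is shared with at least one other entity."""
    by_label: Dict[str, Set[str]] = defaultdict(set)
    for s, p, o in triples:
        if p == LABEL:
            by_label[o.lower()].add(s)
    return {e for group in by_label.values() if len(group) > 1 for e in group}


def exists(triples: Set[Triple], element: str) -> bool:
    """True if the element occurs anywhere in a triple."""
    return any(element in t for t in triples)


# Toy data; the identifiers are made up for this example.
data = {
    ("ex:Valencia_CF", LABEL, "Valencia"),
    ("ex:Valencia_city", LABEL, "Valencia"),
    ("ex:Sevilla_FC", LABEL, "Sevilla FC"),
    ("ex:Getafe_CF", "rdf:type", "ex:FootballTeam"),
}

print("entities:", len(entities(data)))
print("labeled:", len(labeled_entities(data)))
print("ambiguous:", sorted(ambiguous_entities(data)))
print("has ex:hasPlayer:", exists(data, "ex:hasPlayer"))
```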
  • 14. Goal-Driven Semantic Data Summarization: Application Example
      ● We applied our framework to assess the suitability of public datasets for reuse in semantically annotating texts that describe football matches from the Spanish League.
      ● For that, we wanted the dataset to be reused to:
          ● Contain information about all the current teams of the Spanish football league.
          ● Have at least one associated label for each of its entities.
          ● Relate teams to the players that currently play in them.
  • 15. Goal-Driven Semantic Data Summarization: Defined Summarization Task
  • 16. Goal-Driven Semantic Data Summarization: Resulting Summary
      ● We executed this task against DBpedia and Freebase, automatically producing the following summary report.
      ● The system provides a yes/no answer as to whether each dataset satisfies each requirement, as well as additional information on why this may or may not be the case.
      ● This is important because:
          ● A requirement might not be satisfied because of a high threshold.
          ● A requirement might seem to be satisfied, yet that might not actually be true.
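
The point about thresholds can be made concrete with a small sketch in which the same dataset fails a strict coverage requirement and passes a more lenient one, with the explanation showing why. Everything here is hypothetical and for illustration only: the `Verdict` class, the `check_coverage` function, the team identifiers and the threshold values are not taken from the authors' system or its report.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Verdict:
    requirement: str
    satisfied: bool   # the yes/no answer
    explanation: str  # why the answer came out this way


def check_coverage(required: Set[str], present: Set[str], threshold: float) -> Verdict:
    """Pass if the share of required entities found in the dataset meets the threshold."""
    if not required:
        raise ValueError("required must not be empty")
    found = required & present
    missing = sorted(required - present)
    ratio = len(found) / len(required)
    explanation = f"{len(found)}/{len(required)} found"
    if missing:
        explanation += "; missing: " + ", ".join(missing)
    return Verdict(
        requirement=f"Coverage of required teams >= {threshold:.0%}",
        satisfied=ratio >= threshold,
        explanation=explanation,
    )


# Toy data; the identifiers are made up for this example.
required_teams = {"ex:Barcelona", "ex:RealMadrid", "ex:Sevilla", "ex:Valencia"}
dataset_teams = {"ex:Barcelona", "ex:RealMadrid", "ex:Valencia"}

strict = check_coverage(required_teams, dataset_teams, threshold=1.0)
lenient = check_coverage(required_teams, dataset_teams, threshold=0.75)

for verdict in (strict, lenient):
    answer = "yes" if verdict.satisfied else "no"
    print(f"{verdict.requirement}: {answer} ({verdict.explanation})")
```

With the explanation attached, a reader of the report can tell a near miss caused by a demanding threshold from a dataset that lacks the required entities altogether.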
  • 17. Ongoing Work: Summary Generation Tool
      ● We are currently developing a summarization tool that enables the definition, manipulation and execution of summarization tasks, as well as the dashboard-like visualization of their output.
  • 18. Thank you! Questions?
      Dr. Panos Alexopoulos, Semantic Applications Research Manager, iSOCO
      palexopoulos@isoco.com, (t) +34 913 349 797