BowlognaBench
Benchmarking RDF Analytics
Gianluca Demartini, Iliya Enchev, Joël Gapany, and Philippe Cudré-Mauroux
eXascale Infolab & Faculty of Humanities
University of Fribourg, Switzerland
30-Jun-11   Gianluca Demartini
  
Motivation
•  Semantic Data keeps increasing on the Web
•  The need to run OLAP-type queries is becoming more common
    – How did university student performance evolve over the last 5 years?
•  A novel benchmark for Knowledge Bases focusing on complex Analytics queries
  
Why do we need a new RDF benchmark?
•  Existing RDF benchmarks (e.g., LUBM)
    – Don’t deal with complex analytic queries
    – Don’t look at the temporal dimension
    – Don’t model a realistic setting
•  Analytic benchmarks exist for relational systems (e.g., TPC-H)
  
The Bologna Reform
•  Started in June 1999
•  Framework for higher education systems
•  47 Countries
•  Common academic degrees
•  Common study structure
•  Common terminology
  
The university setting after Bologna
•  A lot of data is available
    –  Not following standard schemas
    –  Comprehensive and available data is a success factor
•  Shared data
    –  Erasmus exchanges
    –  Courses in a given language
•  Analytic tools may help monitor university performance
  
An ontology about Bologna
•  A Lexicon for the Bologna Reform
    – Basic set of terms for the new system
    – Stable across time and institutions
    – Developed by a professional terminologist
  
The ontology creation process
•  The Bowlogna Ontology
    – 29 top classes (67 in total)
    – Classes: student, professor, evaluation, teaching unit, ECTS credit, semester, etc.
    – Concept definitions in English, French, German
  
Bowlogna Ontology
[Figure: Bowlogna Ontology]
  
Bowlogna Ontology
•  Private / Public parts
    – Public data can be shared with other universities (e.g., course descriptions)
    – Private data is sensitive (e.g., evaluation results)
•  Private data might contain more instances
•  Aggregations over private data may be shared (e.g., number of enrolled students)
  
The Benchmark
•  Bowlogna Ontology
    – 67 classes
•  12 Analytics queries
    – Natural language and SPARQL translation
•  Automatic Instance Generator
    – Populated ontology with a given number of instances
•  Test over the number of instances and universities
  
Analytic Queries
•  Count
•  Molecule
    –  Query 4. Return all information about Student0 within a scope of two
•  Max Min
•  Ranking and TopK
•  Temporal
    –  Query 8. What is the average completion time of Bachelor studies for each Study Track?
•  Path
•  Multiple Universities
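
To make the Molecule query type more concrete, here is a minimal Python sketch of Query 4 over a toy set of triples. It is not the benchmark's SPARQL query: the property names (enrolledIn, takes, taughtBy, memberOf) and the instances other than Student0 are hypothetical, and "scope of two" is assumed here to mean following outgoing edges for at most two hops.

```python
from collections import deque

# Hypothetical toy data: (subject, property, object) triples.
# Only "Student0" comes from the slide; everything else is made up.
TRIPLES = [
    ("Student0", "enrolledIn", "StudyTrack0"),
    ("Student0", "takes", "Course0"),
    ("Course0", "taughtBy", "Professor0"),
    ("Professor0", "memberOf", "Department0"),
    ("Student1", "takes", "Course0"),
]


def molecule(triples, start, scope=2):
    """Collect the triples reachable from `start` within `scope` hops.

    Assumption: only outgoing edges (subject -> object) are followed.
    """
    # Index triples by subject so each hop is a dictionary lookup.
    outgoing = {}
    for triple in triples:
        outgoing.setdefault(triple[0], []).append(triple)

    result = []
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == scope:
            continue  # nodes at the scope boundary are not expanded
        for triple in outgoing.get(node, []):
            result.append(triple)
            obj = triple[2]
            if obj not in seen:
                seen.add(obj)
                frontier.append((obj, depth + 1))
    return result


if __name__ == "__main__":
    for triple in molecule(TRIPLES, "Student0", scope=2):
        print(triple)
```

With a scope of two, the triple about Professor0's department is left out because it sits three hops away from Student0.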
  
Query Classification
We classify a query as having a large input size if it involves more than 5%
of instances, and small otherwise. Selectivity measures the number of instances
that match the query: we classify a query as having high selectivity if less than
10% of instances match the query, and low otherwise. Complexity measures the
number of classes and properties involved in the query: queries are classified as
having high or low complexity according to the RDF schema we have defined.
Table 1. Classification of queries according to their need to access private and public
data, input size, selectivity, and complexity.
Categories: Count, Molecule, MaxMin, TopK, Temp, Path, MultiUniv
Query        1      2      3      4      5      6      7      8      9      10     11     12
Public       x x x x x x x x
Private      x x x x x x x x
Input Size   Small  Large  Small  Small  Large  Small  Large  Large  Large  Large  Large  Large
Selectivity  High   Low    Low    Low    Low    Low    High   Low    Low    Low    Low    Low
Complexity   Low    Low    Low    High   High   Low    High   Low    High   High   Low    High
As we can see, the majority of queries have low selectivity, which reflects our
intent of performing analytic queries, that is, queries for which a lot of data is
retrieved and aggregated. For the same reason, most of the queries have a large
input. Finally, queries are equally divided between high and low complexity.
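
The input-size and selectivity thresholds on this slide can be written down as a small Python sketch. The function name and its parameters are hypothetical, and both percentages are assumed to be taken over the total number of instances in the dataset. Complexity is left out because the slide defines it relative to the RDF schema rather than by a numeric threshold.

```python
def classify_query(total_instances, instances_involved, instances_matched):
    """Return (input_size, selectivity) labels for one query.

    Thresholds follow the slide text:
      - input size is "Large" if more than 5% of instances are involved;
      - selectivity is "High" if less than 10% of instances match.
    Integer arithmetic avoids floating-point edge cases at the boundaries.
    """
    if total_instances <= 0:
        raise ValueError("total_instances must be positive")

    input_size = "Large" if instances_involved * 100 > 5 * total_instances else "Small"
    selectivity = "High" if instances_matched * 100 < 10 * total_instances else "Low"
    return input_size, selectivity


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(classify_query(total_instances=1000, instances_involved=60, instances_matched=50))
    print(classify_query(total_instances=1000, instances_involved=900, instances_matched=800))
```

A query that touches exactly 5% of instances counts as small, and one that matches exactly 10% counts as low selectivity, because the slide says "more than" and "less than".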
  
From the process analyst's point of view
•  Which system should I pick for my specific problem?
    – Not looking for the best system
    – Look at problem-specific query sets
  
Which system to use?
            Count Queries   Path Queries   Rank Queries   Temporal Queries
System A    0.5s            5s             0.1s           2s
System B    3s              0.4s           2s             1s
System C    0.5s            0.5s           0.5s           0.5s
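
One way to read this table is to weight each system's timings by a problem-specific query mix. The Python sketch below does that with the timings from the table. The workload mixes and the function names are hypothetical, and summing per-query times is an assumed cost model, not something the slides specify.

```python
# Seconds per query type, copied from the table on this slide.
TIMINGS = {
    "System A": {"count": 0.5, "path": 5.0, "rank": 0.1, "temporal": 2.0},
    "System B": {"count": 3.0, "path": 0.4, "rank": 2.0, "temporal": 1.0},
    "System C": {"count": 0.5, "path": 0.5, "rank": 0.5, "temporal": 0.5},
}


def workload_cost(system_timings, workload):
    """Total time for a workload given as {query_type: number_of_queries}."""
    return sum(system_timings[query_type] * count
               for query_type, count in workload.items())


def pick_system(workload, timings=TIMINGS):
    """Return the system with the lowest total time for this workload."""
    return min(timings, key=lambda name: workload_cost(timings[name], workload))


if __name__ == "__main__":
    # Hypothetical workloads: the best choice changes with the query mix.
    print(pick_system({"rank": 10}))
    print(pick_system({"path": 10}))
    print(pick_system({"count": 1, "path": 1, "rank": 1, "temporal": 1}))
```

Under these assumptions no single system wins everywhere: a ranking-heavy workload favours System A, a path-heavy one System B, and an even mix System C.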
  
Conclusions
•  BowlognaBench for Analytic Queries
•  OWL Ontology for Higher Education Systems
•  Next Steps
    – Run a comparative evaluation of RDF systems
    – Set up a wiki-like space where groups can upload experimental results
  
http://diuf.unifr.ch/xi/bowlognabench/
  


RDF Analytics Benchmarking Framework

  • 1. BowlognaBench   Benchmarking  RDF  Analy5cs   Gianluca  Demar5ni,  Iliya  Enchev,   Joël  Gapany,  and  Philippe  Cudré-­‐Mauroux    eXascale  Infolab  &  Faculty  of  Humani5es   University  of  Fribourg,  Switzerland   30-­‐Jun-­‐11   Gianluca  Demar5ni   1  
  • 2. Mo5va5on   •  Seman5c  Data  keeps  increasing  on  the  Web   •  More  common  is  the  need  to  run  OLAP-­‐type   queries   – How  did  university  student  performance  evolve   over  last  5  years?   •  A  novel  benchmark  for  Knowledge  Bases   focusing  on  complex  Analy5cs  queries   30-­‐Jun-­‐11   Gianluca  Demar5ni   2  
  • 3. Why do we need a new RDF benchmark?
       •  Existing RDF benchmarks (e.g., LUBM)
          – Don't deal with complex analytic queries
          – Don't look at the temporal dimension
          – Don't model a realistic setting
       •  Analytic benchmarks exist for relational systems (e.g., TPC-H)
  • 4. The Bologna Reform
       •  Started in June 1999
       •  Framework for higher education systems
       •  47 countries
       •  Common academic degrees
       •  Common study structure
       •  Common terminology
  • 5. The university setting after Bologna
       •  A lot of data is available
          – Not following standard schemas
          – Comprehensive and available data is a success factor
       •  Shared data
          – Erasmus exchanges
          – Courses in a given language
       •  Analytic tools may help monitor university performance
  • 6. An ontology about Bologna
       •  A Lexicon for the Bologna Reform
          – Basic set of terms for the new system
          – Stable across time and institutions
          – Developed by a professional terminologist
  • 7. The ontology creation process
       •  The Bowlogna Ontology
          – 29 top classes (67 in total)
          – Classes: student, professor, evaluation, teaching unit, ECTS credit, semester, etc.
          – Concept definitions in English, French, German
  • 8. Bowlogna Ontology
  • 9. Bowlogna Ontology
       •  Private / Public parts
          – Public data can be shared with other universities (e.g., course descriptions)
          – Private data is sensitive (e.g., evaluation results)
       •  Private data might contain more instances
       •  Aggregations over private data may be shared (e.g., number of enrolled students)
  • 10. The Benchmark
       •  Bowlogna Ontology
          – 67 classes
       •  12 analytics queries
          – Natural language and SPARQL translation
       •  Automatic Instance Generator
          – Populates the ontology with a given number of instances
       •  Tests over the number of instances and universities
  • 11. Analytic Queries
       •  Count
       •  Molecule
          – Query 4. Return all information about Student0 within a scope of two
       •  Max Min
       •  Ranking and TopK
       •  Temporal
          – Query 8. What is the average completion time of Bachelor studies for each Study Track?
       •  Path
       •  Multiple Universities
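The molecule query (Query 4) can be pictured as a bounded graph traversal. The sketch below is only an illustration in Python, not the benchmark's SPARQL: it assumes that "scope of two" means following at most two outgoing edges from Student0, and every property name and instance other than Student0 is hypothetical rather than taken from the Bowlogna Ontology.

```python
from collections import deque


def molecule(triples, root, scope):
    """Collect triples reachable from `root` via at most `scope` outgoing edges.

    Assumption: "scope" is read as a hop limit; the slides do not define it.
    """
    # Index triples by subject so each expansion is a dictionary lookup.
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((s, p, o))

    result = []
    seen = {root}
    frontier = deque([(root, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == scope:
            continue  # hop limit reached, do not expand further
        for triple in by_subject.get(node, []):
            result.append(triple)
            obj = triple[2]
            if obj not in seen:
                seen.add(obj)
                frontier.append((obj, depth + 1))
    return result


# Hypothetical instance data, for illustration only.
TRIPLES = [
    ("Student0", "enrolledIn", "Course1"),
    ("Student0", "hasName", "Student Zero"),
    ("Course1", "taughtBy", "Professor3"),
    ("Professor3", "memberOf", "Department2"),
    ("Student5", "enrolledIn", "Course1"),
]

for triple in molecule(TRIPLES, "Student0", 2):
    print(triple)
```

With a hop limit of two, the triple about Professor3's department is left out because it is three edges away from Student0. A real system would more likely express this as a SPARQL query over the store.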
  • 12. Query Classification
       We classify a query as having a large input size if it involves more than 5% of instances, and small otherwise. Selectivity measures the number of instances that match the query: we classify a query as having high selectivity if less than 10% of instances match the query, and low otherwise. Complexity measures the number of classes and properties involved in the query: queries are classified as having high or low complexity according to the RDF schema we have defined.

       Table 1. Classification of queries according to their need to access private and public data, input size, selectivity, and complexity.

       Query types (left to right): Count, Molecule, MaxMin, TopK, Temp, Path, MultiUniv

       Query        1      2      3      4      5      6      7      8      9      10     11     12
       Input Size   Small  Large  Small  Small  Large  Small  Large  Large  Large  Large  Large  Large
       Selectivity  High   Low    Low    Low    Low    Low    High   Low    Low    Low    Low    Low
       Complexity   Low    Low    Low    High   High   Low    High   Low    High   High   Low    High

       Public and Private rows: eight queries are marked (x) in each row; the column positions are not recoverable from this transcript.

       As we can see, the majority of queries have a low selectivity, which reflects our intent of performing analytic queries, that is, queries for which a lot of data is retrieved and aggregated. For the same reason, most of the queries have a large input. Finally, queries are equally divided between high and low complexity.
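The two numeric rules in the classification (more than 5% of instances involved means a large input; less than 10% of instances matched means high selectivity) can be written down directly. The following Python sketch is illustrative only: the function name, the `QueryProfile` type and the example counts are assumptions, and complexity is left out because the slide defines it relative to the RDF schema rather than by a threshold.

```python
from dataclasses import dataclass

# Thresholds taken from the classification text.
LARGE_INPUT_FRACTION = 0.05       # more than 5% of instances involved
HIGH_SELECTIVITY_FRACTION = 0.10  # less than 10% of instances matched


@dataclass
class QueryProfile:
    input_size: str   # "Large" or "Small"
    selectivity: str  # "High" or "Low"


def classify_query(total_instances, instances_involved, instances_matched):
    """Label a query by input size and selectivity (hypothetical helper)."""
    if total_instances <= 0:
        raise ValueError("total_instances must be positive")
    involved = instances_involved / total_instances
    matched = instances_matched / total_instances
    return QueryProfile(
        input_size="Large" if involved > LARGE_INPUT_FRACTION else "Small",
        selectivity="High" if matched < HIGH_SELECTIVITY_FRACTION else "Low",
    )


# Hypothetical counts, for illustration only.
print(classify_query(total_instances=10_000,
                     instances_involved=4_000,
                     instances_matched=2_500))
```

Note that high selectivity here means few matching instances, so a typical analytic query that aggregates over much of the data comes out as large input and low selectivity.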
  • 13. From the process analyst's point of view
       •  Which system should I pick for my specific problem?
          – Not looking for the best system
          – Look at problem-specific query sets
  • 14. Which system to use?

                  Count Queries   Path Queries   Rank Queries   Temporal Queries
       System A   0.5s            5s             0.1s           2s
       System B   3s              0.4s           2s             1s
       System C   0.5s            0.5s           0.5s           0.5s
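The point of the table above is that the choice depends on the query mix. A minimal Python sketch of that reasoning follows; the timings are copied from the slide, while the workloads and the cost model (total time as a count-weighted sum of per-class timings) are assumptions made for illustration.

```python
# Timings in seconds, copied from the "Which system to use?" slide.
TIMINGS = {
    "System A": {"Count": 0.5, "Path": 5.0, "Rank": 0.1, "Temporal": 2.0},
    "System B": {"Count": 3.0, "Path": 0.4, "Rank": 2.0, "Temporal": 1.0},
    "System C": {"Count": 0.5, "Path": 0.5, "Rank": 0.5, "Temporal": 0.5},
}


def total_time(system, workload):
    """Estimated time for a workload given as {query class: number of queries}."""
    return sum(TIMINGS[system][kind] * count for kind, count in workload.items())


def pick_system(workload):
    """Return the system with the lowest estimated total time."""
    return min(TIMINGS, key=lambda system: total_time(system, workload))


# Hypothetical workloads, for illustration only.
ranking_heavy = {"Rank": 10, "Count": 5}
path_heavy = {"Path": 10}
mixed = {"Count": 1, "Path": 1, "Rank": 1, "Temporal": 1}

for name, workload in [("ranking-heavy", ranking_heavy),
                       ("path-heavy", path_heavy),
                       ("mixed", mixed)]:
    print(name, "->", pick_system(workload))
```

Under these assumptions each of the three systems wins for one of the workloads, which matches the slide's message: look at problem-specific query sets rather than for a single best system.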
  • 15. Conclusions
       •  BowlognaBench for analytic queries
       •  OWL ontology for higher education systems
       •  Next steps
          – Run a comparative evaluation of RDF systems
          – Set up a wiki-like space where groups can upload experimental results