Data quality is context-dependent. That is, the quality of data cannot
be assessed without contextual knowledge about the production or the use of
data. As expected, context-based data quality assessment requires a formal model
of context. Accordingly, we propose a model of context that addresses quality
concerns that are related to the production and use of data.
Here we follow and extend a context model for the assessment of the quality
of a database instance that was proposed in a previous work [1]. In that framework,
the context takes the form of a possibly virtual database or data integration
system into which a database instance under quality assessment is mapped, for
additional analysis and processing, enabling quality data extraction. In this work
we extend contexts with dimensions, and by doing so, we make possible a multidimensional
data quality assessment. Multidimensional contexts are represented
as ontologies written in Datalog±. We use this language to represent dimensional
constraints and dimensional rules, and also for query answering
based on dimensional navigation, which becomes an important auxiliary activity
in the assessment of data. We show ideas and mechanisms by means of examples.
Doctoral Consortium @ RuleML 2015: Multidimensional Ontologies for Contextual Quality Data Specification and Extraction
1. Multidimensional Ontologies for Contextual Quality Data
Specification and Extraction
Mostafa Milani
Supervisor: Prof. Leopoldo Bertossi
Carleton University
School of Computer Science
Ottawa, Canada
(Carleton University) Ontology-Based Multidimensional Contexts 1 / 15
5. Problem Statement Introduction
Multidimensional Contexts and Data Quality
The Measurements table contains the temperatures of patients at a hospital:
Measurements
Time         Patient    Value
Sep/5-12:10  Tom Waits  38.2
Sep/6-11:50  Tom Waits  37.1
Sep/7-12:15  Tom Waits  37.7
Sep/9-12:00  Tom Waits  37.0
Sep/6-11:05  Lou Reed   37.5
Sep/5-12:05  Lou Reed   38.0
A doctor expects the table to contain:
“The body temperatures of Tom Waits for September 5
taken around noon with a thermometer of brand B1”
But Measurements does not contain the information needed to make this
assessment
10. Problem Statement Introduction
Multidimensional Contexts and Data Quality
An external context can provide that information, making it possible
to assess the given data
Context is modeled as a relational database (Bertossi et al., BIRTE 2010)
The database under assessment is mapped into the contextual
database for further data quality analysis and cleaning
Context is commonly of a multidimensional nature
The dimensional aspects of context are not considered in
(Bertossi et al., BIRTE 2010)
17. Multidimensional Context Extended HM Data Model
Extending Context with Multidimensional Data
We can see the context as an ontology, containing:
An MD data model/instance:
PatientWard: A table containing the location of patients
Hospital dimension: Represents the hierarchy of locations
Information such as a hospital guideline:
“Temperature measurements for patients in a standard care unit
have to be taken with thermometers of brand B1”
Base data model: HM model (Hurtado and Mendelzon, 2005)
We extend the HM model (Maleki et al., AMW 2012)
23. Multidimensional Context Extended HM Data Model
Extending Context with Multidimensional Data
Informally, some of the new ingredients in MD contexts:
Dimensions, as in the HM model
Categorical relations: Generalize fact tables, with values that are not
necessarily numerical, linked to different levels of dimensions, possibly incomplete
Dimensional rules: Generate data where missing
Dimensional constraints: Constraints on (combinations of) categorical
relations, involving values from dimension categories
Dimensional rules and constraints can support and restrict
upward/downward navigation
27. Multidimensional Context Extended HM Data Model
Extending Context with Multidimensional Data
Example
Ward and Unit: categories of the Hospital dimension
UnitWard(unit, ward): a parent/child relation
PatientUnit
id  Unit       Day    Patient
1   Standard   Sep/5  Tom Waits
2   Standard   Sep/6  Tom Waits
3   Intensive  Sep/7  Tom Waits
4   Intensive  Sep/6  Lou Reed
5   Standard   Sep/5  Lou Reed
PatientWard
id  Ward  Day    Patient
1   W1    Sep/5  Tom Waits
2   W1    Sep/6  Tom Waits
3   W3    Sep/7  Tom Waits
4   W3    Sep/6  Lou Reed
5   W2    Sep/5  Lou Reed
[Figure: Hospital dimension with categories Ward, Unit, Institution, AllHospital (members W1, W2, W3, W4; Standard, Intensive, Terminal; H1, H2; allHospital), and Time dimension with categories Time, Day, Month, Year, AllTime]
PatientWard: a categorical relation whose categorical attributes Ward and Day
take values from dimension categories
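The dimension and the categorical relation of this example can be sketched in Python as plain data structures. This is only an illustrative encoding, not the representation used in the work: the names WARD_TO_UNIT, PATIENT_WARD, roll_up and has_valid_wards are hypothetical, and the link from W4 to Terminal is an assumption, since the two tables only determine the parents of W1, W2 and W3.

```python
# Categories of the Hospital dimension, from finest to coarsest.
HOSPITAL_CATEGORIES = ["Ward", "Unit", "Institution", "AllHospital"]

# Child -> parent links from Ward to Unit. The links for W1, W2 and W3 are
# read off the PatientWard and PatientUnit tables; W4 -> Terminal is an
# assumption made only for this illustration.
WARD_TO_UNIT = {
    "W1": "Standard",
    "W2": "Standard",
    "W3": "Intensive",
    "W4": "Terminal",
}

# Categorical relation PatientWard(Ward, Day; Patient): Ward and Day are
# categorical attributes, Patient is an ordinary attribute.
PATIENT_WARD = [
    ("W1", "Sep/5", "Tom Waits"),
    ("W1", "Sep/6", "Tom Waits"),
    ("W3", "Sep/7", "Tom Waits"),
    ("W3", "Sep/6", "Lou Reed"),
    ("W2", "Sep/5", "Lou Reed"),
]


def roll_up(ward):
    """Return the Unit-level parent of a Ward-level member."""
    return WARD_TO_UNIT[ward]


def has_valid_wards(rows):
    """Check that every Ward value is a member of the Ward category."""
    return all(ward in WARD_TO_UNIT for ward, _day, _patient in rows)
```

Keeping the parent/child links separate from the categorical relation mirrors the split between dimensions and categorical relations described on the slide.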
34. Multidimensional Context Extended HM Data Model
Dimensional Constraints
Example
Categorical relations are subject to dimensional constraints:
A referential constraint restricting units in PatientUnit
to elements in the Unit category, as a negative constraint:
⊥ ← PatientUnit(u, d; p), ¬Unit(u)
“All thermometers used in a unit are of the same type”, as an EGD:
t = t′ ← Thermometer(w, t; n), Thermometer(w′, t′; n′),
         UnitWard(u, w), UnitWard(u, w′)
“No patient in the intensive care unit in August/2005”:
⊥ ← PatientWard(w, d; p), UnitWard(Intensive, w),
    MonthDay(August/2005, d)
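The three constraints can be read as checks that return the tuples violating them. The following Python sketch makes that reading concrete under stated assumptions: the Thermometer and MonthDay tuples are hypothetical sample data (the slides give none), W4 belonging to Terminal is assumed, and the function names are invented for this illustration.

```python
from itertools import product

UNITS = {"Standard", "Intensive", "Terminal"}
# UnitWard(unit, ward) pairs; (Terminal, W4) is an assumption.
UNIT_WARD = {("Standard", "W1"), ("Standard", "W2"),
             ("Intensive", "W3"), ("Terminal", "W4")}
PATIENT_UNIT = [("Standard", "Sep/5", "Tom Waits"), ("Standard", "Sep/6", "Tom Waits"),
                ("Intensive", "Sep/7", "Tom Waits"), ("Intensive", "Sep/6", "Lou Reed"),
                ("Standard", "Sep/5", "Lou Reed")]
PATIENT_WARD = [("W1", "Sep/5", "Tom Waits"), ("W1", "Sep/6", "Tom Waits"),
                ("W3", "Sep/7", "Tom Waits"), ("W3", "Sep/6", "Lou Reed"),
                ("W2", "Sep/5", "Lou Reed")]
# Hypothetical Thermometer(ward, type; n) tuples, not taken from the slides.
THERMOMETER = [("W1", "Oral", "n1"), ("W2", "Oral", "n2"), ("W3", "Tympanic", "n3")]
# Hypothetical MonthDay(month, day) pairs: no sample day falls in August/2005.
MONTH_DAY = {("September", d) for d in ("Sep/5", "Sep/6", "Sep/7", "Sep/9")}


def referential_violations(patient_unit, units):
    """Negative constraint: PatientUnit(u, d; p) with u outside the Unit category."""
    return [(u, d, p) for (u, d, p) in patient_unit if u not in units]


def egd_violations(thermometer, unit_ward):
    """EGD: thermometers in wards of the same unit must have the same type."""
    bad = []
    for (w1, t1, _), (w2, t2, _) in product(thermometer, repeat=2):
        same_unit = any((u, w1) in unit_ward and (u, w2) in unit_ward
                        for u, _ in unit_ward)
        if same_unit and t1 != t2:
            bad.append((w1, t1, w2, t2))
    return bad


def intensive_violations(patient_ward, unit_ward, month_day, month):
    """Negative constraint: a patient in an Intensive ward on a day of `month`."""
    return [(w, d, p) for (w, d, p) in patient_ward
            if ("Intensive", w) in unit_ward and (month, d) in month_day]
```

An empty list means the constraint holds on the given instance; adding, say, a Tympanic thermometer to W2 would make egd_violations report a clash with the Oral one in W1.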
40. Multidimensional Context Extended HM Data Model
Dimensional Rules
Example
Data in PatientWard generate data about patients for the
higher-level categorical relation PatientUnit:
PatientUnit(u, d; p) ← PatientWard(w, d; p),
                       UnitWard(u, w)
Since the relation schemas “match”, no ∃-variable is needed in the head
The rule is used to navigate from PatientWard.Ward upwards to
PatientUnit.Unit via UnitWard
Once at the level of Unit, it is possible to take advantage of a
guideline, in the form of a rule, stating that:
“Temperatures of patients in a standard care unit are taken with oral
thermometers”
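Operationally, the dimensional rule is a join of PatientWard with UnitWard that replaces each ward by its parent unit. A minimal Python sketch of this upward navigation follows; the function name is hypothetical, and the UnitWard pairs are the ones that can be read off the two sample tables.

```python
# UnitWard(unit, ward) pairs read off the PatientWard and PatientUnit tables.
UNIT_WARD = {("Standard", "W1"), ("Standard", "W2"), ("Intensive", "W3")}

# PatientWard(Ward, Day; Patient)
PATIENT_WARD = [
    ("W1", "Sep/5", "Tom Waits"),
    ("W1", "Sep/6", "Tom Waits"),
    ("W3", "Sep/7", "Tom Waits"),
    ("W3", "Sep/6", "Lou Reed"),
    ("W2", "Sep/5", "Lou Reed"),
]


def patient_unit_from_ward(patient_ward, unit_ward):
    """PatientUnit(u, d; p) <- PatientWard(w, d; p), UnitWard(u, w)."""
    return {
        (u, d, p)
        for (w, d, p) in patient_ward
        for (u, w2) in unit_ward
        if w == w2
    }
```

On the sample data the derived tuples coincide with the PatientUnit table shown earlier (ignoring the id column), which is what upward navigation from Ward to Unit is expected to produce.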
45. Multidimensional Context Ontological Representation of the Extended MD Model
Datalog± as Representation Language
We use Datalog± as our representation language (Calì et al., 2009)
An extension of Datalog for ontology building with efficient
access to underlying data sources
A family of languages with different syntactic restrictions on rules to
guarantee decidability of query answering
The chase (which forward-propagates data through rules) may not
terminate
In our MD contexts, the general forms of dimensional rules and
constraints are captured by Datalog± TGDs, EGDs, and negative
constraints
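Why the chase may not terminate can be seen on a textbook-style TGD with an existential variable, R(x, y) → ∃z R(y, z). This rule is not part of the MD ontology in these slides; it is used here only as an assumption-laden illustration. The Python sketch below runs a round-bounded chase that invents a fresh null whenever the rule is not yet satisfied; the function name and the null naming scheme are hypothetical.

```python
from itertools import count


def bounded_chase(facts, max_rounds):
    """Chase the TGD  R(x, y) -> exists z. R(y, z)  for at most max_rounds rounds.

    Returns the facts reached and a flag telling whether a fixpoint was hit.
    """
    fresh = count(1)  # generator of fresh labelled nulls
    facts = set(facts)
    for _ in range(max_rounds):
        new = set()
        for (_x, y) in facts:
            # The rule is already satisfied for y if some R(y, _) exists.
            if not any(a == y for (a, _b) in facts | new):
                new.add((y, f"_z{next(fresh)}"))
        if not new:
            return facts, True  # fixpoint: the chase terminated
        facts |= new
    return facts, False  # round budget exhausted, chase still growing
```

Starting from the single fact R(a, b), every round adds one more fact with a new null, so no fixpoint is ever reached; starting from R(a, a), the rule is satisfied immediately. The syntactic restrictions of the Datalog± family are what make query answering decidable despite such infinite chases.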
50. Multidimensional Context Ontological Representation of the Extended MD Model
Properties of MD Ontologies and Query Answering
Our Datalog± MD ontologies become weakly-sticky Datalog±
programs (Calì et al., 2012)
It is crucial that repeated variables in TGDs are for categorical
attributes (which can take only a finite number of values)
Weak-stickiness guarantees tractability of conjunctive query
answering (QA): only an initial portion of the chase has to be
inspected
A non-deterministic algorithm, WeaklySticky-QAns, exists for weakly-sticky
Datalog± (Calì et al., 2012)
We proposed a deterministic version of the algorithm for
weakly-sticky programs and studied optimization techniques (Milani and
Bertossi, AMW 2015)
58. Multidimensional Context MD Context for Quality Data Assessment
MD Contexts and Quality Query Answering: The Gist
The MD ontology M becomes part of the context for data quality
assessment
The original instance D of schema S is to be assessed or cleaned
through the context, by mapping D into the contextual schema/instance C
Example
A dimensional rule in M:
PatientUnit(u, t; p) ← PatientWard(w, d; p), DayTime(d, t),
UnitWard(u, w)
A quality predicate:
TakenWithTherm(t, p, b) ← PatientUnit(u, t; p), u = Standard, b = B1
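The two rules of this example can be chained: the dimensional rule goes up from Ward to Unit and down from Day to Time, and the quality predicate then selects the Standard unit. A minimal Python sketch follows, assuming that the DayTime(d, t) pairs are exactly the timestamps appearing in the Measurements table grouped by day; that choice, like the function names, is made only for this illustration.

```python
UNIT_WARD = {("Standard", "W1"), ("Standard", "W2"), ("Intensive", "W3")}
PATIENT_WARD = [
    ("W1", "Sep/5", "Tom Waits"),
    ("W1", "Sep/6", "Tom Waits"),
    ("W3", "Sep/7", "Tom Waits"),
    ("W3", "Sep/6", "Lou Reed"),
    ("W2", "Sep/5", "Lou Reed"),
]
# DayTime(day, time): assumed here to hold the Measurements timestamps.
DAY_TIME = {
    ("Sep/5", "Sep/5-12:10"), ("Sep/5", "Sep/5-12:05"),
    ("Sep/6", "Sep/6-11:50"), ("Sep/6", "Sep/6-11:05"),
    ("Sep/7", "Sep/7-12:15"), ("Sep/9", "Sep/9-12:00"),
}


def patient_unit_at_time(patient_ward, day_time, unit_ward):
    """PatientUnit(u, t; p) <- PatientWard(w, d; p), DayTime(d, t), UnitWard(u, w)."""
    return {
        (u, t, p)
        for (w, d, p) in patient_ward
        for (d2, t) in day_time if d2 == d
        for (u, w2) in unit_ward if w2 == w
    }


def taken_with_therm(patient_unit):
    """TakenWithTherm(t, p, b) <- PatientUnit(u, t; p), u = Standard, b = B1."""
    return {(t, p, "B1") for (u, t, p) in patient_unit if u == "Standard"}
```

For instance, Tom Waits was in ward W1 on Sep/5, W1 belongs to the Standard unit, and so his Sep/5-12:10 time point is inferred to be covered by a brand B1 thermometer.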
63. Multidimensional Context MD Context for Quality Data Assessment
MD Contexts and Quality Query Answering: The Gist
Example
Quality version Measurements^q:
Measurements^q(t, p, v) ← Measurements′(t, p, v),
    TakenWithTherm(t, p, b), b = B1, y = certified
A doctor asks for the body temperatures of Tom Waits for September 5
taken around noon:
Q(t, v): Measurements(t, Tom Waits, v) ∧ Sep/5-11:45 ≤ t ≤ Sep/5-12:15
He expects that the measurements are taken with a thermometer of
brand B1
69. Multidimensional Context MD Context for Quality Data Assessment
MD Contexts and Quality Query Answering: The Gist
Example
Replacing predicates of S in Q with their quality versions in S^q:
Q^q(t, v): Measurements^q(t, Tom Waits, v) ∧ Sep/5-11:45 ≤ t ≤ Sep/5-12:15
Applying the definition of quality versions:
Q^C(t, v): Measurements′(t, p, v) ∧ TakenWithTherm(t, p, B1) ∧
           p = Tom Waits ∧ Sep/5-11:45 ≤ t ≤ Sep/5-12:15
Unfolding the definition of quality predicates in P:
Q^M(t, v): Measurements′(t, p, v) ∧ PatientUnit(u, t; p) ∧ u = Standard ∧
           p = Tom Waits ∧ Sep/5-11:45 ≤ t ≤ Sep/5-12:15
74. Multidimensional Context MD Context for Quality Data Assessment
MD Contexts and Quality Query Answering: The Gist
Example
Measurements has the same extension of Measurements
PatientUnit is computed by QA on M
The first second and last
measurements have the
expected quality
Measurements
Time Patient Value
Sep/5-12:10 Tom Waits 38.2
Sep/6-11:50 Tom Waits 37.1
Sep/7-12:15 Tom Waits 37.7
Sep/9-12:00 Tom Waits 37.0
Sep/6-11:05 Lou Reed 37.5
Sep/5-12:05 Lou Reed 38.0
(Carleton University) Ontology-Based Multidimensional Contexts 14 / 15
75. Multidimensional Context MD Context for Quality Data Assessment
MD Contexts and Quality Query Answering: The Gist
Example
Measurements has the same extension of Measurements
PatientUnit is computed by QA on M
The first second and last
measurements have the
expected quality
The first measurement is a
clean answer to Q:
t = Sep/5-12:10 and v=38.2
Measurements
Time Patient Value
Sep/5-12:10 Tom Waits 38.2
Sep/6-11:50 Tom Waits 37.1
Sep/7-12:15 Tom Waits 37.7
Sep/9-12:00 Tom Waits 37.0
Sep/6-11:05 Lou Reed 37.5
Sep/5-12:05 Lou Reed 38.0
(Carleton University) Ontology-Based Multidimensional Contexts 14 / 15
76. Multidimensional Context MD Context for Quality Data Assessment
MD Contexts and Quality Query Answering: The Gist
Example
Measurements′ has the same extension as Measurements
PatientUnit is computed by QA on M
The first, second and last measurements have the expected quality
The first measurement is a clean answer to Q: t = Sep/5-12:10 and v = 38.2
Measurements
Time         Patient    Value
Sep/5-12:10  Tom Waits  38.2
Sep/6-11:50  Tom Waits  37.1
Sep/7-12:15  Tom Waits  37.7
Sep/9-12:00  Tom Waits  37.0
Sep/6-11:05  Lou Reed   37.5
Sep/5-12:05  Lou Reed   38.0
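As a rough illustration of how the clean answer is obtained, the unfolded query can be evaluated directly over the Measurements table. The sketch below makes several assumptions: times are encoded as (day of September, hour, minute) tuples, the function clean_answers is a hypothetical helper, and the PatientUnit facts are written by hand, whereas in the framework they are computed by query answering on M. Only facts for the first two measurements are listed, since the slide states that those have the expected quality.

```python
# Measurements table from the slide.
# Times are (day of September, hour, minute) tuples (an assumed encoding).
MEASUREMENTS = [
    ((5, 12, 10), "Tom Waits", 38.2),
    ((6, 11, 50), "Tom Waits", 37.1),
    ((7, 12, 15), "Tom Waits", 37.7),
    ((9, 12, 0), "Tom Waits", 37.0),
    ((6, 11, 5), "Lou Reed", 37.5),
    ((5, 12, 5), "Lou Reed", 38.0),
]

# HYPOTHETICAL PatientUnit(u, t; p) facts, written by hand for illustration.
# In the framework this relation results from query answering on M.
PATIENT_UNIT = {
    ("Standard", (5, 12, 10), "Tom Waits"),
    ("Standard", (6, 11, 50), "Tom Waits"),
}


def clean_answers(patient, start, end):
    """Join Measurements with PatientUnit and keep Standard-unit readings
    of `patient` taken between `start` and `end` (inclusive)."""
    return [
        (t, v)
        for (t, p, v) in MEASUREMENTS
        if p == patient
        and start <= t <= end
        and ("Standard", t, p) in PATIENT_UNIT
    ]


# Query window from the slide: Sep/5-11:45 to Sep/5-12:15.
answers = clean_answers("Tom Waits", (5, 11, 45), (5, 12, 15))
print(answers)
```

With these assumed facts, the only tuple returned is the Sep/5-12:10 reading of 38.2, which agrees with the clean answer stated on the slide.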
83. Conclusions
Conclusions
Multidimensional contexts are represented as Datalog± ontologies
They allow us to specify data quality conditions and to retrieve quality data
Development and implementation of the query answering algorithms are ongoing work
Several extensions:
Uncertain downward-navigation in dimensional rules
Checking dimensional constraints not only on the result of the chase but also during data generation
Relaxing the assumption of complete categorical data, and studying its effect on dimensions