Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions
A Method for Filtering
Large Conceptual Schemas
Antonio Villegas and Antoni Olivé
{avillegas, olive}@essi.upc.edu
Services and Information Systems Engineering Department
Universitat Politècnica de Catalunya
A. Villegas and A. Olivé, ER 2010, November 3, 2010
Outline
1 Introduction
2 Filtering Method
3 Filtered Conceptual Schema
4 Experimentation
5 Conclusions
Conceptual Schemas
osCommerce
84 Entity types, 209 Attributes, 183 Relationship types,
28 IsA Relationships, 204 general constraints and derivation rules
Health Level 7
2,695 Entity types, 160 Attributes,
228 Relationship types, 2,934 IsA Relationships
ResearchCyc
26,725 Entity types, 1,060 Attributes,
5,514 Relationship types, 43,323 IsA Relationships
Usability Problem
The sheer size of conceptual schemas makes it difficult for a
user to find the knowledge that is of interest to her.
This task needs computer support.
General Overview
Filtering Method Overview
The main idea is to extract a reduced and self-contained view
from the large schema, that is, a filtered conceptual schema
with the knowledge of interest to the user.
Concrete Overview
Filtering Method Overview
- A user focuses on one or more entity types of interest
- The method filters the large schema to obtain a subset of
elements relevant to the user, taking into account:
  - the importance of each entity type in the whole schema,
  - and its closeness to the entity types in the user focus.
Representing the User Request
The information need of a user looking for (a subset of) the
knowledge represented in a large schema is expressed as a
knowledge request with three parts:
- Focus Set: what the user is interested in within the schema
- Rejection Set: what the user is not interested in within the schema
- Filter Size: how much knowledge the user wants to obtain
from the schema at a given moment
Example of Knowledge Request
The Knowledge Request is based on the concept of Entity Type.
The user wants to know information about taxes in the
osCommerce schema:
Focus Set FS = {TaxRate, TaxClass}
Rejection Set RS = {Language}
Filter Size K = 10
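The request above can be written down as a small data structure. This is only an illustrative sketch: the class name `KnowledgeRequest`, its field names, and the validation rules (non-empty focus set, disjoint sets, K at least |FS|) are assumptions made here, not part of the slides.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeRequest:
    """Hypothetical container for a knowledge request (FS, RS, K)."""

    focus_set: frozenset[str]
    rejection_set: frozenset[str]
    filter_size: int

    def __post_init__(self) -> None:
        # Assumed sanity checks; the slides do not state them explicitly.
        if not self.focus_set:
            raise ValueError("focus set must not be empty")
        if self.focus_set & self.rejection_set:
            raise ValueError("focus set and rejection set must be disjoint")
        if self.filter_size < len(self.focus_set):
            raise ValueError("filter size K must be at least |FS|")


# The osCommerce tax example from the slide.
request = KnowledgeRequest(
    focus_set=frozenset({"TaxRate", "TaxClass"}),
    rejection_set=frozenset({"Language"}),
    filter_size=10,
)
```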
Filtering Measures
Closeness of Entity Types (Ω(e, FS))
Our method uses the closeness Ω(e, FS) of each candidate entity type e
in the schema with respect to the entity types of the focus set FS.
We say that e is a candidate entity type if e ∉ FS ∪ RS.
Intuitively, the closeness of a candidate e should be directly related to the
inverse of the distance from e to the focus set FS:
$$\Omega(e, FS) = \frac{|FS|}{\sum_{e' \in FS} d(e, e')}$$
where d(e, e′) is the distance between e and e′ in the schema.
Closeness of Entity Types (Ω(e, FS)): Example Ω(C, FS = {A, B})
$$\Omega(e, FS) = \frac{|FS|}{\sum_{e' \in FS} d(e, e')}$$
$$\Omega(C, FS) = \frac{|\{A, B\}|}{d(C, A) + d(C, B)} = \frac{2}{3 + 2} = 0.4$$
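The closeness computation can be sketched in a few lines. The sketch assumes that d(e, e′) is the length of a shortest path between two entity types in an undirected graph of the schema; the graph below (with an intermediate node `X`) is hypothetical and was chosen only so that d(C, A) = 3 and d(C, B) = 2, as in the example.

```python
from __future__ import annotations

from collections import deque


def distance(graph: dict[str, set[str]], source: str, target: str) -> int:
    """Shortest-path length between two entity types (breadth-first search)."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, set()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    raise ValueError(f"{target} is not reachable from {source}")


def closeness(graph: dict[str, set[str]], e: str, focus_set: set[str]) -> float:
    """Omega(e, FS) = |FS| / sum of d(e, e') over e' in FS.

    e is expected to be a candidate (not in FS), so every distance is >= 1.
    """
    total = sum(distance(graph, e, f) for f in focus_set)
    return len(focus_set) / total


# Hypothetical undirected schema graph: A - B - X - C.
graph = {
    "A": {"B"},
    "B": {"A", "X"},
    "X": {"B", "C"},
    "C": {"X"},
}
omega_c = closeness(graph, "C", {"A", "B"})  # 2 / (3 + 2)
```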
Importance of Entity Types (Ψ)
The importance Ψ(e) of an entity type e ∈ E of a conceptual schema is a real
number that measures the relevance of e in the schema. There are several
methods:
- Occurrence counting: Ψ(e) depends on the number of characteristics the
schema has about e
- Link analysis: Ψ(e) depends on the importance of those entity types
connected to e
- Instance-dependent: Ψ(e) depends on the instances of e
Our filtering method can be used in connection with any of the existing
importance-computing methods.
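As a rough illustration of the occurrence-counting family, the sketch below counts the attributes of each entity type plus the relationship types it participates in, and scales the result so that the highest count maps to 1. This is a simplified stand-in, not the authors' measure; the function name, the scaling, and the toy schema are all assumptions.

```python
from __future__ import annotations


def occurrence_importance(
    attributes: dict[str, list[str]],
    relationships: list[tuple[str, ...]],
) -> dict[str, float]:
    """Count attributes and relationship participations per entity type,
    then divide by the largest count (assumed normalization)."""
    counts = {entity: len(attrs) for entity, attrs in attributes.items()}
    for participants in relationships:
        for entity in participants:
            counts[entity] = counts.get(entity, 0) + 1
    top = max(counts.values(), default=0)
    if top == 0:
        return {entity: 0.0 for entity in counts}
    return {entity: count / top for entity, count in counts.items()}


# Hypothetical toy schema (attribute names are invented for illustration).
attributes = {
    "Product": ["name", "price", "quantity"],
    "Order": ["date"],
    "Language": ["code"],
}
relationships = [("Order", "Product"), ("Product", "Language")]
psi = occurrence_importance(attributes, relationships)
```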
Importance values Ψ(e) for the three schemas:

osCommerce   Ψ(e)  | HL7                   Ψ(e)  | ResearchCyc        Ψ(e)
Product      1     | Act                   1     | Individual         1
Order        0.63  | Role                  0.68  | LexicalWord        0.13
Language     0.53  | ActRelationship       0.53  | PartiallyTangible  0.12
Customer     0.39  | Participation         0.49  | Thing              0.07
Store        0.35  | Entity                0.46  | SpatialThing       0.06
OrderLine    0.34  | Observation           0.35  | Agent              0.05
Zone         0.34  | InfrastructureRoot    0.24  | Organization       0.05
TaxZone      0.29  | Organization          0.23  | SomethingExisting  0.05
Special      0.28  | RoleLink              0.21  | TemporalThing      0.04
Country      0.26  | FinancialTransaction  0.2   | HumanActivity      0.04
Interest of Entity Types (Φ(e, FS))
Φ(e, FS) = α × Ψ(e) + (1 − α) × Ω(e, FS)
where α ∈ [0, 1] weights importance against closeness.
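The interest formula is a plain weighted sum, sketched below under the assumption that α lies between 0 and 1. The default α = 0.5 and the input values are illustrative choices, not values taken from the slides.

```python
def interest(importance: float, closeness: float, alpha: float = 0.5) -> float:
    """Phi = alpha * Psi + (1 - alpha) * Omega, with alpha assumed in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * importance + (1.0 - alpha) * closeness


# Hypothetical candidate with Psi = 0.6 and Omega = 0.4.
balanced = interest(0.6, 0.4, alpha=0.5)
only_importance = interest(0.6, 0.4, alpha=1.0)
only_closeness = interest(0.6, 0.4, alpha=0.0)
```

Setting α to 1 ranks candidates purely by importance, and setting it to 0 ranks them purely by closeness to the focus set.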
[Figure: points P1 to P8 plotted on the axes Importance Ψ and Closeness Ω (both from 0 to 1), with a line r and the distances dist(P2, r) and dist(P6, r) marked.]
Filtered Conceptual Schema
Filtered Conceptual Schema (FCS)
Main Task
Construct a filtered conceptual schema, FCS, from the K most
interesting entity types and the knowledge of the original schema
CS.
FCS Components
- EF is a set of entity types filtered from E of CS
- RF is a set of relationship types filtered from R of CS
- IF is a set of IsA relationships filtered from I of CS
- CF is a set of integrity constraints filtered from C of CS
- DF is a set of derivation rules filtered from D of CS
Filtered Entity Types (EF)
FCS = ⟨EF, RF, IF, CF, DF⟩
FS: focus set
Etop: most interesting entity types
Eaux: auxiliary entity types
Filter Size: K = |FS| + |Etop|
EF = FS ∪ Etop ∪ Eaux, so |EF| = K + |Eaux|
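Since K = |FS| + |Etop|, the set Etop holds the K − |FS| candidates with the highest interest. The sketch below shows that selection; the function name, the alphabetical tie-breaking, and the interest values are assumptions made for illustration.

```python
from __future__ import annotations


def select_top(
    interest_by_type: dict[str, float],
    focus_set: set[str],
    rejection_set: set[str],
    k: int,
) -> list[str]:
    """Return Etop: the K - |FS| candidates with the highest interest."""
    excluded = focus_set | rejection_set
    candidates = [e for e in interest_by_type if e not in excluded]
    # Highest interest first; ties broken by name (an assumed convention).
    ranked = sorted(candidates, key=lambda e: (-interest_by_type[e], e))
    return ranked[: max(k - len(focus_set), 0)]


# Hypothetical interest values for a few osCommerce entity types.
phi = {
    "Product": 0.8,
    "Language": 0.9,
    "TaxZone": 0.7,
    "Order": 0.5,
    "Customer": 0.3,
}
e_top = select_top(phi, {"TaxRate", "TaxClass"}, {"Language"}, k=4)
```

Here `Language` is skipped despite its high score because it belongs to the rejection set, and K = 4 leaves room for two entity types besides the two in the focus set.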
Filtered Relationship Types (RF)
The relationship types in RF are those r of the original schema
whose participant entity types
- belong to EF, or
- are ascendants of entity types of EF
(in which case a projection of r is required)
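The membership test for RF can be sketched as follows. The sketch only decides which relationship types qualify and which of their participants would need projecting; how the projection target is chosen is deliberately left out, since the slides do not spell it out. The data layout (`isa` mapping a child to its direct parents) and all names in the example are hypothetical.

```python
from __future__ import annotations


def ancestors(isa: dict[str, set[str]], entity: str) -> set[str]:
    """All direct and indirect ascendants of an entity type."""
    found: set[str] = set()
    stack = list(isa.get(entity, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(isa.get(parent, ()))
    return found


def filter_relationships(
    relationships: dict[str, tuple[str, ...]],
    isa: dict[str, set[str]],
    ef: set[str],
) -> dict[str, list[str]]:
    """Map each kept relationship type to the participants needing projection."""
    ascendants_of_ef = set().union(*(ancestors(isa, e) for e in ef))
    kept: dict[str, list[str]] = {}
    for name, participants in relationships.items():
        if all(p in ef or p in ascendants_of_ef for p in participants):
            kept[name] = [p for p in participants if p not in ef]
    return kept


# Hypothetical example: Customer IsA Person, and EF = {Customer, Order}.
isa = {"Customer": {"Person"}}
relationships = {
    "places": ("Customer", "Order"),
    "contacts": ("Person", "Order"),
    "livesAt": ("Person", "Address"),
}
rf = filter_relationships(relationships, isa, {"Customer", "Order"})
```

In this example `contacts` is kept with `Person` flagged for projection, while `livesAt` is dropped because `Address` is neither in EF nor an ascendant of an entity type in EF.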
[Figure annotation on the projection example: "included in Eaux"]
Filtered IsA Relationships (IF)
If e and e′ are entity types in EF and there is a direct or indirect
IsA relationship between them in the original schema, then such an
IsA relationship must also exist in IF of FCS
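A minimal sketch of this rule: for every pair of entity types in EF connected by a direct or indirect IsA path in the original schema, an IsA edge is emitted. Dropping edges that are already implied through another entity type of EF is an extra choice made in this sketch to avoid redundant edges; the slides do not require it. The `isa` layout (child mapped to direct parents) and the entity names are hypothetical.

```python
from __future__ import annotations


def ancestors(isa: dict[str, set[str]], entity: str) -> set[str]:
    """All direct and indirect ascendants of an entity type."""
    found: set[str] = set()
    stack = list(isa.get(entity, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(isa.get(parent, ()))
    return found


def filter_isa(isa: dict[str, set[str]], ef: set[str]) -> set[tuple[str, str]]:
    """Return (child, parent) IsA pairs between entity types of EF."""
    within = {e: ancestors(isa, e) & ef for e in ef}
    pairs: set[tuple[str, str]] = set()
    for child, ups in within.items():
        for parent in ups:
            # Skip a pair when it is implied through another member of EF.
            implied = any(parent in within[mid] for mid in ups if mid != parent)
            if not implied:
                pairs.add((child, parent))
    return pairs


# Hypothetical chain A IsA B IsA C IsA D, with B filtered out of EF.
isa = {"A": {"B"}, "B": {"C"}, "C": {"D"}}
i_f = filter_isa(isa, {"A", "C", "D"})
```

The indirect relationship between `A` and `C` survives even though the intermediate `B` is not part of EF.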
Filtered Integrity Constraints (CF) and Derivation Rules (DF)
The integrity constraints and derivation rules included in CF and
DF of FCS are those whose expressions only involve entity types
from EF.
For example, an integrity constraint ic1 and a derivation rule dr1
whose expressions involve the entity types A and B are included in
CF and DF of FCS only if A, B ∈ EF.
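This inclusion test reduces to a subset check once the entity types referenced by each expression are known. The sketch assumes those sets have already been extracted from the constraint and rule expressions; the rule `ic2` and the function name are hypothetical additions for illustration.

```python
from __future__ import annotations


def filter_rules(rules: dict[str, set[str]], ef: set[str]) -> list[str]:
    """Keep the constraints or rules whose referenced entity types all lie in EF."""
    return sorted(name for name, involved in rules.items() if involved <= ef)


# ic1 and dr1 involve A and B, as in the slide; ic2 is a made-up extra rule.
rules = {
    "ic1": {"A", "B"},
    "dr1": {"A", "B"},
    "ic2": {"A", "C"},
}
kept_with_ab = filter_rules(rules, {"A", "B"})
kept_with_a = filter_rules(rules, {"A"})
```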
Examples
osCommerce
FS = {TaxRate, TaxClass}
and K = 10
84 Entity types, 209 Attributes, 183 Relationship types,
28 IsA Relationships, 204 general constraints and derivation rules
Health Level 7
FS = {Patient, ActAppointment}
and K = 10
2,695 Entity types, 160 Attributes,
228 Relationship types, 2,934 IsA Relationships
ResearchCyc
FS = {Cancer}
and K = 18
26,725 Entity types, 1,060 Attributes,
5,514 Relationship types, 43,323 IsA Relationships
Conclusions
- The problem of filtering a fragment of the knowledge
contained in a large conceptual schema appears in many
information systems development activities.
- People need to work, for some purpose, with a fragment of
the knowledge contained in those large schemas.
- We have proposed a filtering method based on the
importance and closeness measures. A user indicates a
focus set of entity types, and the method determines a
subset of the elements of the large schema that is likely to
be of interest to the user.
- We have experimented with three large schemas. In all three
cases, our prototype obtains the filtered schema in less than
a second.
Further Work
- Take into account the importance of relationship types.
- Fine-grained filtering of integrity constraints and
derivation rules.
- Conduct experiments to precisely determine the usefulness of
our method to real users.
 
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...ijcseit
 
A survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniquesA survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniquesIAEME Publication
 
Lee Sung Eob Mastersthesisproposal03
Lee Sung Eob Mastersthesisproposal03Lee Sung Eob Mastersthesisproposal03
Lee Sung Eob Mastersthesisproposal03Sung Eob Lee
 
Query Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering ApproachQuery Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering ApproachIRJET Journal
 
Using data mining methods knowledge discovery for text mining
Using data mining methods knowledge discovery for text miningUsing data mining methods knowledge discovery for text mining
Using data mining methods knowledge discovery for text miningeSAT Publishing House
 

Similar to A method for filtering large conceptual schemas (20)

ABSTAT: Ontology-driven Linked Data Summaries with Pattern Minimalization
ABSTAT: Ontology-driven Linked Data Summaries with Pattern MinimalizationABSTAT: Ontology-driven Linked Data Summaries with Pattern Minimalization
ABSTAT: Ontology-driven Linked Data Summaries with Pattern Minimalization
 
IRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect InformationIRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect Information
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly Community
 
UCIAD overview
UCIAD overviewUCIAD overview
UCIAD overview
 
Advantages And Disadvantages Of Chronic Kidney Disease
Advantages And Disadvantages Of Chronic Kidney DiseaseAdvantages And Disadvantages Of Chronic Kidney Disease
Advantages And Disadvantages Of Chronic Kidney Disease
 
Lec1-Into
Lec1-IntoLec1-Into
Lec1-Into
 
Empirical Model of Supervised Learning Approach for Opinion Mining
Empirical Model of Supervised Learning Approach for Opinion MiningEmpirical Model of Supervised Learning Approach for Opinion Mining
Empirical Model of Supervised Learning Approach for Opinion Mining
 
Designing Guidelines for Visual Analytics System to Augment Organizational An...
Designing Guidelines for Visual Analytics System to Augment Organizational An...Designing Guidelines for Visual Analytics System to Augment Organizational An...
Designing Guidelines for Visual Analytics System to Augment Organizational An...
 
An Ontology Model for Knowledge Representation over User Profiles
An Ontology Model for Knowledge Representation over User ProfilesAn Ontology Model for Knowledge Representation over User Profiles
An Ontology Model for Knowledge Representation over User Profiles
 
Abstract writingprocess
Abstract writingprocessAbstract writingprocess
Abstract writingprocess
 
Research methods
Research methodsResearch methods
Research methods
 
'Connecting poeple to resources' by Nicole Harris at UKSG 2007
'Connecting poeple to resources' by Nicole Harris at UKSG 2007'Connecting poeple to resources' by Nicole Harris at UKSG 2007
'Connecting poeple to resources' by Nicole Harris at UKSG 2007
 
Inductive Classification through Evidence-based Models and Their Ensemble
Inductive Classification through Evidence-based Models and Their EnsembleInductive Classification through Evidence-based Models and Their Ensemble
Inductive Classification through Evidence-based Models and Their Ensemble
 
Using data mining methods knowledge discovery for text mining
Using data mining methods knowledge discovery for text miningUsing data mining methods knowledge discovery for text mining
Using data mining methods knowledge discovery for text mining
 
Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...
 
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...
 
A survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniquesA survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniques
 
Lee Sung Eob Mastersthesisproposal03
Lee Sung Eob Mastersthesisproposal03Lee Sung Eob Mastersthesisproposal03
Lee Sung Eob Mastersthesisproposal03
 
Query Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering ApproachQuery Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering Approach
 
Using data mining methods knowledge discovery for text mining
Using data mining methods knowledge discovery for text miningUsing data mining methods knowledge discovery for text mining
Using data mining methods knowledge discovery for text mining
 

A method for filtering large conceptual schemas

  • 1. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions A Method for Filtering Large Conceptual Schemas Antonio Villegas and Antoni Olivé {avillegas, olive}@essi.upc.edu Services and Information Systems Engineering Department Universitat Politècnica de Catalunya A. Villegas and A. Olivé ER 2010 November 3, 2010 1 / 28
  • 2. Outline: 1 Introduction, 2 Filtering Method, 3 Filtered Conceptual Schema, 4 Experimentation, 5 Conclusions
  • 3. Conceptual Schemas
  • 10. Conceptual Schemas: osCommerce. 84 entity types, 209 attributes, 183 relationship types, 28 IsA relationships, 204 general constraints and derivation rules
  • 11. Conceptual Schemas: Health Level 7. 2,695 entity types, 160 attributes, 228 relationship types, 2,934 IsA relationships
  • 12. Conceptual Schemas: ResearchCyc. 26,725 entity types, 1,060 attributes, 5,514 relationship types, 43,323 IsA relationships
  • 13. Conceptual Schemas: Usability Problem. The sheer size of large conceptual schemas makes it difficult for a user to get the knowledge of interest to her. This task needs computer support.
  • 17. General Overview: Filtering Method Overview. The main idea is to extract a reduced and self-contained view from the large schema, that is, a filtered conceptual schema with the knowledge of interest to the user.
  • 18. Concrete Overview: Filtering Method Overview. A user focuses on one or more entity types of interest. The method filters the large schema to obtain a subset of elements relevant to the user, taking into account the importance of each entity type in the whole schema and its closeness to the entity types in the user focus.
  • 19. Representing the User Request: Knowledge Request. The information need of a user looking for (a subset of) the knowledge represented in a large schema includes: Focus Set: what the user is interested in about the schema; Rejection Set: what the user is not interested in about the schema; Filter Size: how much knowledge the user wants to obtain from the schema at a given moment.
  • 23. Representing the User Request: Example of Knowledge Request. The Knowledge Request is based on the concept of Entity Type. The user wants to know information about taxes in the osCommerce schema: Focus Set FS = {TaxRate, TaxClass}, Rejection Set RS = {Language}, Filter Size K = 10.
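The three parts of a knowledge request map naturally onto a small record type. The following is a minimal Python sketch, not the authors' implementation: the class name `KnowledgeRequest`, its field names, and the validation checks are assumptions made for illustration. The check K ≥ |FS| is one reading of the filter size as covering the focus set itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class KnowledgeRequest:
    """A user's request: focus set FS, rejection set RS and filter size K."""

    focus_set: frozenset
    rejection_set: frozenset = field(default_factory=frozenset)
    filter_size: int = 10  # K, how many entity types the user wants to see

    def __post_init__(self):
        # Assumed sanity checks; the slides do not spell these out.
        if not self.focus_set:
            raise ValueError("the focus set needs at least one entity type")
        if self.focus_set & self.rejection_set:
            raise ValueError("an entity type cannot be both focused and rejected")
        if self.filter_size < len(self.focus_set):
            raise ValueError("K is assumed to be at least |FS|")


# The osCommerce tax example from the slides.
request = KnowledgeRequest(
    focus_set=frozenset({"TaxRate", "TaxClass"}),
    rejection_set=frozenset({"Language"}),
    filter_size=10,
)
```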
  • 27. Filtering Measures: Closeness of Entity Types (Ω(e, FS)). Our method uses the closeness Ω(e, FS) of each candidate entity type e in the schema with respect to the entity types of the focus set FS. We say that e is a candidate entity type if e ∉ FS ∪ RS. Intuitively, the closeness of candidate e should be directly related to the inverse of the distance of e to the focus set FS: $\Omega(e, FS) = \frac{|FS|}{\sum_{e' \in FS} d(e, e')}$
  • 28. Filtering Measures: Closeness of Entity Types (Ω(e, FS)). Example for Ω(C, FS = {A, B}), using $\Omega(e, FS) = \frac{|FS|}{\sum_{e' \in FS} d(e, e')}$: $\Omega(C, FS) = \frac{|\{A, B\}|}{d(C, A) + d(C, B)} = \frac{2}{3 + 2} = 0.4$
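The closeness example can be reproduced in a few lines of Python. This is a sketch under a stated assumption: the slides do not define d(e, e′), so it is taken here to be the shortest-path length between two entity types in an undirected schema graph. The graph, the helper names, and the intermediate node `X` are hypothetical, chosen so that d(C, A) = 3 and d(C, B) = 2 as in the example.

```python
from collections import deque


def distances_from(graph, source):
    """Breadth-first search: shortest path length from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness(graph, candidate, focus_set):
    """Omega(e, FS) = |FS| divided by the sum of d(e, e') over e' in FS."""
    dist = distances_from(graph, candidate)
    total = sum(dist[e] for e in focus_set)  # raises KeyError if some e' is unreachable
    return len(focus_set) / total


# Hypothetical schema graph: C - X - B - A, so d(C, B) = 2 and d(C, A) = 3.
edges = [("C", "X"), ("X", "B"), ("B", "A")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

omega = closeness(graph, "C", {"A", "B"})
print(round(omega, 2))  # 0.4
```

Running one breadth-first search per candidate keeps the cost at one graph traversal per entity type, regardless of the size of the focus set.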
  • 30. Filtering Measures: Importance of Entity Types (Ψ). The importance Ψ(e) of an entity type e ∈ E of a conceptual schema is a real number that measures the relevance of e in the schema. There are several methods: Occurrence counting: Ψ(e) depends on the number of characteristics the schema has about e; Link analysis: Ψ(e) depends on the importance of the entity types connected to e; Instance-dependent: Ψ(e) depends on the instances of e. Our filtering method can be used in connection with any of the existing importance-computing methods.
  • 35. Filtering Measures: Importance of Entity Types (Ψ). Importance Ψ(e) of ten entity types in each schema:
    osCommerce: Product 1, Order 0.63, Language 0.53, Customer 0.39, Store 0.35, OrderLine 0.34, Zone 0.34, TaxZone 0.29, Special 0.28, Country 0.26
    HL7: Act 1, Role 0.68, ActRelationship 0.53, Participation 0.49, Entity 0.46, Observation 0.35, InfrastructureRoot 0.24, Organization 0.23, RoleLink 0.21, FinancialTransaction 0.2
    ResearchCyc: Individual 1, LexicalWord 0.13, PartiallyTangible 0.12, Thing 0.07, SpatialThing 0.06, Agent 0.05, Organization 0.05, SomethingExisting 0.05, TemporalThing 0.04, HumanActivity 0.04
  • 36. Filtering Measures: Importance of Entity Types (Ψ(e)), Closeness of Entity Types (Ω(e, FS)), Interest of Entity Types (Φ(e, FS)). Φ(e, FS) = α × Ψ(e) + (1 − α) × Ω(e, FS)
  • 39. Filtering Measures: Interest of Entity Types (Φ(e, FS)). Φ(e, FS) = α × Ψ(e) + (1 − α) × Ω(e, FS). [Figure: points P1 to P8 plotted on axes Importance Ψ and Closeness Ω, each ranging from 0 to 1, with a line r and the distances dist(P2, r) and dist(P6, r) marked.]
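The interest measure is a weighted sum, which a short Python sketch makes concrete. The function name `interest`, the default α = 0.5, and the restriction of α to [0, 1] are assumptions; the slides give only the formula. The input values are illustrative, not results from the deck.

```python
def interest(importance, closeness, alpha=0.5):
    """Phi(e, FS) = alpha * Psi(e) + (1 - alpha) * Omega(e, FS)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return alpha * importance + (1.0 - alpha) * closeness


# Illustrative values only: Psi = 0.53 and an assumed closeness of 0.4.
phi = interest(importance=0.53, closeness=0.4, alpha=0.5)
print(round(phi, 3))  # 0.465
```

With α = 1 the ranking depends only on importance in the whole schema, and with α = 0 only on closeness to the focus set, so α acts as the balance between the two measures.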
  • 41. Filtered Conceptual Schema (FCS). Main Task: construct a filtered conceptual schema, FCS, from the K most interesting entity types and the knowledge of the original schema CS. FCS Components: E_F is a set of entity types filtered from E of CS; R_F is a set of relationship types filtered from R of CS; I_F is a set of IsA relationships filtered from I of CS; C_F is a set of integrity constraints filtered from C of CS; D_F is a set of derivation rules filtered from D of CS.
  • 42. Filtered Conceptual Schema: Filtered Entity Types (E_F). FCS = ⟨E_F, R_F, I_F, C_F, D_F⟩. FS: focus set; E_top: most interesting entity types; E_aux: auxiliary entity types. Filter size K = |FS| + |E_top|, and |E_F| = K + |E_aux|. [Figure: E_F shown with its parts FS, E_top and E_aux.]
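One reading of K = |FS| + |E_top| is that E_top holds the K − |FS| candidate entity types with the highest interest Φ. The Python sketch below follows that reading. The function name `select_top`, the tie-breaking by name, and the interest values in the example are hypothetical, and the auxiliary set E_aux is not computed here.

```python
def select_top(interest_by_type, focus_set, rejection_set, k):
    """Return E_top: the K - |FS| candidates with the highest interest."""
    excluded = set(focus_set) | set(rejection_set)
    # Candidates are the entity types outside FS and RS.
    candidates = [e for e in interest_by_type if e not in excluded]
    # Descending interest; ties broken by name for determinism (an assumption).
    candidates.sort(key=lambda e: (-interest_by_type[e], e))
    return candidates[: max(k - len(focus_set), 0)]


# Made-up interest values for a handful of osCommerce entity types.
scores = {
    "TaxRate": 1.0, "TaxClass": 1.0, "Language": 0.9,
    "Product": 0.7, "TaxZone": 0.6, "Zone": 0.5,
}
e_top = select_top(scores, {"TaxRate", "TaxClass"}, {"Language"}, k=4)
print(e_top)  # ['Product', 'TaxZone']
```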
  • 48. Filtered Conceptual Schema: Filtered Relationship Types (R_F). FCS = ⟨E_F, R_F, I_F, C_F, D_F⟩. The relationship types in R_F are those r of the original schema whose participant entity types belong to E_F or are ascendants of entity types of E_F (in which case a projection of r is required). [Figure annotation: "included in E_aux".]
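The membership test for R_F can be sketched in Python as a predicate over a relationship type's participants. This is only the selection condition stated on the slide; the projection step and the handling of E_aux are not shown in the transcript and are left out. The representation of IsA links as a mapping from a subtype to its direct supertypes, and the names `supertypes` and `keeps_relationship`, are assumptions.

```python
def supertypes(isa, entity_type):
    """All direct and indirect supertypes (isa maps a subtype to its direct supertypes)."""
    seen = set()
    stack = list(isa.get(entity_type, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(isa.get(parent, ()))
    return seen


def keeps_relationship(participants, filtered_types, isa):
    """True if each participant is in E_F or is an ascendant of some type in E_F."""
    ascendants = set()
    for e in filtered_types:
        ascendants |= supertypes(isa, e)
    return all(p in filtered_types or p in ascendants for p in participants)


# Hypothetical schema fragment: B IsA A, with E_F = {B, C}.
isa = {"B": {"A"}}
e_f = {"B", "C"}
print(keeps_relationship(("A", "C"), e_f, isa))  # True
print(keeps_relationship(("C", "D"), e_f, isa))  # False
```

In the first call the participant A is outside E_F but is an ascendant of B, which is the case where the slide says a projection of the relationship type is required.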
  • 52. Filtered Conceptual Schema: Filtered IsA Relationships (I_F). FCS = ⟨E_F, R_F, I_F, C_F, D_F⟩. If e and e′ are entity types in E_F and there is a direct or indirect IsA relationship between them in the original schema, then such an IsA relationship must also exist in I_F of FCS.
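A minimal Python sketch of this rule follows. It assumes IsA links are stored as a mapping from each subtype to its direct supertypes and returns every pair of filtered entity types connected by a direct or indirect IsA path. The function names are hypothetical, and the sketch does not remove pairs that are implied by other pairs, which a real tool might do.

```python
def ancestors(isa, entity_type):
    """All direct and indirect supertypes of entity_type."""
    seen = set()
    stack = list(isa.get(entity_type, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(isa.get(parent, ()))
    return seen


def filter_isa(isa, filtered_types):
    """Pairs (subtype, supertype) within E_F linked by a direct or indirect IsA path."""
    return {
        (sub, sup)
        for sub in filtered_types
        for sup in ancestors(isa, sub)
        if sup in filtered_types
    }


# Hypothetical chain C IsA B IsA A, where B is filtered out.
isa = {"C": {"B"}, "B": {"A"}}
print(filter_isa(isa, {"A", "C"}))  # {('C', 'A')}
```

The example shows the indirect case: B is not in E_F, yet the filtered schema still records that C is a kind of A.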
  • 54. Filtered Conceptual Schema: Filtered Integrity Constraints (C_F) and Derivation Rules (D_F). FCS = ⟨E_F, R_F, I_F, C_F, D_F⟩. The integrity constraints and derivation rules included in C_F and D_F of FCS are those whose expressions only involve entity types from E_F. Both the integrity constraint ic1 and the derivation rule dr1 are included in C_F and D_F of FCS only if A, B ∈ E_F.
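The rule for constraints and derivation rules reduces to a subset test, sketched below in Python. It assumes each rule is already annotated with the set of entity types its expression mentions; extracting that set from a real constraint expression is outside this sketch. The function name `filter_rules` and the extra constraint `ic2` are hypothetical.

```python
def filter_rules(rules, filtered_types):
    """Keep the rules whose expressions only involve entity types in E_F."""
    allowed = set(filtered_types)
    return [name for name, involved in rules.items() if set(involved) <= allowed]


# ic1 and dr1 involve A and B, as on the slide; ic2 is a made-up extra constraint.
rules = {"ic1": {"A", "B"}, "dr1": {"A", "B"}, "ic2": {"A", "D"}}
print(filter_rules(rules, {"A", "B", "C"}))  # ['ic1', 'dr1']
```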
  • 55. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples osCommerce A. Villegas and A. Oliv´e ER 2010 November 3, 2010 23 / 28 84 Entity types, 209 Attributes, 183 Relationship types, 28 IsA Relationships, 204 general constraints and derivation rules
  • 56. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples osCommerce A. Villegas and A. Oliv´e ER 2010 November 3, 2010 23 / 28 FS = {TaxRate, TaxClass} and K = 10 84 Entity types, 209 Attributes, 183 Relationship types, 28 IsA Relationships, 204 general constraints and derivation rules
  • 57. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples osCommerce A. Villegas and A. Oliv´e ER 2010 November 3, 2010 23 / 28 FS = {TaxRate, TaxClass} and K = 10 84 Entity types, 209 Attributes, 183 Relationship types, 28 IsA Relationships, 204 general constraints and derivation rules
  • 58. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples Health Level 7 A. Villegas and A. Oliv´e ER 2010 November 3, 2010 24 / 28 2,695 Entity types, 160 Attributes, 228 Relationship types, 2,934 IsA Relationships
  • 59. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples Health Level 7 A. Villegas and A. Oliv´e ER 2010 November 3, 2010 24 / 28 FS = {Patient, ActAppointment} and K = 10 2,695 Entity types, 160 Attributes, 228 Relationship types, 2,934 IsA Relationships
  • 60. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples Health Level 7 A. Villegas and A. Oliv´e ER 2010 November 3, 2010 24 / 28 FS = {Patient, ActAppointment} and K = 10 2,695 Entity types, 160 Attributes, 228 Relationship types, 2,934 IsA Relationships
  • 61. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples ResearchCyc A. Villegas and A. Oliv´e ER 2010 November 3, 2010 25 / 28 26,725 Entity types, 1,060 Attributes, 5,514 Relationship types, 43,323 IsA Relationships
  • 62. Introduction Filtering Method Filtered Conceptual Schema Experimentation Conclusions Examples ResearchCyc A. Villegas and A. Oliv´e ER 2010 November 3, 2010 25 / 28 FS = {Cancer} and K = 18 26,725 Entity types, 1,060 Attributes, 5,514 Relationship types, 43,323 IsA Relationships
  • 64. Conclusions. The problem of filtering a fragment of the knowledge contained in a large conceptual schema appears in many information systems development activities: people need to work, for some purpose, with a fragment of the knowledge contained in those large schemas. We have proposed a filtering method based on the importance and closeness measures. A user indicates a focus set of entity types, and the method determines a subset of the elements of the large schema that is likely to be of interest to the user. We have experimented with three large schemas (osCommerce, Health Level 7, and ResearchCyc). In all three cases, our prototype obtains the filtered schema in less than a second.
  • 68. Further Work. Take into account the importance of relationship types. Fine-grained filtering of integrity constraints and derivation rules. Conduct experiments to precisely determine the usefulness of our method to real users.
  • 71. A Method for Filtering Large Conceptual Schemas. Antonio Villegas and Antoni Olivé, {avillegas, olive}@essi.upc.edu, Services and Information Systems Engineering Department, Universitat Politècnica de Catalunya. ER 2010, November 3, 2010.
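The idea summarized in the conclusions (a focus set FS, a size bound K, and a ranking driven by importance and closeness) can be illustrated with a minimal Python sketch. This is not the authors' prototype. Several things here are assumptions made only for illustration: importance is approximated by normalized degree, closeness by |FS| divided by the sum of shortest-path distances to the focus set, the two are combined with a hypothetical weight `alpha`, K is taken to include the focus set itself, and the toy schema and the name `filter_schema` are invented.

```python
from collections import deque


def distances_from(graph, source):
    """Breadth-first distances (in edges) from source to each reachable entity type."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def filter_schema(graph, focus_set, k, alpha=0.5):
    """Return the focus set plus the top-ranked other entity types, at most k in total.

    Assumed scoring (illustrative only, not taken from the slides):
      importance(e) = degree(e) / max degree
      closeness(e)  = |FS| / sum of distances from e to each focus entity
      interest(e)   = alpha * importance(e) + (1 - alpha) * closeness(e)
    """
    max_degree = max(len(neighbours) for neighbours in graph.values()) or 1
    importance = {e: len(graph[e]) / max_degree for e in graph}
    dist_maps = [distances_from(graph, f) for f in focus_set]
    unreachable = len(graph)  # penalty distance for disconnected entity types

    scores = {}
    for entity in graph:
        if entity in focus_set:
            continue
        total = sum(d.get(entity, unreachable) for d in dist_maps)
        closeness = len(focus_set) / total
        scores[entity] = alpha * importance[entity] + (1 - alpha) * closeness

    # Highest interest first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda e: (-scores[e], e))
    return list(focus_set) + ranked[: max(0, k - len(focus_set))]


# Toy schema (hypothetical): entity types linked by undirected relationships.
toy_schema = {
    "Order": {"Customer", "OrderLine", "Payment"},
    "Customer": {"Order", "Address"},
    "OrderLine": {"Order", "Product"},
    "Product": {"OrderLine", "Category"},
    "Category": {"Product"},
    "Payment": {"Order"},
    "Address": {"Customer"},
}

if __name__ == "__main__":
    print(filter_schema(toy_schema, ["Order"], k=4))
```

A call such as `filter_schema(schema, ["Patient", "ActAppointment"], k=10)` mirrors the shape of the FS and K settings shown in the examples. The ranking itself depends entirely on the assumed measures above.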