
The Art and Science of DDS Data Modeling

The Data Distribution Service (DDS) is a standard for ubiquitous, interoperable, secure, platform independent, and real-time data sharing across network connected devices. DDS is today used and recommended in a large class of application domains, such as Industrial Internet of Things (IIoT), Defense and Aerospace, Transportation, Robotics, Energy, and Medical. Differently from traditional message-centric technologies, DDS is data-centric – the accent is on seamless (user-defined) data sharing as opposed to message delivery. Therefore, when embracing DDS and data-centricity, data modeling becomes a key step in the design of a distributed system. This presentation will (1) explain the role and scope of data modeling in DDS, (2) introduce the techniques at the foundation of effective and extensible Data Models, and (3) summarize the most common DDS Data Modeling Idioms.

Published in: Software

  1. PrismTech: The Art and Science of DDS Data Modelling
     Angelo Corsaro, PhD
     Chief Technology Officer, ADLINK Technologies Inc.
     OMG DDS SIG Co-Chair
     angelo.corsaro@adlinktech.com
  2. Copyright PrismTech, 2014
     A Recurring Question
     • People new to DDS often ask: what techniques and patterns can we use to design DDS-based systems?
     • My answer is usually: start with the powerful tools and techniques provided by relational data modelling, then add some DDS-specific spice
     • I've come to the conclusion that many people are not very familiar with relational data modelling, or that it has been a long time since they studied or reviewed these concepts
     • This webcast provides a reasonably complete introduction to the relational data model
  3. The Relational Model
  4. Relational Model
     • Introduced by Edgar F. Codd in 1970 as a way of representing data models for databases
     • Simple and elegant: a database becomes a collection of one or more relations, where each relation is a table with rows and columns
  5. Relation
     • The relation is the construct used to represent data in the relational model; it consists of a two-dimensional table
     • The columns of a relation are called attributes
     • The name of the relation along with the set of attributes defines the relation schema
     • The rows of the relation, other than the header containing the attribute names, are called tuples
  6. Relation's Schema
     • The relation schema specifies:
       - the relation's name
       - the name of each field/attribute, i.e. column
       - the domain of each field, i.e. the type of the field
     • Example:
       - Student(sid: string, name: string, age: integer, gpa: real)
  7. Tuples
     • An instance of a relation is a set of tuples (records) in which each tuple has the same number of fields as the relation schema
     • A relation's instance can be visualised as a table where each tuple is a row and all rows have the same number of fields (columns)
     • Notice that the rows are all different. This is a requirement of the relational model, as a relation instance is a collection of unique tuples (or rows)

     sid   name          age  gpa
     1234  Peter Parker  21   4.0
     2345  Tony Stark    15   4.0
     3456  Bruce Wayne   23   3.5

  8. Cardinality and Degree
     • The cardinality of a relation R is defined as the number of tuples belonging to the relation
     • The degree, or arity, of a relation R is defined as the number of its fields
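As a small illustration of the last two slides, the sketch below models the Student relation instance as a Python set of named tuples. The Python representation is an assumption made for illustration only; the data comes from the table above.

```python
from typing import NamedTuple


class Student(NamedTuple):
    sid: str
    name: str
    age: int
    gpa: float


# A relation instance is a set, so a repeated row collapses into one tuple.
students = {
    Student("1234", "Peter Parker", 21, 4.0),
    Student("2345", "Tony Stark", 15, 4.0),
    Student("3456", "Bruce Wayne", 23, 3.5),
    Student("1234", "Peter Parker", 21, 4.0),  # duplicate row, absorbed by the set
}

cardinality = len(students)    # number of tuples in the relation
degree = len(Student._fields)  # number of fields in the schema

print(cardinality, degree)
```

Using a set rather than a list is what enforces the "rows are all different" requirement here.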
  9. Keys
     • The key of a relation is a set of fields that uniquely identifies a tuple
     • A superkey is a set of attributes that includes a key
     • Example:
       - The sid field is the key for the Students relation

     sid   name          age  gpa
     1234  Peter Parker  21   4.0
     2345  Tony Stark    15   4.0
     3456  Bruce Wayne   23   3.5
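The uniqueness property of a key can be tested on a given relation instance, as in the sketch below. The helper name `is_key` is hypothetical. Note that a key is a constraint on the schema, so a check on one instance can only refute a candidate key, never prove it.

```python
def is_key(rows, fields):
    """Return True if no two rows agree on all of the given fields."""
    seen = set()
    for row in rows:
        value = tuple(row[f] for f in fields)
        if value in seen:
            return False
        seen.add(value)
    return True


students = [
    {"sid": "1234", "name": "Peter Parker", "age": 21, "gpa": 4.0},
    {"sid": "2345", "name": "Tony Stark", "age": 15, "gpa": 4.0},
    {"sid": "3456", "name": "Bruce Wayne", "age": 23, "gpa": 3.5},
]

sid_is_key = is_key(students, ["sid"])               # sid values are all distinct
gpa_is_key = is_key(students, ["gpa"])               # two students share gpa 4.0
sid_name_unique = is_key(students, ["sid", "name"])  # a superkey: it contains sid
```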
  10. Foreign Keys
      • A foreign key introduces a link between two relations
      • For instance, the sid in the Courses relation is a foreign key: it refers to the Students relation and introduces an integrity constraint, since every sid in Courses must identify a tuple in Students

      Students
      sid   name          age  gpa
      1234  Peter Parker  21   4.0
      2345  Tony Stark    15   4.0
      3456  Bruce Wayne   23   3.5

      Courses
      cid          sid   grade
      Physics303   1234  A+
      Robotics323  2345  A+
      Calculus343  2345  A
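The integrity constraint introduced by a foreign key can be sketched as a check for dangling references. The function name `dangling_references` and the extra `Chemistry101` row are hypothetical and only serve to show a violation.

```python
student_ids = {"1234", "2345", "3456"}

# Courses rows as (cid, sid, grade), taken from the table above.
courses = [
    ("Physics303", "1234", "A+"),
    ("Robotics323", "2345", "A+"),
    ("Calculus343", "2345", "A"),
]


def dangling_references(course_rows, known_sids):
    """Return the course rows whose sid does not identify any student."""
    return [row for row in course_rows if row[1] not in known_sids]


ok = dangling_references(courses, student_ids)

# Hypothetical row referring to a student that does not exist.
broken = dangling_references(courses + [("Chemistry101", "9999", "B")], student_ids)
```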
  11. Quick DDS Intro
  12. Data Distribution Service (DDS)
      • DDS provides a Global Data Space abstraction that allows applications to autonomously, anonymously, securely and efficiently share data
      • DDS' Global Data Space is fully distributed, highly efficient and scalable
  13. Data Distribution Service (DDS)
      • DataWriters and DataReaders are automatically and dynamically matched by DDS Discovery
      • A rich set of QoS policies controls the existential, temporal, and spatial properties of data
  14. Information Definition
  15. Topic
      • A Topic defines a domain-wide class of information
      • A Topic is defined by means of a (name, type, qos) tuple, where
        - name: identifies the topic within the domain
        - type: is the programming language type associated with the topic. Types are extensible and evolvable
        - qos: is a collection of policies that express the non-functional properties of this topic, e.g. reliability, persistence, etc.
      [Figure: a Topic made of Name, Type and QoS]
  16. Topic and Instances
      • As explained in the previous slide, a topic defines a class/type of information
      • Topics can be defined as Singleton or can have multiple Instances
      • Topic Instances are identified by means of the topic key
      • A Topic Key is identified by a tuple of attributes, like in databases
      • Remarks:
        - A Singleton topic has a single domain-wide instance
        - A "regular" Topic can have as many instances as the number of different key values, e.g., if the key is an 8-bit character then the topic can have 256 different instances
  17. Topic Example
      • A Topic type can be defined in different syntaxes
      • IDL is the most commonly used syntax
      • Example:

      struct Student {
        long sid;
        string name;
        long age;
        float gpa;
      };
      #pragma keylist Student sid
  18. Topics as Relations
      • A Topic can be seen as defining a relation

      struct Student {
        long sid;
        string name;
        long age;
        float gpa;
      };
      #pragma keylist Student sid

      Student(sid, name, age, gpa)

      sid   name          age  gpa
      1234  Peter Parker  21   4.0
      2345  Tony Stark    15   4.0
      3456  Bruce Wayne   23   3.5

  19. Mapping DDS to the Relational Model
      • Topic Type => Relation Schema
      • Topic Instance => Key
      • Topic Sample => Tuple
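The mapping can be illustrated without any DDS API. In the sketch below samples are plain dictionaries, the key fields mirror the keylist of the Student topic, and each distinct key value identifies one instance. The sample values, including the intermediate gpa of 3.9, are hypothetical.

```python
from collections import defaultdict

KEY_FIELDS = ("sid",)  # mirrors "#pragma keylist Student sid"

# Samples in arrival order; the third one updates the instance with sid 1234.
samples = [
    {"sid": 1234, "name": "Peter Parker", "age": 21, "gpa": 3.9},
    {"sid": 2345, "name": "Tony Stark", "age": 15, "gpa": 4.0},
    {"sid": 1234, "name": "Peter Parker", "age": 21, "gpa": 4.0},
]

# Topic Instance => Key: group samples by their key value.
instances = defaultdict(list)
for sample in samples:
    key = tuple(sample[f] for f in KEY_FIELDS)
    instances[key].append(sample)

# Topic Sample => Tuple: the latest sample per instance gives the current row.
latest = {key: history[-1] for key, history in instances.items()}
```

Keeping only the latest sample per key gives a table with one row per instance, which is the relational view used in the rest of the deck.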
  20. Relational Design
      • Start by identifying coarse relations and properties of data
      • Start decomposing based on properties
      • Apply a normal form
        - Functional Dependencies => Boyce-Codd Normal Form
        - Multivalued Dependencies => Fourth Normal Form
  21. UML Data Modelling
  22. UML Data Modelling
      • A subset of UML can be used to model Data Models
      • The resulting model can be easily translated into a relational model and then used in a DBMS or DDS
      • The allowed subset of UML is:
        - Classes (with only attributes)
        - Associations
        - Association Classes
        - Subclasses
        - Composition and Aggregation
      • UML Data Models can be automatically translated into a relational model as long as each "regular" class defines a primary key
  23. Class
      • A UML class is mapped to a relation that has the same name as the class and shares its key and attributes
      [Figure: UML class Student with attributes sid: int, name: string, age: int, gpa: float]
      Student(sid, name, age, gpa)
  24. Association
      • By default an association can be mapped as follows; yet, depending on the multiplicity of the association, different mappings may be possible/desirable
      • The key definition in the association depends on the multiplicity
      [Figure: class C1 (K1: PK, O1) linked to class C2 (K2: PK, O2) by association A]
      C1(K1, O1)
      C2(K2, O2)
      A(K1, K2)
  25. 1-to-many Association
      There are two ways of mapping a 1-to-many association to the relational model
      M1 Use a relation to capture the association
      M2 Embed the association on the many side of the association
      [Figure: class C1 (K1: PK, O1) linked to class C2 (K2: PK, O2) by association A, with multiplicity 0..1 on the C1 side and * on the C2 side]
      M1 C1(K1, O1), C2(K2, O2), A(K1, K2)
      M2 C1(K1, O1), C2(K2, O2, K1)
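The two mappings can be compared on a toy data set. The rows below are hypothetical; the sketch derives M2 from M1 by moving K1 onto the many side. Because the multiplicity on the C1 side is 0..1, the embedded K1 may be empty, represented here by None.

```python
# M1: C1(K1, O1), C2(K2, O2) and a separate association relation A(K1, K2).
c1 = [{"K1": "a", "O1": 1}, {"K1": "b", "O1": 2}]
c2 = [{"K2": "x", "O2": 10}, {"K2": "y", "O2": 20}, {"K2": "z", "O2": 30}]
a = [{"K1": "a", "K2": "x"}, {"K1": "a", "K2": "y"}]  # "z" has no C1 partner

# Each C2 row is linked to at most one C1 row, so K2 alone identifies a row of A.
k2_is_key_of_a = len({row["K2"] for row in a}) == len(a)

# M2: embed the association on the many side, giving C2(K2, O2, K1).
owner = {row["K2"]: row["K1"] for row in a}
c2_embedded = [{**row, "K1": owner.get(row["K2"])} for row in c2]
```

M2 saves a relation and a join, at the price of a nullable attribute on the many side.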
  26. many-to-many Associations
      [Figure: class C1 (K1: PK, O1) linked to class C2 (K2: PK, O2) by association A, with multiplicity * on both sides]
      C1(K1, O1)
      C2(K2, O2)
      A(K1, K2)
  27. Relationships arity
      [Figure: three diagrams linking K1 and K2, titled One to One, One to Many and Many to Many; key labels shown: Key = K2; Key = K1, K2]
  28. Association Classes
      [Figure: class C1 (K1: PK, O1) linked to class C2 (K2: PK, O2) by association A, with an association class attached to A]
      C1(K1, O1)
      C2(K2, O2)
      A(K1, K2, a1, a2)
  29. Self Association
      • Self associations are modelled as traditional relations, with the only difference that attributes may be conserved
      [Figure: UML class Student (sid: int, name: string, age: int, gpa: float) with a many-to-many self association Sibling]
      Student(sid, name, age, gpa)
      Sibling(sidParent, sidSibling)
  30. Subclasses
      Three ways of mapping subclassing to the relational model
      T1 Subclass relations contain the superclass key and the specialised attributes
      T2 Subclass relations contain all attributes
      T3 One relation containing all superclass and subclass attributes
      [Figure: class A (K: PK, X) with subclasses B (Y) and C (Z)]
      T1 A(K, X), B(K, Y), C(K, Z)
      T2 A(K, X), B(K, X, Y), C(K, X, Z)
      T3 A(K, X, Y, Z)
      The best translation may depend on the context, e.g. T3 is good for heavily overlapping subclasses, T2 is good for disjoint and complete subclasses
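To make the three translations concrete, the sketch below stores the same two hypothetical objects, one of subclass B and one of subclass C, under T1, T2 and T3. It assumes that under T2 an object is stored only in the relation of its most specific class; the slide does not spell this out.

```python
# Hypothetical objects: one instance of subclass B and one of subclass C.
objects = [
    {"type": "B", "K": 1, "X": "x1", "Y": "y1"},
    {"type": "C", "K": 2, "X": "x2", "Z": "z2"},
]

# T1: A(K, X), B(K, Y), C(K, Z). Every object has a row in A.
t1 = {
    "A": [{"K": o["K"], "X": o["X"]} for o in objects],
    "B": [{"K": o["K"], "Y": o["Y"]} for o in objects if o["type"] == "B"],
    "C": [{"K": o["K"], "Z": o["Z"]} for o in objects if o["type"] == "C"],
}

# T2: A(K, X), B(K, X, Y), C(K, X, Z). Each object lives in one relation only.
t2 = {
    "A": [{"K": o["K"], "X": o["X"]} for o in objects if o["type"] == "A"],
    "B": [{"K": o["K"], "X": o["X"], "Y": o["Y"]} for o in objects if o["type"] == "B"],
    "C": [{"K": o["K"], "X": o["X"], "Z": o["Z"]} for o in objects if o["type"] == "C"],
}

# T3: A(K, X, Y, Z). Attributes that do not apply are left empty (None).
t3 = [
    {"K": o["K"], "X": o["X"], "Y": o.get("Y"), "Z": o.get("Z")}
    for o in objects
]
```

T1 needs a join on K to rebuild a full object, T2 avoids the join, and T3 trades joins for empty fields.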
  31. Composition and Aggregation
      • The precondition to easily map composition to the relational model is for the part not to have a key
      • When mapping aggregation (unfilled diamond), the key K on the Part should have a domain that allows for null values
      [Figure: class Whole (K: PK, W) composed of class Part (P)]
      Whole(K, W)
      Part(P, K)
  32. Summing Up
      • A subset of UML can be used to model relational data models
      • The mapping rules can be used to help translate existing Object Oriented data models into their relational counterpart
  33. Refinement
  34. Why Relation Refinement?
      • The UML/ER Data Models usually provide a good starting point toward the data model that we'll actually use in the system
      • The relations implied by the UML/ER Data Model often need to be normalised and re-organised to address performance and workload criteria
      • The goal of relation refinement is to remove redundancy and/or decompose a relation into smaller relations
      • Normal forms provide a way of measuring the amount of redundancy that may be in our data model
  35. Redundancy
      • Redundant Storage: Information may be stored multiple times, leading to space, and perhaps time, inefficiencies
      • Update Anomalies: If one copy of the redundant information is updated, this may create inconsistencies with other copies, unless all copies are updated at the same time
      • Insertion Anomalies: It may not be possible to store some information unless some other information is stored as well
      • Deletion Anomalies: It may not be possible to delete some information without losing some other information as well
  36. Decomposition
      • Unconsidered decomposition can lead to more problems than benefits, thus when decomposing you always want to ensure that:
        - You really need to decompose the relation
        - You fully understand the implications of the decomposition (lossless join, dependency preservation)
      • Normal Forms provide good guidelines for decomposing relations, as they guarantee that certain classes of problems cannot be introduced
      • Notice that decomposition can have a performance impact, as it may lead to an increase in joins
  37. Functional Dependencies
      • A Functional Dependency (FD) is a kind of Integrity Constraint (IC) that generalises the concept of a key
      • Given a relation R along with two nonempty sets of attributes X and Y in R, we say that R satisfies the FD X ⟶ Y (X determines Y) if the following holds for every pair of tuples t1 and t2 in R:
          if t1.X = t2.X then t1.Y = t2.Y
      • In other terms, the FD says that if two tuples agree on the attributes in X, they also agree on the attributes in Y
      • Notice that a primary key constraint is a special kind of FD
  38. Example
      • Let's assume our Student relation now includes a new attribute that measures the percentile of the student's GPA, i.e. the percentage of students whose GPA is smaller or equal
      • Clearly the percentile attribute functionally depends on gpa, or equivalently gpa ⟶ percentile

      sid   name          age  gpa  percentile
      1234  Peter Parker  21   4.0  100
      2345  Tony Stark    15   4.0  100
      3456  Bruce Wayne   23   3.5  75
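The FD definition can be checked mechanically against the example table. The helper `satisfies_fd` below is a hypothetical name for a minimal sketch. As with keys, an FD is a statement about every legal instance, so a single instance can only refute it.

```python
def satisfies_fd(rows, x, y):
    """Check X -> Y on one instance: rows that agree on X must agree on Y."""
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        # Remember the first Y value seen for this X value and compare later ones.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


students = [
    {"sid": "1234", "name": "Peter Parker", "age": 21, "gpa": 4.0, "percentile": 100},
    {"sid": "2345", "name": "Tony Stark", "age": 15, "gpa": 4.0, "percentile": 100},
    {"sid": "3456", "name": "Bruce Wayne", "age": 23, "gpa": 3.5, "percentile": 75},
]

gpa_to_percentile = satisfies_fd(students, ["gpa"], ["percentile"])
gpa_to_name = satisfies_fd(students, ["gpa"], ["name"])
sid_to_all = satisfies_fd(students, ["sid"], ["name", "age", "gpa", "percentile"])
```

The last check shows the key constraint on sid as a special case of an FD.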
  39. Normal Forms
  40. Normal Forms
      • Different Normal Forms (NF) exist that provide guidance on how to decompose relations
      • If a relation is in a given normal form then we are guaranteed that some anomalies cannot arise, e.g. update anomalies
      • The normal forms based on functional dependencies are the first normal form (1NF), second normal form (2NF), third normal form (3NF) and the Boyce-Codd normal form (BCNF)
      • Every relation in BCNF is also in 3NF, every relation in 3NF is also in 2NF, and finally every relation in 2NF is also in 1NF
      • The 2NF and 3NF have only historical interest, while the BCNF has important practical applicability
  41. 1NF
      • A relation is in 1NF if every field contains only atomic values, that is, no lists or sets
  42. Boyce-Codd Normal Form (BCNF)
      Let R be a relation, X a subset of attributes of R and a an attribute of R.
      R is in Boyce-Codd Normal Form (BCNF) if for every FD X ⟶ {a} that holds over R, one of the following is true:
      • a ∊ X, that is, it is a trivial FD, or
      • X is a superkey
      Intuitively, in a BCNF relation the only nontrivial dependencies are those in which a key determines some attributes. Each attribute must describe the key, the whole key, and nothing but the key
      [Figure: Functional Dependencies in BCNF: the key determines attr 1, attr 2, ..., attr k]
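A BCNF check can be sketched with the standard attribute-closure computation: X is a superkey exactly when the closure of X contains every attribute of R. The function names are hypothetical, and the FD set is an assumption made for the Student relation extended with percentile: sid determines name, age and gpa, and gpa determines percentile.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the given FDs that are nontrivial and whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not set(schema) <= closure(lhs, fds)
    ]


schema = {"sid", "name", "age", "gpa", "percentile"}
fds = [
    (frozenset({"sid"}), frozenset({"name", "age", "gpa"})),
    (frozenset({"gpa"}), frozenset({"percentile"})),
]

violations = bcnf_violations(schema, fds)
```

Here gpa ⟶ percentile is reported because gpa alone does not determine sid, name or age, so the relation is not in BCNF.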
  43. BCNF Decomposition Algorithm
      Input: relation R and FDs for R
      Output: decomposition of R into BCNF relations with lossless join
      Compute keys for R
      Repeat until all relations are in BCNF:
        Choose a relation Ri with A ⟶ B that violates BCNF
        Decompose Ri into R1(A, B) and R2(A, rest)
        Compute FDs for R1 and R2
        Compute keys for R1 and R2
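The algorithm can be sketched as follows. Instead of projecting the FDs onto each sub-relation explicitly, this version looks for violations by computing closures of attribute subsets under the original FDs, which is exponential in the number of attributes and only meant for small schemas. Function names and the FD set for the Student relation are assumptions made for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(rel, fds):
    """Return (A, B) where A -> B is nontrivial on rel and A is not a superkey of rel."""
    attrs = sorted(rel)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            a = frozenset(combo)
            determined = closure(a, fds) & rel  # what A determines inside rel
            b = determined - a
            if b and determined != rel:
                return a, b
    return None


def bcnf_decompose(schema, fds):
    """Split schema until no relation has a BCNF violation."""
    todo = [frozenset(schema)]
    done = []
    while todo:
        rel = todo.pop()
        violation = find_violation(rel, fds)
        if violation is None:
            done.append(rel)
        else:
            a, b = violation
            todo.append(a | b)    # R1(A, B)
            todo.append(rel - b)  # R2(A, rest)
    return done


fds = [
    (frozenset({"sid"}), frozenset({"name", "age", "gpa"})),
    (frozenset({"gpa"}), frozenset({"percentile"})),
]
result = bcnf_decompose({"sid", "name", "age", "gpa", "percentile"}, fds)
```

For the Student example this yields two relations, one with gpa and percentile and one with sid, name, age and gpa. Both sides of every split share A, which determines all of R1, and that is why the natural join of the pieces is lossless.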
  44. 3NF
      Let R be a relation schema, X a subset of attributes of R and a an attribute of R.
      R is in Third Normal Form if for every FD X ⟶ {a} that holds over R, one of the following is true:
      • a ∊ X, that is, it is a trivial FD, or
      • X is a superkey, or
      • a is part of some key for R
      The definition of 3NF is similar to that of BCNF, with the difference that a may be part of a key for R
  45. Shortcomings of BCNF and 4NF
      • Dependency enforcement may require joins
      • Query workload, due to excessive joins
      • Over-decomposition
  46. Relational Algebra
  47. Selection and Projection
      • Relational algebra provides operators to select rows (σ) and to project columns (π) from a relation
      • These operations act on a single relation
      Examples:

      Student
      sid   name          age  gpa
      1234  Peter Parker  21   4.0
      2345  Tony Stark    15   4.0
      3456  Bruce Wayne   23   3.5

      σ age<20 (Student)
      sid   name          age  gpa
      2345  Tony Stark    15   4.0

      π name,gpa (Student)
      name          gpa
      Peter Parker  4.0
      Tony Stark    4.0
      Bruce Wayne   3.5
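The two operators can be sketched over rows represented as dictionaries. The function names `select` and `project` are hypothetical. Projection removes duplicates, since its result is again a relation.

```python
def select(rows, predicate):
    """Selection: keep the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


def project(rows, fields):
    """Projection: keep only the given fields and drop duplicate tuples."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[f] for f in fields)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(fields, values)))
    return result


students = [
    {"sid": "1234", "name": "Peter Parker", "age": 21, "gpa": 4.0},
    {"sid": "2345", "name": "Tony Stark", "age": 15, "gpa": 4.0},
    {"sid": "3456", "name": "Bruce Wayne", "age": 23, "gpa": 3.5},
]

under_20 = select(students, lambda row: row["age"] < 20)
name_gpa = project(students, ["name", "gpa"])
gpa_only = project(students, ["gpa"])  # the two 4.0 rows collapse into one
```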
  48. Joins
      • Join is one of the most useful operators in relational algebra and is most commonly used to combine/reassemble information from two or more relations
      • Join is conceptually a cross product followed by a selection and projection
  49. Condition Joins
      • Condition joins are the most general form of joins. This operation takes a condition c and two relations R and S, and is defined as follows:
          R ⋈c S = σc(R × S)
  50. Equijoin
      • Equijoin is a special case of the Condition Join, where the condition consists only of attribute equalities
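The definition of the condition join translates almost literally into code: build the cross product, then filter it with the condition. The sketch below returns pairs of rows to avoid renaming clashing attributes; `condition_join` is a hypothetical name, and the data is the Students and Courses example used earlier.

```python
from itertools import product


def condition_join(r, s, condition):
    """Condition join: select from the cross product R x S the pairs satisfying c."""
    return [(a, b) for a, b in product(r, s) if condition(a, b)]


students = [
    {"sid": "1234", "name": "Peter Parker", "age": 21, "gpa": 4.0},
    {"sid": "2345", "name": "Tony Stark", "age": 15, "gpa": 4.0},
    {"sid": "3456", "name": "Bruce Wayne", "age": 23, "gpa": 3.5},
]
courses = [
    {"cid": "Physics303", "sid": "1234", "grade": "A+"},
    {"cid": "Robotics323", "sid": "2345", "grade": "A+"},
    {"cid": "Calculus343", "sid": "2345", "grade": "A"},
]

# Equijoin: the condition is a plain attribute equality.
enrolled = condition_join(students, courses, lambda a, b: a["sid"] == b["sid"])

# A more general condition mixing an equality with a comparison.
young_enrolled = condition_join(
    students, courses, lambda a, b: a["sid"] == b["sid"] and a["age"] < 20
)
```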
  51. Natural Join
      • A Natural Join is a special Equijoin that operates on all the attributes having the same name in R and S
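A natural join can be sketched by finding the attribute names shared by the two relations and merging the rows that agree on all of them. `natural_join` is a hypothetical helper and assumes every row of a relation has the same fields.

```python
def natural_join(r, s):
    """Natural join: merge rows of R and S that agree on all shared attribute names."""
    if not r or not s:
        return []
    common = sorted(set(r[0]) & set(s[0]))
    return [
        {**a, **b}
        for a in r
        for b in s
        if all(a[c] == b[c] for c in common)
    ]


students = [
    {"sid": "1234", "name": "Peter Parker", "age": 21, "gpa": 4.0},
    {"sid": "2345", "name": "Tony Stark", "age": 15, "gpa": 4.0},
    {"sid": "3456", "name": "Bruce Wayne", "age": 23, "gpa": 3.5},
]
courses = [
    {"cid": "Physics303", "sid": "1234", "grade": "A+"},
    {"cid": "Robotics323", "sid": "2345", "grade": "A+"},
    {"cid": "Calculus343", "sid": "2345", "grade": "A"},
]

joined = natural_join(students, courses)  # the only shared attribute is sid
```

Unlike the condition join, the shared attribute appears only once in each result row.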
  52. Back to DDS
  53. Relational Design in DDS
      • Start by identifying coarse relations and properties of data
      • Start decomposing based on properties (can use UML for this)
      • Apply a normal form
        - Functional Dependencies => Boyce-Codd Normal Form
        - Multivalued Dependencies => Fourth Normal Form
      • Define QoS for the resulting relations and further decompose if you run into a QoS Mix (more later)
  54. Relational Algebra
      • DDS supports:
        - Selection for a given Topic via DDS queries and filters
        - Projections and Conditional Joins across multiple Topics via MultiTopics
      • DDS uses a subset of SQL-92 to express selections, projections and joins
  55. DDS Specific Decomposition
      • In some instances you may find that a topic (relation) R has two disjoint sets of attributes X and Y that have conflicting temporal, reliability or durability requirements
      • In this case the relation has to be further decomposed
  56. Frequency Mix
      • Suppose you have a relation R(K, X, Y) where the set of attributes X changes far more frequently than the set of attributes Y (e.g. position vs. velocity)
      • In this case you should decompose the relation R into:
          R1(K, X), R2(K, Y)
      • This will reduce the resource usage in your system, e.g. bandwidth as well as CPU, but may introduce consistency issues. If consistency is essential then coherent updates should be used to atomically update R1 and R2
  57. Reliability Mix
      • Suppose you have a relation R(K, X, Y) where the set of attributes Y represents some soft state
      • In this case you should decompose the relation R into:
          R1(K, X), R2(K, Y)
      • This decomposition makes it possible to use reliable distribution only for R1 and best-effort for R2, thus reducing resource usage in the system
  58. Durability Mix
      • Suppose you have a relation R(K, X, Y) where the set of attributes X requires a different durability than the set of attributes Y, e.g. X needs to be persistent while Y is volatile
      • In this case you should decompose the relation R into:
          R1(K, X), R2(K, Y)
      • This will reduce the resource usage in your system and reduce the pressure on the Durability Service
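All three mixes use the same decomposition of R(K, X, Y) into R1(K, X) and R2(K, Y). The sketch below uses hypothetical rows and no DDS API. It shows that the split is lossless because both parts keep the key K, and that an update of X then touches only R1.

```python
# Hypothetical instance of R(K, X, Y); K is the key.
r = [
    {"K": 1, "X": "pos-a", "Y": "cfg-1"},
    {"K": 2, "X": "pos-b", "Y": "cfg-2"},
]

# Decompose into R1(K, X) and R2(K, Y).
r1 = [{"K": row["K"], "X": row["X"]} for row in r]
r2 = [{"K": row["K"], "Y": row["Y"]} for row in r]

# Joining on the shared key K rebuilds the original relation (lossless join).
rejoined = [{**a, **b} for a in r1 for b in r2 if a["K"] == b["K"]]

# An update of the fast-changing attribute X touches R1 only; R2 is unchanged.
r1_updated = [
    {**row, "X": "pos-c"} if row["K"] == 1 else row
    for row in r1
]
rejoined_after_update = [
    {**a, **b} for a in r1_updated for b in r2 if a["K"] == b["K"]
]
```

In a DDS system R1 and R2 would be separate topics, each with the QoS that suits its attributes.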
  59. Summing Up
  60. Concluding Remarks
      • The relational model provides the right set of tools for designing DDS-based systems
      • DDS Topics are relations and DDS supports a subset of relational algebra to manipulate these relations (topics)
      • The design process is as follows:
        - Start modelling your system using the UML Data Modelling subset
        - Ensure your model is in BCNF or 4NF; make sure you understand why some violations are necessary/desirable for your system
        - Add QoS to your relations
        - Evaluate if further decomposition is required due to QoS mixes, if your data model is properly normalised
  61. Learn More…
  62. Books
      • A First Course in Database Systems (3rd edition), Ullman and Widom
      • Database Management Systems (3rd edition), Ramakrishnan and Gehrke
  63. coursera.org
      • Jennifer Widom, Stanford, Introduction to Databases
        - A very good course on databases in general and specifically on relational data modelling
  64. Extras
  65. ER Modelling
  66. Entity Relationship (ER) Data Model
      • Relational Data Models are commonly expressed using some variation of Entity-Relationship (ER) Data Models
      • The ER Data Model is built around the concepts of entities, attributes and relationships (not to be confused with relations!)
  67. Entities, Attributes and Entity Sets
      • An entity is an object in the real world that is distinguishable from other objects
        - e.g. the iPhone, the Samsung Galaxy Note, etc.
      • An entity is described through a set of attributes
      • An entity set identifies a collection of similar entities
        - e.g., Mobile Phones
      • Each attribute associated with an entity set must identify its domain
      • An entity has a primary key and potentially several candidate keys
  68. Mapping
      • An entity set is mapped to a relation
      [Figure: Student entity set with attributes name, sid, age, gpa]

      Student Entity Set
      sid   name          age  gpa
      1234  Peter Parker  21   4.0
      2345  Tony Stark    15   4.0
      3456  Bruce Wayne   23   3.5

  69. Relationships
      • A relationship is an association between two or more entities
        - e.g., a student is enrolled in a course
      • A relationship can have descriptive attributes to record information about the relationship
  70. Mapping
      • A relationship set is mapped to a relation
      • The attributes of the resulting relation are:
        - the primary key of each participating entity, as foreign keys
        - descriptive attributes, as fields of the relation
      • The primary key of the resulting relation depends on the arity of the relationship
  71. Entity Hierarchies
      • In some cases it is natural to introduce (type) hierarchies among entities
      • These hierarchies are represented through the ISA relationship
      [Figure: Employees (name, ssn) with ISA subtypes HourlyEmpls (hoursWorked, hourlyWages) and ContractEmpls (contractId)]
  72. Mapping
      ISA relationships can be mapped in two ways
      • Map each entity to a distinct relation
      • Create only relations for the concrete types
      Notice that while the first approach is always applicable, the second is not
