Data Modeling Fundamentals
Version 1.1
Cristi Salcescu
Subjects
• Relational Modeling
• Dimensional Modeling
• Object Modeling
What is data modeling?
• Apply structure
• Organize
Relational Modeling
• Tables
– Columns and
– Rows

• Keys
– Primary Key
– Foreign key (Referential Integrity)
– Surrogate Key
– Composite Key: a key that contains more than one column
Types of Relations
• One-to-Many
• Many-to-One
• Many-to-Many
• One-to-One
• Recursive
One-to-Many
• Persons: Id, LastName, FirstName
• Policies: Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser
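A minimal sketch of the one-to-many relation between Persons and Policies, using Python's built-in sqlite3 module. Only a subset of the columns is used, and the sample rows and the helper insert_orphan_policy are assumptions made for illustration. The foreign key IdPerson is what ties many policies to one person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Persons (
        Id        INTEGER PRIMARY KEY,
        LastName  TEXT NOT NULL,
        FirstName TEXT NOT NULL
    );
    CREATE TABLE Policies (
        Id       INTEGER PRIMARY KEY,
        Serial   TEXT,
        Number   TEXT,
        IdPerson INTEGER NOT NULL REFERENCES Persons(Id)
    );
""")

# One person (hypothetical sample data) holding two policies.
conn.execute("INSERT INTO Persons (Id, LastName, FirstName) VALUES (1, 'Doe', 'Jane')")
conn.executemany(
    "INSERT INTO Policies (Serial, Number, IdPerson) VALUES (?, ?, ?)",
    [("A", "001", 1), ("A", "002", 1)],
)

policy_count = conn.execute(
    "SELECT COUNT(*) FROM Policies WHERE IdPerson = 1"
).fetchone()[0]


def insert_orphan_policy():
    """Try to insert a policy whose person does not exist."""
    conn.execute(
        "INSERT INTO Policies (Serial, Number, IdPerson) VALUES ('B', '003', 99)"
    )
```

With referential integrity enabled, calling insert_orphan_policy() is rejected with sqlite3.IntegrityError, because no person with Id 99 exists.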
Many-to-Many
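A many-to-many relation is usually stored with a junction table whose composite key pairs the two sides. The sketch below assumes a hypothetical Risks table and a hypothetical PolicyRisks junction table; neither name comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Policies (Id INTEGER PRIMARY KEY, Number TEXT);
    CREATE TABLE Risks    (Id INTEGER PRIMARY KEY, Name TEXT);
    -- Junction table: the composite primary key pairs one policy with one risk.
    CREATE TABLE PolicyRisks (
        IdPolicy INTEGER NOT NULL REFERENCES Policies(Id),
        IdRisk   INTEGER NOT NULL REFERENCES Risks(Id),
        PRIMARY KEY (IdPolicy, IdRisk)
    );
    INSERT INTO Policies VALUES (1, '001'), (2, '002');
    INSERT INTO Risks VALUES (1, 'Fire'), (2, 'Theft');
    INSERT INTO PolicyRisks VALUES (1, 1), (1, 2), (2, 1);
""")

# One policy covers many risks ...
risks_of_policy_1 = [row[0] for row in conn.execute(
    "SELECT r.Name FROM Risks r "
    "JOIN PolicyRisks pr ON pr.IdRisk = r.Id "
    "WHERE pr.IdPolicy = 1 ORDER BY r.Name"
)]

# ... and one risk appears on many policies.
policies_with_fire = [row[0] for row in conn.execute(
    "SELECT p.Number FROM Policies p "
    "JOIN PolicyRisks pr ON pr.IdPolicy = p.Id "
    "WHERE pr.IdRisk = 1 ORDER BY p.Number"
)]
```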
One-to-One
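• Policies: Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser
• PoliciesHousehold: Id, IdAddress, Age, Surface, RoomsNo
• PoliciesMotor: Id, ConstructionYear, CylCap, ChassisNo, PlateNo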
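One common way to get a one-to-one relation is to let the detail table reuse the parent's primary key. The sketch below assumes that this is how Policies and PoliciesHousehold are linked; the slides do not state it, and the sample values and the helper add_second_household_row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Policies (Id INTEGER PRIMARY KEY, Number TEXT);
    -- The detail table's primary key is also a foreign key to Policies,
    -- so each policy can have at most one household row.
    CREATE TABLE PoliciesHousehold (
        Id      INTEGER PRIMARY KEY REFERENCES Policies(Id),
        Surface REAL,
        RoomsNo INTEGER
    );
    INSERT INTO Policies VALUES (1, '001');
    INSERT INTO PoliciesHousehold VALUES (1, 85.0, 3);
""")

row = conn.execute(
    "SELECT p.Number, h.Surface, h.RoomsNo "
    "FROM Policies p JOIN PoliciesHousehold h ON h.Id = p.Id"
).fetchone()


def add_second_household_row():
    """A second detail row for the same policy violates the primary key."""
    conn.execute("INSERT INTO PoliciesHousehold VALUES (1, 40.0, 2)")
```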
Many-to-One
Self-Referencing

_Categories
IdCategory
Name
IdParent
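A self-referencing table can be walked with a recursive query. The sketch below uses the _Categories columns from the slide; the category names are hypothetical sample data, and a NULL IdParent is assumed to mark a root category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE _Categories (
        IdCategory INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        IdParent   INTEGER REFERENCES _Categories(IdCategory)  -- NULL for a root
    );
    INSERT INTO _Categories VALUES
        (1, 'Insurance', NULL),
        (2, 'Motor', 1),
        (3, 'Home', 1),
        (4, 'Truck', 2);
""")

# Collect category 2 and everything below it by following IdParent links.
subtree_names = [row[0] for row in conn.execute("""
    WITH RECURSIVE subtree(IdCategory, Name) AS (
        SELECT IdCategory, Name FROM _Categories WHERE IdCategory = ?
        UNION ALL
        SELECT c.IdCategory, c.Name
        FROM _Categories c JOIN subtree s ON c.IdParent = s.IdCategory
    )
    SELECT Name FROM subtree ORDER BY IdCategory
""", (2,))]
```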
Normalization
• creates granularity
• removes duplication
• is a set of cumulative rules (Normal Forms): 1st, 2nd, 3rd Normal Form
• good for saving space, but I/O costs are cheap
• bad for performance: Joins
1st Normal Form
• creates Many-to-One relation
• removes duplication that occurs horizontally
2nd Normal Form
• Creates One-to-Many relation
• removes duplication that occurs vertically
3rd Normal Form
• Creates Many-to-Many relation
4th Normal Form
• Creates a One-to-One relation
• Separates NULL values
Insurance Policies - Car, Home and Life
Resources
Library of data models
Normalization
SqlRelationship
OLTP vs OLAP
• OLTP : On-line Transaction Processing
• OLAP : On-line Analytical Processing
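Why does the Relational Model fail for Reporting?
• too granular
• high concurrency (lots of users sharing small pieces at the same time)
• too many tables: Joins are too big, SQL code too slow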
OLTP
– recent data
– daily basis
– hundreds to millions of users
– high concurrency
– designed for working with a single record/entity at a time
– highly “normalized”
– getting data for a report involves many joins
OLAP
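– huge amount of (historical) data
– high-speed access to huge amounts of data
– access many tables
– low concurrency: few users (top executives)
– the number of tables is reduced, reducing the number of joins
– data is de-normalized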
Dimensional Modeling
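• Data Warehouse
– A gigantic storehouse of data
– All data
– Provides long-term storage of data
– Aggregation of data from multiple systems
– Reduces the load on the production system
• Facts
– Transactional information
– Hold numeric measures
• Dimensions
– Hold the values that describe facts
– Static information, or slowly changing
– Answer questions like: who, what, when, where?
– Look-up values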
Fact table example
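The original slide shows a diagram that is not available here. As a stand-in, the sketch below builds a small fact table with two dimensions; all table and column names (FactPolicySales, DimDate, DimPolicyType, Premium) and the sample rows are assumptions, not the slide's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions describe the facts (when, what).
    CREATE TABLE DimDate       (IdDate INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);
    CREATE TABLE DimPolicyType (IdPolicyType INTEGER PRIMARY KEY, Name TEXT);
    -- Fact rows hold keys to the dimensions plus a numeric measure.
    CREATE TABLE FactPolicySales (
        IdDate       INTEGER REFERENCES DimDate(IdDate),
        IdPolicyType INTEGER REFERENCES DimPolicyType(IdPolicyType),
        Premium      REAL
    );
    INSERT INTO DimDate VALUES (1, 2013, 1), (2, 2013, 2);
    INSERT INTO DimPolicyType VALUES (1, 'Car'), (2, 'Home');
    INSERT INTO FactPolicySales VALUES (1, 1, 100.0), (1, 2, 250.0), (2, 1, 120.0);
""")

# A typical reporting query: sum the measure, grouped by a dimension value.
premium_by_type = dict(conn.execute("""
    SELECT t.Name, SUM(f.Premium)
    FROM FactPolicySales f
    JOIN DimPolicyType t ON t.IdPolicyType = f.IdPolicyType
    GROUP BY t.Name
"""))
```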
Denormalization
• removing Normal Forms
• removes granularity
• uses lots of space: I/O costs
• good for performance
• reduces the number of Joins
• good for large databases
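A small sketch of the trade-off, assuming a hypothetical PoliciesFlat table built from Persons and Policies: the flat table repeats the person's name on every policy row (more space), but the report no longer needs a Join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons  (Id INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE Policies (Id INTEGER PRIMARY KEY, Number TEXT, IdPerson INTEGER);
    INSERT INTO Persons VALUES (1, 'Doe');
    INSERT INTO Policies VALUES (10, '001', 1), (11, '002', 1);
    -- Denormalized copy: LastName is duplicated on each policy row.
    CREATE TABLE PoliciesFlat AS
        SELECT p.Id, p.Number, pe.LastName
        FROM Policies p JOIN Persons pe ON pe.Id = p.IdPerson;
""")

# Normalized model: the report needs a join.
normalized_report = conn.execute(
    "SELECT p.Number, pe.LastName "
    "FROM Policies p JOIN Persons pe ON pe.Id = p.IdPerson ORDER BY p.Id"
).fetchall()

# Denormalized model: the same report reads a single table.
flat_report = conn.execute(
    "SELECT Number, LastName FROM PoliciesFlat ORDER BY Id"
).fetchall()
```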
3rd Normal Form
Denormalized
Relational Model
Denormalize facts tables
Snowflake Schema
Star Schema
Resources
• http://oracledba.ezpowell.com/oracle/papers/TheVeryBasicsOfDataWarehouseDesign.htm
Object Modeling
• a layer of objects that model the business area
you're working in
UML
Unified Modeling Language
The most basic of UML diagrams is the Class Diagram. It describes classes and shows the relationships among them.
Types of Relations
• Inheritance
• Association
• Aggregation
• Composition
Inheritance
• A generalizes B
• B derives from A
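In code, inheritance between the two classes could look like the sketch below. The method names describe and extra are hypothetical and only serve to show what B inherits and what it adds.

```python
class A:
    """The general (base) class."""

    def describe(self) -> str:
        return "I am an A"


class B(A):
    """The specialized class: B derives from A and inherits describe()."""

    def extra(self) -> str:
        return "only B can do this"


b = B()
inherited_text = b.describe()  # defined on A, available on B
```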
Association
• A uses B as a:
– Class field
– Method parameter
– Method return type
– Local variable
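The four ways in which A can use B can be sketched in one small class. The attribute and method names (value, read, make, compute) are hypothetical.

```python
class B:
    def __init__(self, value: int = 0) -> None:
        self.value = value


class A:
    def __init__(self, b: B) -> None:
        self.b = b                       # B as a class field

    def read(self, other: B) -> int:     # B as a method parameter
        return other.value

    def make(self) -> B:                 # B as a method return type
        return B(self.b.value + 1)

    def compute(self) -> int:
        local_b = B(10)                  # B as a local variable
        return local_b.value + self.b.value


a = A(B(1))
```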
Aggregation
• Shared Association
• A aggregates B
• B is part of A
• Example: Airport, Aircraft
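A sketch of aggregation with the Airport and Aircraft classes: the airport holds references to aircraft that are created elsewhere and can move to another airport or outlive it. The attribute and method names (tail_number, code, land, depart) and the sample values are assumptions.

```python
class Aircraft:
    def __init__(self, tail_number: str) -> None:
        self.tail_number = tail_number


class Airport:
    def __init__(self, code: str) -> None:
        self.code = code
        self.aircraft = []  # references to aircraft owned elsewhere

    def land(self, plane: Aircraft) -> None:
        self.aircraft.append(plane)

    def depart(self, plane: Aircraft) -> None:
        self.aircraft.remove(plane)


# The aircraft is created outside any airport and is shared between them.
plane = Aircraft("YR-ABC")
first, second = Airport("OTP"), Airport("CLJ")
first.land(plane)
first.depart(plane)
second.land(plane)
del first  # the aircraft still exists after the first airport is gone
```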
Composition
• Not-Shared Association
• A is composed of B
• Example: Person, Leg
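A sketch of composition with the Person and Leg classes: the person creates its own legs, and no other object shares them. The attribute names (side, name, _legs) and the method leg_sides are hypothetical.

```python
class Leg:
    def __init__(self, side: str) -> None:
        self.side = side


class Person:
    def __init__(self, name: str) -> None:
        self.name = name
        # The person creates and owns its legs; they are not handed in
        # from outside and are not shared with any other object.
        self._legs = (Leg("left"), Leg("right"))

    def leg_sides(self) -> list:
        return [leg.side for leg in self._legs]


person = Person("Ann")
sides = person.leg_sides()
```

The design difference from aggregation is where the part is created: here inside the owner's constructor, there outside of it.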
Domain Layer
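– Introduced by Eric Evans in his book “Domain-Driven Design: Tackling Complexity in the Heart of Software” (2003)
– Entities
• An object that is not defined by its attributes, but rather by its identity
– Value Objects
• An object that contains attributes but has no conceptual identity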
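The difference between an entity and a value object shows up in how equality is defined. The sketch below assumes a hypothetical Address value object and a Person entity compared by its Id; it is one possible reading of the two definitions, not code from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Value object: two addresses with the same attributes are equal."""
    street: str
    city: str


class Person:
    """Entity: two persons are the same only if their identity matches."""

    def __init__(self, id: int, last_name: str) -> None:
        self.id = id
        self.last_name = last_name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Person) and self.id == other.id

    def __hash__(self) -> int:
        return hash(self.id)


same_address = Address("Main St", "Cluj") == Address("Main St", "Cluj")
same_person = Person(1, "Doe") == Person(1, "Smith")   # same identity, new name
other_person = Person(1, "Doe") == Person(2, "Doe")    # same name, other identity
```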
Insurance – Relational Model
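• Policies: Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser
• Persons: Id, LastName, FirstName
• PoliciesHousehold: Id, IdAddress, Age, Surface, RoomsNo
• PoliciesMotor: Id, ConstructionYear, CylCap, ChassisNo, PlateNo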
Insurance – Object Model
Resources
• http://aviadezra.blogspot.com/2009/05/umlassociation-aggregation-composition.html
Data Flow between the 3 Models
ORM/ ETL
• ORM (Object-relational mapping): http://www.agiledata.org/essays/mappingObjects.html
• ETL (Extract, transform and load)
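Both ideas can be sketched by hand in a few lines, without a real ORM or ETL tool. The Policy class, the Premium and Year columns and the FactPremiumByYear table are assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Policy:
    id: int
    number: str
    premium: float


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Policies (Id INTEGER PRIMARY KEY, Number TEXT, Premium REAL, Year INTEGER);
    INSERT INTO Policies VALUES
        (1, '001', 100.0, 2012), (2, '002', 250.0, 2013), (3, '003', 50.0, 2013);
    CREATE TABLE FactPremiumByYear (Year INTEGER PRIMARY KEY, TotalPremium REAL);
""")

# ORM idea: map each relational row to an object of the object model.
policies = [
    Policy(*row)
    for row in conn.execute("SELECT Id, Number, Premium FROM Policies ORDER BY Id")
]

# ETL idea: extract rows from the relational model ...
rows = conn.execute("SELECT Year, Premium FROM Policies").fetchall()
# ... transform them (here: total premium per year) ...
totals = {}
for year, premium in rows:
    totals[year] = totals.get(year, 0.0) + premium
# ... and load the result into a table of the dimensional model.
conn.executemany("INSERT INTO FactPremiumByYear VALUES (?, ?)", sorted(totals.items()))

loaded = conn.execute(
    "SELECT Year, TotalPremium FROM FactPremiumByYear ORDER BY Year"
).fetchall()
```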
Summary
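• Relational Modeling
– Tables (columns, rows)
– Types of Relations
– Normal Forms

• Dimensional Modeling
– Facts and Dimensions
– De-Normalization

• Object Modeling
– Entities and Value Objects
– Inheritance, Aggregation, Association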
Resources
• VTC – Data Modeling
• Pluralsight - Introduction to Data Warehousing

Data modeling fundamentals


Published in: Technology

Data modeling fundamentals

  1. 1. Data Modeling Fundamentals Version 1.1 Cristi Salcescu
  2. 2. Subjects • Relational Modeling • Dimensional Modeling • Object Modeling
  3. 3. What is data modeling? • Apply structure • Organize
  4. 4. Relational Modeling • Tables – Columns and – Rows • Keys – Primary Key – Foreign key (Referential Integrity) – Surrogate Key – Composite Key • is a key that contains more than one column
  5. 5. Types of Relations • One-to-Many • Many-to-One • Many-to-Many • One-to-One • Recursive
  6. 6. One-to-Many • Persons: Id, LastName, FirstName • Policies: Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser
  7. 7. Many-to-Many
  8. 8. One-to-One • Policies: Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser • PoliciesHousehold: Id, IdAddress, Age, Surface, RoomsNo • PoliciesMotor: Id, ConstructionYear, CylCap, ChassisNo, PlateNo
  9. 9. Many-to-One
  10. 10. Self-Referencing _Categories IdCategory Name IdParent
  11. 11. Normalization • creates granularity • removes duplication • is a set of cumulative rules (Normal Forms): 1st, 2nd, 3rd Normal Form • good for saving space, but I/O costs are cheap • bad for performance: Joins
  12. 12. 1st Normal Form • creates Many-to-One relation • removes duplication that occurs horizontally
  13. 13. 2nd Normal Form • Creates One-to-Many relation • removes duplication that occurs vertically
  14. 14. 3rd Normal Form • Creates Many-to-Many relation
  15. 15. 4th Normal Form • Creates a One-to-One relation • Separates NULL values
  16. 16. Insurance Policies - Car, Home and Life
  17. 17. Resources Library of data models Normalization SqlRelationship
  18. 18. OLTP vs OLAP • OLTP : On-line Transaction Processing • OLAP : On-line Analytical Processing Why Relational Model fails for Reporting? • too granular • high concurrency (lots of users sharing small pieces at the same time) • too many tables : Joins are too big, SQL code too slow
  19. 19. OLTP – recent data – daily basis – hundreds to millions of users – high concurrency – designed for working with a single record/entity at a time – highly “normalized” – getting data for a report involves many joins
  20. 20. OLAP – huge amount of (historical) data – high-speed access to huge amounts of data – access many tables – low concurrency: few users (top executives) – the number of tables is reduced, reducing the number of joins – data is de-normalized
  21. 21. Dimensional Modeling • Data Warehouse – A gigantic storehouse of data – All data – Provides long-term storage of data – Aggregation of data from multiple systems – Reduces the load on the production system • Facts – Transactional information – Hold numeric measures • Dimensions – Hold the values that describe facts – Static information, or slowly changing – Answer questions like: who, what, when, where? – Look-up values
  22. 22. Fact table example
  23. 23. Denormalization • removing Normal Forms • removes granularity • uses lots of space: I/O costs • good for performance • reduces the number of Joins • good for large databases
  24. 24. 3rd Normal Form
  25. 25. Denormalized
  26. 26. Relational Model
  27. 27. Denormalize facts tables
  28. 28. Snowflake Schema
  29. 29. Star Schema
  30. 30. Resources • http://oracledba.ezpowell.com/oracle/papers/TheVeryBasicsOfDataWarehouseDesign.htm
  31. 31. Object Modeling • a layer of objects that model the business area you're working in
  32. 32. UML Unified Modeling Language The most basic of UML diagrams is the Class Diagram. It describes classes and shows the relationships among them.
  33. 33. Types of Relations • Inheritance • Association • Aggregation • Composition
  34. 34. Inheritance • A generalizes B • B derives from A
  35. 35. Association • A uses B as a: – Class field – Method parameter – Method return type – Local variable
  36. 36. Aggregation • Shared Association • A aggregates B • B is part of A • Example: Airport, Aircraft
  37. 37. Composition • Not-Shared Association • A is composed of B • Example: Person, Leg
  38. 38. Domain Layer – Introduced by Eric Evans in his book “Domain-Driven Design: Tackling Complexity in the Heart of Software” (2003) – Entities • An object that is not defined by its attributes, but rather by its identity – Value Objects • An object that contains attributes but has no conceptual identity
  39. 39. Insurance – Relational Model • Policies: Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser • Persons: Id, LastName, FirstName • PoliciesHousehold: Id, IdAddress, Age, Surface, RoomsNo • PoliciesMotor: Id, ConstructionYear, CylCap, ChassisNo, PlateNo
  40. 40. Insurance – Object Model
  41. 41. Resources • http://aviadezra.blogspot.com/2009/05/umlassociation-aggregation-composition.html
  42. 42. Data Flow between the 3 Models
  43. 43. ORM/ ETL • ORM (Object-relational mapping): http://www.agiledata.org/essays/mappingObjects.html • ETL (Extract, transform and load)
  44. 44. Summary • Relational Modeling – Tables (columns, rows) – Types of Relations – Normal Forms • Dimensional Modeling – Facts and Dimensions – De-Normalization • Object Modeling – Entities and Value Objects – Inheritance, Aggregation, Association
  45. 45. Resources • VTC – Data Modeling • Pluralsight - Introduction to Data Warehousing
