ETIS08 - Process Neutral Data Modelling Concepts - Presentation

  1. Data Management & Warehousing
     PROCESS NEUTRAL DATA MODELLING CONCEPTS
     DAVID M WALKER
     ETIS COMMUNITY GATHERING, 13-14 NOVEMBER 2008, BRUSSELS
     © 2008 Data Management & Warehousing. ETIS Community Gathering, Brussels. David M Walker, 14 November 2008.
  2. Agenda
     • The Issues With Conventional Data Warehouse Data Models
     • Assumptions About The Data Model To Be Constructed
     • Requirements Of A Data Warehouse Data Model
     • Constructing The Data Warehouse Data Model
  3. THE ISSUES WITH CONVENTIONAL DATA WAREHOUSE DATA MODELS
  4. Issues
     • Data models take a long time to develop
     • Data models are expensive to change
         – Affects Source -> Data Warehouse ETL
         – Affects Data Warehouse -> Data Mart ETL
     • The design often reflects the first or largest source system
         – This makes it difficult to add other systems
     • They often reflect current working practice
         – Making it difficult to change when the business does
  5. Issues
     • A struggle to keep up with rapidly changing source system data models
     • Reference data is often not stored in a time variant way
     • History is lost with data model changes
     • Queries directly on the data warehouse are complex
     • Different rules apply to query each table
     • Different database platforms have different needs
  6. ASSUMPTIONS ABOUT THE DATA MODEL TO BE CONSTRUCTED
  7. Assumptions
     • Used in the data warehouse
         – Not in the operational systems or the data marts
         – Different style of modelling required
     • Users are not going to query the data model
         – Users will query separate dependent data marts
     • Data will be extracted from the model to populate the data marts by ETL tools
     • Data will be loaded into the model from the source systems by ETL tools
  8. Assumptions
     • Direct updates will be prohibited
         – A separate application or applications will exist as a surrogate source and ETL will be used to load the data
     • Not a ‘mixed mode’ database
         – Some parts using one data modelling convention and other parts using another
         – This is bad practice with any modelling technique!
  9. REQUIREMENTS OF A DATA WAREHOUSE DATA MODEL
  10. Requirements
      • Uses A Design Pattern
          – General reusable approaches and solutions to commonly occurring problems that can be used in many different situations
      • Convention Over Configuration
          – Decrease the number of decisions that designers / developers need to make, gaining simplicity without losing flexibility
          – Achieved by ensuring that tables and columns use standard structures, naming conventions, etc. and are populated and queried in a consistent fashion
  11. Requirements
      • DRY (Don’t Repeat Yourself)
          – Reduce duplication because it:
              • Increases the difficulty of changing the model
              • Decreases the clarity of the model
              • Leads to opportunities for inconsistency
      • Static over a long period of time
          – No need to add or modify tables on a regular basis
          – Note: There is a difference between designed and implemented. It is possible to have designed a table but not to implement it until it is actually required
  12. Requirements
      • The data model should store data at the lowest possible level
          – Information stored at the transaction level
          – Avoid the storage of aggregates
      • Supports the best use of platform specific features without compromising the design
          – Where available supports:
              • Partitioning
              • Column Storage
              • Many Insert/Few Update strategies
  13. Requirements
      • Completely time-variant
          – It should be possible to reconstruct all information at any point in time
      • Communication tool
          – Aids the refinement of requirements
          – Aids the explanation of possibilities
          – Develops confidence from the user
  14. Requirements
      • Uses Standard BI Relational Databases
          – Ensure that the solution can be deployed on any current platform and, if necessary, re-deployed on a future platform
      • Process Neutral
          – It will not reflect past, current or planned business processes, practices or dependencies
          – Stores the data items and relationships as defined by their use at the point in time when the information is created and acquired
  15. CONSTRUCTING THE DATA WAREHOUSE DATA MODEL
  16. Who is the customer?
      • Everyone has a different definition
      • Everyone needs different information
      • Users have conflicting definitions
      • Customers can be individuals or businesses
  17. More problems …
      • Some of the customers are suppliers as well
      • Some businesses have separate divisions that have to be handled separately
      • Some customers interact with different divisions within our organisation
      • Some individuals or organisations also perform other roles
          – e.g. legal, re-sellers, partners, etc.
  18. The Party
      • These problems arise because the data is being looked at in terms of current business process
      • In fact there is no customer entity, just different types of party
          – Individuals, Organisations, Organisational Units
          – Concept of Party identical to that in contract law
      • The role of customer is defined not by the table definition but by the usage of party data with other information held (e.g. the purchase transaction relating to a product)
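The idea that "customer" is a role derived from usage, not a table, can be illustrated with a small sketch. It assumes a hypothetical `PARTY_PRODUCT_HISTORY` table with a `HISTORY_TYPE` column. Neither that table layout nor the sample rows come from the slides; they are made up for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

def build_demo():
    """Create two hypothetical tables and a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PARTIES (PARTY_DWK INTEGER PRIMARY KEY, PARTY_NAME TEXT);
        -- Hypothetical history table linking a party to a product transaction.
        CREATE TABLE PARTY_PRODUCT_HISTORY (
            PARTY_DWK    INTEGER REFERENCES PARTIES (PARTY_DWK),
            PRODUCT_NAME TEXT,
            HISTORY_TYPE TEXT,   -- assumed values: 'PURCHASE' or 'SUPPLY'
            HISTORY_DATE TEXT
        );
    """)
    conn.executemany("INSERT INTO PARTIES VALUES (?, ?)",
                     [(1, "Example Individual"), (2, "Example Organisation")])
    conn.executemany("INSERT INTO PARTY_PRODUCT_HISTORY VALUES (?, ?, ?, ?)", [
        (1, "Broadband", "PURCHASE", "2008-01-10"),
        (2, "Broadband", "PURCHASE", "2008-02-01"),
        (2, "Handsets", "SUPPLY", "2008-03-15"),
    ])
    return conn

def parties_in_role(conn, history_type):
    """Return names of parties that play a role, judged only by their history rows."""
    rows = conn.execute(
        "SELECT DISTINCT p.PARTY_NAME FROM PARTIES p "
        "JOIN PARTY_PRODUCT_HISTORY h ON h.PARTY_DWK = p.PARTY_DWK "
        "WHERE h.HISTORY_TYPE = ? ORDER BY p.PARTY_NAME",
        (history_type,),
    ).fetchall()
    return [name for (name,) in rows]

conn = build_demo()
customers = parties_in_role(conn, "PURCHASE")
suppliers = parties_in_role(conn, "SUPPLY")
```

In this sketch the same party row appears as both customer and supplier without any change to the `PARTIES` table, which is the point the slide makes.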
  19. Attributes of Party
      • The attributes of ‘Party’ will be those that remain static over the life of the record
          – State ID Number, Name, Start Date, End Date
          – These attributes have ‘lifetime value’
      • Attributes that change need to be stored elsewhere
      • The Party table needs to be categorised or typed
          – Individual, Organisation, Organisation Unit
  20. PARTIES Data Model
      PARTIES
          • PARTY_DWK
          • PARTY_ID
          • PARTY_NAME
          • PARTY_START_DATE
          • PARTY_END_DATE
          • PARTY_TYPE_DWK
      PARTY_TYPES
          • PARTY_TYPE_DWK
          • PARTY_TYPE
          • PARTY_TYPE_DESC
          • PARTY_TYPE_GROUP
          • PARTY_TYPE_START_DATE
          • PARTY_TYPE_END_DATE
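As a rough sketch, the two tables on this slide could be declared as follows. The column names are taken from the slide. The data types, the primary and foreign keys, the use of NULL for an open-ended end date, the reading of `_DWK` as a surrogate warehouse key, and all sample rows are assumptions. SQLite is used only because it ships with Python.

```python
import sqlite3

# Column names follow the slide; types, keys and NULL end dates are assumptions.
DDL = """
CREATE TABLE PARTY_TYPES (
    PARTY_TYPE_DWK        INTEGER PRIMARY KEY,  -- assumed surrogate key
    PARTY_TYPE            TEXT NOT NULL,
    PARTY_TYPE_DESC       TEXT,
    PARTY_TYPE_GROUP      TEXT,
    PARTY_TYPE_START_DATE TEXT NOT NULL,
    PARTY_TYPE_END_DATE   TEXT                  -- NULL assumed to mean "still valid"
);
CREATE TABLE PARTIES (
    PARTY_DWK        INTEGER PRIMARY KEY,
    PARTY_ID         TEXT NOT NULL,
    PARTY_NAME       TEXT NOT NULL,
    PARTY_START_DATE TEXT NOT NULL,
    PARTY_END_DATE   TEXT,
    PARTY_TYPE_DWK   INTEGER NOT NULL REFERENCES PARTY_TYPES (PARTY_TYPE_DWK)
);
"""

def build_party_model():
    """Create the two tables in memory and load illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.executemany("INSERT INTO PARTY_TYPES VALUES (?, ?, ?, ?, ?, ?)", [
        (1, "Individual", "A person", "Party", "2008-01-01", None),
        (2, "Organisation", "A legal entity", "Party", "2008-01-01", None),
        (3, "Organisation Unit", "Part of an organisation", "Party", "2008-01-01", None),
    ])
    # IDs and dates below are made up.
    conn.executemany("INSERT INTO PARTIES VALUES (?, ?, ?, ?, ?, ?)", [
        (1, "P-0001", "David Walker", "2008-01-01", None, 1),
        (2, "P-0002", "Data Management & Warehousing", "2008-01-01", None, 2),
        (3, "P-0003", "Professional Services", "2008-01-01", None, 3),
    ])
    return conn

def parties_of_type(conn, party_type):
    """List party names for one category held in PARTY_TYPES."""
    rows = conn.execute(
        "SELECT p.PARTY_NAME FROM PARTIES p "
        "JOIN PARTY_TYPES t ON t.PARTY_TYPE_DWK = p.PARTY_TYPE_DWK "
        "WHERE t.PARTY_TYPE = ? ORDER BY p.PARTY_NAME",
        (party_type,),
    ).fetchall()
    return [name for (name,) in rows]
```

Adding a new category of party in this sketch is an insert into `PARTY_TYPES`, not a change to the table structure.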
  21. Supporting Non-Lifetime Attributes
      • Need to add data for different Party Types
          – Marital Status for Individuals
          – Number of Children for Individuals
          – Number of Employees for Organisations
          – Turnover for Organisations
      • Need to add data that changes over the lifetime of the party
          – Usually the same attributes that are needed for different Party Types
  22. PARTY_PROPERTIES Data Model
      PARTIES
          • PARTY_DWK
          • PARTY_ID
          • PARTY_NAME
          • PARTY_START_DATE
          • PARTY_END_DATE
          • PARTY_TYPE_DWK
      PARTY_PROPERTIES
          • PARTY_DWK
          • PARTY_PROPERTY_TYPE_DWK
          • PARTY_START_DATE
          • PARTY_END_DATE
          • PARTY_PROPERTY_VALUE
      PARTY_PROPERTY_TYPES
          • PARTY_PROPERTY_TYPE_DWK
          • PARTY_PROPERTY_TYPE
          • PARTY_PROPERTY_TYPE_DESC
          • PARTY_PROPERTY_TYPE_GROUP
          • PARTY_PROPERTY_TYPE_START_DATE
          • PARTY_PROPERTY_TYPE_END_DATE
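A minimal sketch of how a time-variant property might be read "as at" a given date from these tables. The column names follow the slide. The half-open interval rule (start inclusive, end exclusive), NULL as an open end date, ISO date strings and the sample rows are all assumptions made for illustration.

```python
import sqlite3

def build_properties_demo():
    """Create the property tables with made-up rows for one party."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PARTY_PROPERTY_TYPES (
            PARTY_PROPERTY_TYPE_DWK INTEGER PRIMARY KEY,
            PARTY_PROPERTY_TYPE     TEXT
        );
        CREATE TABLE PARTY_PROPERTIES (
            PARTY_DWK               INTEGER,
            PARTY_PROPERTY_TYPE_DWK INTEGER,
            PARTY_START_DATE        TEXT,
            PARTY_END_DATE          TEXT,   -- NULL assumed to mean "current"
            PARTY_PROPERTY_VALUE    TEXT
        );
    """)
    conn.execute("INSERT INTO PARTY_PROPERTY_TYPES VALUES (1, 'Marital Status')")
    conn.executemany("INSERT INTO PARTY_PROPERTIES VALUES (?, ?, ?, ?, ?)", [
        (1, 1, "1990-01-01", "2001-06-01", "Single"),
        (1, 1, "2001-06-01", None, "Married"),
    ])
    return conn

def property_as_at(conn, party_dwk, property_type, as_at):
    """Return the property value valid on `as_at` (ISO date), or None."""
    row = conn.execute(
        """
        SELECT pp.PARTY_PROPERTY_VALUE
        FROM PARTY_PROPERTIES pp
        JOIN PARTY_PROPERTY_TYPES pt
          ON pt.PARTY_PROPERTY_TYPE_DWK = pp.PARTY_PROPERTY_TYPE_DWK
        WHERE pp.PARTY_DWK = ?
          AND pt.PARTY_PROPERTY_TYPE = ?
          AND pp.PARTY_START_DATE <= ?
          AND (pp.PARTY_END_DATE IS NULL OR pp.PARTY_END_DATE > ?)
        """,
        (party_dwk, property_type, as_at, as_at),
    ).fetchone()
    return row[0] if row else None
```

A change of value is recorded here by closing the old row and inserting a new one, so earlier states remain available to queries.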
  23. Relationships between Parties
      • Parties have relationships
          – David Walker works in Professional Services
          – David Walker is employed by Data Management & Warehousing
          – David Walker is married to Helen Walker
      • This is known as a Peer-To-Peer relationship
      • This is the first place that we see a role defined by a relationship
  24. PARTY_LINKS Data Model
      PARTIES
          • PARTY_DWK
          • PARTY_ID
          • PARTY_NAME
          • PARTY_START_DATE
          • PARTY_END_DATE
          • PARTY_TYPE_DWK
      PARTY_LINKS
          • PARTY_DWK
          • LINKED_PARTY_DWK
          • PARTY_LINK_TYPE_DWK
          • PARTY_START_DATE
          • PARTY_END_DATE
      PARTY_LINK_TYPES
          • PARTY_LINK_TYPE_DWK
          • PARTY_LINK_TYPE
          • PARTY_LINK_TYPE_DESC
          • PARTY_LINK_TYPE_GROUP
          • PARTY_LINK_TYPE_START_DATE
          • PARTY_LINK_TYPE_END_DATE
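The peer-to-peer examples from the previous slide can be loaded into these tables as a sketch. Column names follow the slide. The link type codes (`WORKS_IN`, `EMPLOYED_BY`, `MARRIED_TO`), the start dates and the trimmed-down table definitions are assumptions.

```python
import sqlite3

def build_links_demo():
    """Load the three relationships named on the slide; dates are made up."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PARTIES (PARTY_DWK INTEGER PRIMARY KEY, PARTY_NAME TEXT);
        CREATE TABLE PARTY_LINK_TYPES (
            PARTY_LINK_TYPE_DWK INTEGER PRIMARY KEY,
            PARTY_LINK_TYPE     TEXT
        );
        CREATE TABLE PARTY_LINKS (
            PARTY_DWK           INTEGER REFERENCES PARTIES (PARTY_DWK),
            LINKED_PARTY_DWK    INTEGER REFERENCES PARTIES (PARTY_DWK),
            PARTY_LINK_TYPE_DWK INTEGER REFERENCES PARTY_LINK_TYPES (PARTY_LINK_TYPE_DWK),
            PARTY_START_DATE    TEXT,
            PARTY_END_DATE      TEXT
        );
    """)
    conn.executemany("INSERT INTO PARTIES VALUES (?, ?)", [
        (1, "David Walker"),
        (2, "Professional Services"),
        (3, "Data Management & Warehousing"),
        (4, "Helen Walker"),
    ])
    conn.executemany("INSERT INTO PARTY_LINK_TYPES VALUES (?, ?)", [
        (1, "WORKS_IN"), (2, "EMPLOYED_BY"), (3, "MARRIED_TO"),
    ])
    conn.executemany("INSERT INTO PARTY_LINKS VALUES (?, ?, ?, ?, ?)", [
        (1, 2, 1, "2008-01-01", None),
        (1, 3, 2, "2008-01-01", None),
        (1, 4, 3, "2008-01-01", None),
    ])
    return conn

def links(conn, link_type):
    """Return (party, linked party) name pairs for one link type."""
    return conn.execute(
        """
        SELECT a.PARTY_NAME, b.PARTY_NAME
        FROM PARTY_LINKS l
        JOIN PARTY_LINK_TYPES t ON t.PARTY_LINK_TYPE_DWK = l.PARTY_LINK_TYPE_DWK
        JOIN PARTIES a ON a.PARTY_DWK = l.PARTY_DWK
        JOIN PARTIES b ON b.PARTY_DWK = l.LINKED_PARTY_DWK
        WHERE t.PARTY_LINK_TYPE = ?
        ORDER BY a.PARTY_NAME, b.PARTY_NAME
        """,
        (link_type,),
    ).fetchall()
```

The role of "employee" or "spouse" exists in this sketch only as a row in `PARTY_LINKS`, so a new kind of relationship needs a new link type row and no new table.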
  25. Segments of Parties
      • Grouping Parties together because at some point in time they shared characteristics
      • This is known as a Peer Group Relationship
      • Examples
          – Married people with two or more children
          – IT companies with fewer than 100 employees
      • Usually generated by analysis and the results stored
      • Most commonly seen in market segmentation type applications
  26. PARTY_SEGMENTS Data Model
      PARTIES
          • PARTY_DWK
          • PARTY_ID
          • PARTY_NAME
          • PARTY_START_DATE
          • PARTY_END_DATE
          • PARTY_TYPE_DWK
      PARTY_SEGMENTS
          • PARTY_DWK
          • PARTY_SEGMENT_TYPE_DWK
          • PARTY_START_DATE
          • PARTY_END_DATE
      PARTY_SEGMENT_TYPES
          • PARTY_SEGMENT_TYPE_DWK
          • PARTY_SEGMENT_TYPE
          • PARTY_SEGMENT_DESC
          • PARTY_SEGMENT_GROUP
          • PARTY_SEGMENT_START_DATE
          • PARTY_SEGMENT_END_DATE
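A sketch of "generated by analysis and the results stored", using the slide's example of married people with two or more children. To keep it short, the property table here carries the property name in a text column called `PROPERTY` instead of a type key. That simplification, the sample rows and the run date are assumptions.

```python
import sqlite3

def build_segments_demo():
    """Create simplified property and segment tables with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Simplified: PROPERTY holds the property name directly (assumption).
        CREATE TABLE PARTY_PROPERTIES (
            PARTY_DWK            INTEGER,
            PROPERTY             TEXT,
            PARTY_START_DATE     TEXT,
            PARTY_END_DATE       TEXT,
            PARTY_PROPERTY_VALUE TEXT
        );
        CREATE TABLE PARTY_SEGMENTS (
            PARTY_DWK              INTEGER,
            PARTY_SEGMENT_TYPE_DWK INTEGER,
            PARTY_START_DATE       TEXT,
            PARTY_END_DATE         TEXT
        );
    """)
    conn.executemany("INSERT INTO PARTY_PROPERTIES VALUES (?, ?, ?, ?, ?)", [
        (1, "Marital Status", "2001-06-01", None, "Married"),
        (1, "Number of Children", "2006-01-01", None, "2"),
        (2, "Marital Status", "2003-01-01", None, "Married"),
        (2, "Number of Children", "2007-01-01", None, "1"),
        (3, "Marital Status", "1995-01-01", None, "Single"),
        (3, "Number of Children", "2004-01-01", None, "3"),
    ])
    return conn

def store_segment(conn, segment_type_dwk, run_date):
    """Find currently married parties with 2+ children and store the result."""
    conn.execute(
        """
        INSERT INTO PARTY_SEGMENTS
            (PARTY_DWK, PARTY_SEGMENT_TYPE_DWK, PARTY_START_DATE, PARTY_END_DATE)
        SELECT m.PARTY_DWK, ?, ?, NULL
        FROM PARTY_PROPERTIES m
        JOIN PARTY_PROPERTIES c ON c.PARTY_DWK = m.PARTY_DWK
        WHERE m.PROPERTY = 'Marital Status'
          AND m.PARTY_PROPERTY_VALUE = 'Married'
          AND m.PARTY_END_DATE IS NULL
          AND c.PROPERTY = 'Number of Children'
          AND CAST(c.PARTY_PROPERTY_VALUE AS INTEGER) >= 2
          AND c.PARTY_END_DATE IS NULL
        """,
        (segment_type_dwk, run_date),
    )
    conn.commit()
```

Because the membership rows are stored with a start date, a later re-run of the analysis can close them and insert new ones without losing who was in the segment at the time.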
  27. Understanding The Conventions
      • All Type tables have the same format
          – Categorisation
      • All Property tables have the same format
          – Time Variant Attributes
      • All Link tables have the same format
          – Peer-To-Peer Relationships
      • All Segment tables have the same format
          – Peer Group Relationships
      • There are no other significant clusters of data about a single entity such as Party
  28. Introducing Major Entities
      • Party is a Major Entity
          – These are entities that exist regardless of the business process
          – It is the relationships between major entities that are defined by business processes
          – Major Entity attributes differ from one another
      • All Organisations only need a finite number of major entities including:
          – Campaign
          – Asset
          – Geography
          – Account
          – Channel
          – Product/Service
          – Electronic Address
          – Contract
          – Calendar
  29. Data Models For Other Major Entities
      • Geography
          – Geography Types
              • Postal Addresses, GPS Co-ordinates, ELR
          – Geographic Property Types
          – Geographic Properties
          – Geographic Link Types
          – Geographic Links
          – Geographic Segment Types
          – Geographic Segments
      • and so on for every major entity
  30. Major Entity Sub-Model
      [Diagram: MAJOR ENTITY at the centre, connected to]
      • MAJOR ENTITY TYPES
      • MAJOR ENTITY PROPERTIES and MAJOR ENTITY PROPERTY TYPES
      • MAJOR ENTITY LINKS and MAJOR ENTITY LINK TYPES
      • MAJOR ENTITY SEGMENTS and MAJOR ENTITY SEGMENT TYPES
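Because every major entity gets the same eight tables, the sub-model lends itself to generation. The sketch below builds CREATE TABLE statements from an entity name. The function names, the uniform `<ENTITY>_<KIND>_TYPE_*` column pattern, the data types and the table name `GEOGRAPHIES` are assumptions, and the slides' own column names are not perfectly uniform (for example `PARTY_SEGMENT_DESC`).

```python
import sqlite3

def type_table(prefix):
    """DDL for a _TYPES table; every type table gets the same six columns."""
    p = f"{prefix}_TYPE"
    return (
        f"CREATE TABLE {prefix}_TYPES ("
        f"{p}_DWK INTEGER PRIMARY KEY, {p} TEXT, {p}_DESC TEXT, "
        f"{p}_GROUP TEXT, {p}_START_DATE TEXT, {p}_END_DATE TEXT)"
    )

def sub_model_ddl(entity, entity_table):
    """Return the eight CREATE TABLE statements for one major entity."""
    e = entity
    ddl = [
        type_table(e),
        f"CREATE TABLE {entity_table} ("
        f"{e}_DWK INTEGER PRIMARY KEY, {e}_ID TEXT, {e}_NAME TEXT, "
        f"{e}_START_DATE TEXT, {e}_END_DATE TEXT, "
        f"{e}_TYPE_DWK INTEGER REFERENCES {e}_TYPES ({e}_TYPE_DWK))",
    ]
    # (kind, plural table suffix, extra columns specific to that kind)
    children = (
        ("PROPERTY", "PROPERTIES", [f"{e}_PROPERTY_VALUE TEXT"]),
        ("LINK", "LINKS",
         [f"LINKED_{e}_DWK INTEGER REFERENCES {entity_table} ({e}_DWK)"]),
        ("SEGMENT", "SEGMENTS", []),
    )
    for kind, plural, extra in children:
        ddl.append(type_table(f"{e}_{kind}"))
        cols = [
            f"{e}_DWK INTEGER REFERENCES {entity_table} ({e}_DWK)",
            f"{e}_{kind}_TYPE_DWK INTEGER "
            f"REFERENCES {e}_{kind}_TYPES ({e}_{kind}_TYPE_DWK)",
            f"{e}_START_DATE TEXT",
            f"{e}_END_DATE TEXT",
        ] + extra
        ddl.append(f"CREATE TABLE {e}_{plural} ({', '.join(cols)})")
    return ddl

conn = sqlite3.connect(":memory:")
for statement in sub_model_ddl("PARTY", "PARTIES") + sub_model_ddl("GEOGRAPHY", "GEOGRAPHIES"):
    conn.execute(statement)
```

The same generator idea could be extended to load and extract routines, since the table shapes are known from the entity name alone.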
  31. Relationships Between Major Entities
      • Storing names with multiple addresses and multiple electronic addresses (e-mail, telephone numbers, etc.)
          – Billing, Contact, Home, Work, etc.
      • Usage
          – Party -> Contract -> Account -> Electronic Address -> A Number -> Usage
          – Party -> Contract -> Account -> Electronic Address -> B Number -> Usage
          – Product/Service -> Tariff -> Usage
  32. Party -> (Electronic) Addresses
      [Diagram entities:]
      • PARTY
      • ADDRESS
      • ELECTRONIC ADDRESS
      • PARTY_ADDRESS_HISTORY
      • PARTY_ADDRESS_HISTORY_TYPES
      • PARTY_ELECTRONIC_ADDRESS_HISTORY
      • PARTY_ELECTRONIC_ADDRESS_HISTORY_TYPES
  33. Party -> Usage (Simplified)
      [Diagram entities:]
      • PARTY
      • CONTRACT
      • ACCOUNT
      • PRODUCT SERVICE
      • PRODUCT SERVICE TARIFF HISTORY & TYPE
      • ACCOUNT ELECTRONIC ADDRESS HISTORY & TYPE
      • TARIFF
      • ELECTRONIC ADDRESS
      • USAGE HISTORY (linked to ELECTRONIC ADDRESS as A Number and as B Number)
  34. Extending the Data Model
      • Identify as many Major Entities as possible
          – But remember there are only a finite number so don’t invent things for the sake of it
      • Define the standard sub-model around them
      • Put appropriate data in the sub-model
      • Create the relationships to _HISTORY tables for the transaction the business wants to analyse
  35. Does this help meet requirements?
      • Uses A Design Pattern
      • Convention Over Configuration
      • DRY (Don’t Repeat Yourself)
      • Static over a long period of time
      • The data model should store data at the lowest possible level
      • Supports the best use of platform specific features without compromising the design
      • Completely time-variant
      • Communication tool
  36. Some Key Elements
      • Self Similar modelling
          – All _TYPE tables have the same structure, etc.
          – Naming conventions are consistent everywhere
          – Easy to create standard algorithms for load and extraction
          – Easy to partition on type and/or date
      • Insert ‘heavy’ / Update ‘light’
          – Most ETL will result in an insert, there will be very few updates
      • Manages ‘Slowly Changing Dimensions’
          – Inherent in the Major Entity Sub-Model design
          – Significantly reduces overhead in the Data Mart build
      • Data Driven
          – Types provide extensible metadata
          – Prevents unnecessary updating of the data model itself
      • Natural Star Schemas
          – Histories will map to FACTS
          – Major Entity Collections will collapse into DIMENSIONS
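One way to picture a Major Entity collection collapsing into a dimension is to flatten a party, its type and its properties as at a chosen date into one row per party. The sketch below does that in Python. The function name, the text `PROPERTY` column (used in place of a property type key), the interval rule and the sample rows are all assumptions, and a real data mart build would more likely do this inside an ETL tool.

```python
import sqlite3

def build_dimension_demo():
    """Create a trimmed-down party sub-model with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PARTY_TYPES (PARTY_TYPE_DWK INTEGER PRIMARY KEY, PARTY_TYPE TEXT);
        CREATE TABLE PARTIES (
            PARTY_DWK INTEGER PRIMARY KEY, PARTY_NAME TEXT, PARTY_TYPE_DWK INTEGER
        );
        -- Simplified: PROPERTY holds the property name directly (assumption).
        CREATE TABLE PARTY_PROPERTIES (
            PARTY_DWK INTEGER, PROPERTY TEXT,
            PARTY_START_DATE TEXT, PARTY_END_DATE TEXT, PARTY_PROPERTY_VALUE TEXT
        );
    """)
    conn.execute("INSERT INTO PARTY_TYPES VALUES (1, 'Individual')")
    conn.execute("INSERT INTO PARTIES VALUES (1, 'Example Person', 1)")
    conn.executemany("INSERT INTO PARTY_PROPERTIES VALUES (?, ?, ?, ?, ?)", [
        (1, "Marital Status", "1990-01-01", "2001-06-01", "Single"),
        (1, "Marital Status", "2001-06-01", None, "Married"),
    ])
    return conn

def party_dimension(conn, as_at, property_types):
    """Flatten party, type and as-at property values into one dict per party."""
    dimension = []
    parties = conn.execute(
        "SELECT p.PARTY_DWK, p.PARTY_NAME, t.PARTY_TYPE FROM PARTIES p "
        "JOIN PARTY_TYPES t ON t.PARTY_TYPE_DWK = p.PARTY_TYPE_DWK "
        "ORDER BY p.PARTY_DWK"
    ).fetchall()
    for dwk, name, party_type in parties:
        row = {"PARTY_DWK": dwk, "PARTY_NAME": name, "PARTY_TYPE": party_type}
        for prop in property_types:
            hit = conn.execute(
                "SELECT PARTY_PROPERTY_VALUE FROM PARTY_PROPERTIES "
                "WHERE PARTY_DWK = ? AND PROPERTY = ? AND PARTY_START_DATE <= ? "
                "AND (PARTY_END_DATE IS NULL OR PARTY_END_DATE > ?)",
                (dwk, prop, as_at, as_at),
            ).fetchone()
            row[prop] = hit[0] if hit else None  # one dimension column per property
        dimension.append(row)
    return dimension
```

Running the same function for two different dates yields two versions of the dimension row, which is roughly how the sub-model already carries the history a slowly changing dimension needs.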
  37. Is this all there is to it?
      • At a high level – YES
      • BUT:
          – There are methods for dealing with data quality
          – Special case methods for some lifetime attributes
              • e.g. Handling women changing their names at marriage
          – Insert/Update methods for performance
          – Design Patterns for implementation
          – Other detailed techniques
      • This talk could only ever be: “An introduction to Process Neutral Data Modelling”
  38. Further Reading
      • Available From http://www.datamgmt.com
      • White Papers
          – Overview Architecture for Enterprise Data Warehouses
              • March 2006 – 32 pages
          – Data Warehouse Documentation Roadmap
              • April 2007 – 28 pages
          – How Data Works
              • June 2007 – 32 pages
          – Data Warehouse Governance
              • April 2007 – 24 pages
          – Data Warehouse Project Management
              • October 2008 – 32 pages
          – Process Neutral Data Modelling
  39. Data Management & Warehousing
      Thank you!
      Website: http://www.datamgmt.com
      Phone: +44 7050 028 911
      E-mail: davidw@datamgmt.com
      Skype/MSN: datamgmt
