© 2014 Genesee Academy, LLC
Data Modeling • Data Vault Modeling • Big Data • Agile DW Ensemble Modeling Certification
CDVDM Recertification Event
Data Vault: Then & Now
USA +1 303 526 0340
Sweden 072 736 8700
Hans@GeneseeAcademy.com
www.GeneseeAcademy.com
CDVDM ReConnect 2014
gohansgo
CDVDM ReConnect Event
Then & Now Presentation Agenda
• Looking Back & Progress
• Colors and Reverse Engineering
• Business Oriented Modeling
• Effective Dates
• Architecture Revisited
• Link Unique Specific Natural
• Thinking Differently
• Modeling Address
• Sourcing the Data Vault
• The L:L:L constructs
• Automation
Mini-Topics for 5x5 Updates
• Ensemble Modeling
• Core Business Concepts
• The Business Key
• Unit of Work & Possessive
• Raw versus Business
• Link & Why it's not an Event
• Satellite & Why it's not MV
• Big Data & Unstructured
• Successful Agile DV DW
• Industry Reference Models
• Ensemble Forms
AGENDA ITEMS
Then and Now…
2007 * 2008 * 2009 * 2010 * 2011 * 2012 * 2013 * 2014
Genesee Academy Activities
[Chart: GA Activities. Categories: Seminars, Advising, Online, Conferences. Values shown: 38%, 29%, 17%, 14%.]
Genesee Academy, LLC
– World Class Training
• Seminars
– 1-4 day, on-location & in-company courses.
– Certifications issued by GA.
– Blended (hybrid) Pedagogy.
• Advising
– DWBI Programs, Modeling Patterns, Enterprise Architecture, Agility, etc.
– Reviews: Programs, Models, Architectures, etc.
• Online
– Classroom studio, online, on-demand video lessons.
– Multiple channels: DVA and TrainOvation.
• Conferences
– Speaking, Presenting, and sometimes coordinating industry conferences around the globe.
Unified Decomposition™
• With the EDW, we seek to break things out into component parts for
flexibility, adaptability, agility, and generally to facilitate the capture of
things that are either interpreted in different ways or changing
independently of each other.
• At the same time a core premise of data warehousing is integration and
moving to a common standard view of unified concepts. So we also
want to tie things together – to Unify.
Ensemble Modeling™
All the parts of a thing taken together, so that
each part is considered only in relation to the whole.
• The constellation of component parts acts as a whole – an Ensemble.
• With Ensemble Modeling the Core Business Concepts that we define and
model are represented as a whole – an ensemble – including all of the
component parts. An Ensemble is based on all things defining a Core
Business Concept that can be uniquely and specifically said for one
instance of that Concept.
The Data Vault Ensemble
• The Data Vault Ensemble conforms to a single key – embodied in the Hub
construct.
• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History
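The three component parts can be sketched as plain data structures. This is a minimal illustration, not the deck's own notation: the class fields, the `ensemble` helper, and the sample keys such as `CUST-1001` are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hub:
    """One Core Business Concept instance, identified by its business key."""
    business_key: str


@dataclass(frozen=True)
class Link:
    """A relationship between Hubs, stored as the business keys it connects."""
    hub_keys: tuple


@dataclass(frozen=True)
class Satellite:
    """Context for one parent key; history is kept by adding rows, not updating."""
    parent_key: str
    load_date: date
    attributes: tuple  # (name, value) pairs


def ensemble(hub, satellites):
    """Collect every Satellite row that hangs off the Hub's single key."""
    return [s for s in satellites if s.parent_key == hub.business_key]


# Hypothetical sample data.
customer = Hub("CUST-1001")
sats = [
    Satellite("CUST-1001", date(2014, 1, 1), (("name", "Acme"),)),
    Satellite("CUST-1001", date(2014, 6, 1), (("name", "Acme Corp"),)),
    Satellite("CUST-2002", date(2014, 3, 1), (("name", "Globex"),)),
]
customer_ensemble = ensemble(customer, sats)
print(len(customer_ensemble))  # 2 rows of history for CUST-1001
```

Everything in the ensemble is reached through the one Hub key, which is the sense in which the Ensemble "conforms to a single key".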
Data Vault means thinking differently
[Figure: Customer]
• The minimal construct then for an “entity” such as “Customer” is now a Hub with a set of Satellites
DV versus 3NF
[Figure: DV versus 3NF. Labels: Sat (repeated), EDW, History, Operational]
The Data Vault modeling approach
• As the scope of the EDW is expanded and new data sources added, the
Data Vault can adapt to these changes without impacting the existing
model. This is what allows the EDW to be built incrementally and to
adapt to change without the need for re-engineering.
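A small sketch of what "without impacting the existing model" can look like in practice, using Python's built-in `sqlite3`. The Hub names echo the slide labels (H_Cust, H_Sale, H_Store); the Link table names, column names, and the `schema` helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Existing model: two Hubs and the Link between them.
conn.executescript("""
CREATE TABLE h_cust (cust_key TEXT PRIMARY KEY);
CREATE TABLE h_sale (sale_key TEXT PRIMARY KEY);
CREATE TABLE l_cust_sale (
    cust_key TEXT REFERENCES h_cust(cust_key),
    sale_key TEXT REFERENCES h_sale(sale_key),
    PRIMARY KEY (cust_key, sale_key)
);
""")


def schema(connection):
    """Return {table_name: CREATE statement} for the current model."""
    rows = connection.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    )
    return dict(rows.fetchall())


before = schema(conn)

# New area absorbed: a Store Hub and a Link to Sale are added, nothing is altered.
conn.executescript("""
CREATE TABLE h_store (store_key TEXT PRIMARY KEY);
CREATE TABLE l_sale_store (
    sale_key TEXT REFERENCES h_sale(sale_key),
    store_key TEXT REFERENCES h_store(store_key),
    PRIMARY KEY (sale_key, store_key)
);
""")

after = schema(conn)
unchanged = all(after[name] == sql for name, sql in before.items())
added = sorted(set(after) - set(before))
print(unchanged, added)
```

The new area arrives purely as `CREATE TABLE` statements; the definitions of the existing tables are identical before and after.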
[Figure: New Area absorbed. Hubs shown: H_Cust, H_Sale, H_Empl, H_Store, H_Car]
Data Vault Modeling Process
• The Modeling Process for creating a Data Vault model includes
three primary steps:
1) Identify and Model your Core Business Concepts
• Business interviews are at the heart of this step:
What do you do? What are the main things you work with?
• Also find the best/target Natural Business Key
2) Identify and Model your Natural Business Relationships
• Specific Unique Relationships
• Be considerate of the Unit of Work and Grain
3) Analyze and Design your Context Satellites
• Consider Rate of Change, Type of Data, and also the Sources of your data during the design process
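Step 3 can be illustrated with a rough grouping heuristic: put attributes that share a source and a rate of change into the same Satellite. The slide only says to consider these factors, so the rule below, the attribute inventory, and the `s_<hub>_<source>_<rate>` naming are assumptions for the sake of the example.

```python
from collections import defaultdict

# Hypothetical attribute inventory for a Customer Hub:
# (attribute, source system, rate of change)
attributes = [
    ("name", "crm", "slow"),
    ("birth_date", "crm", "slow"),
    ("loyalty_points", "crm", "fast"),
    ("credit_limit", "billing", "slow"),
    ("balance", "billing", "fast"),
]


def design_satellites(hub_name, attrs):
    """Group attributes so each Satellite holds one source and one rate of change."""
    groups = defaultdict(list)
    for name, source, rate in attrs:
        groups[f"s_{hub_name}_{source}_{rate}"].append(name)
    return dict(groups)


satellites = design_satellites("cust", attributes)
for sat_name, columns in satellites.items():
    print(sat_name, columns)
```

Splitting this way keeps fast-changing attributes from generating new history rows for slow-changing ones. Type of Data could be added as a third grouping key in the same manner.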
Anatomy of a Hub
Anatomy of a Link
Anatomy of a Satellite
Sales DV Model - Backbone
Sample Model
Sample: Sales Data Vault Model
Identifying the Core Business Concepts
Business Key?
• The Business Key that forms the basis of the Hub should be:
– Enterprise Wide Unique
– Central Business View Aligned
This means that:
– It is not a “Technical Key” but rather a “Business Key”
– It is not the source system primary key (id)
– It is not driven by any one source system
– It should be aligned with central business initiatives
In a data warehouse this means:
– Will have clashes
– Will have duplicates
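One way to read the points above: source-system ids collide across systems, while the same business key shows up in more than one source and must land in the Hub only once. The sketch below uses made-up extracts; the field names (`id`, `customer_no`), the source names, and the `load_hub` helper are assumptions.

```python
# Hypothetical extracts: each source has its own primary key (id),
# but both carry the enterprise-wide customer number.
crm_rows = [
    {"id": 1, "customer_no": "C-100", "name": "Acme"},
    {"id": 2, "customer_no": "C-200", "name": "Globex"},
]
billing_rows = [
    {"id": 1, "customer_no": "C-200", "name": "Globex Inc"},
    {"id": 2, "customer_no": "C-300", "name": "Initech"},
]


def load_hub(hub, rows, source):
    """Insert each business key once; remember the first source that supplied it."""
    for row in rows:
        hub.setdefault(row["customer_no"], source)
    return hub


h_cust = {}
load_hub(h_cust, crm_rows, "crm")
load_hub(h_cust, billing_rows, "billing")

# Source ids clash: the same id refers to different customers in each source.
clashing_ids = {r["id"] for r in crm_rows} & {r["id"] for r in billing_rows}

print(sorted(h_cust))        # distinct business keys in the Hub
print(sorted(clashing_ids))  # source ids that mean different things per source
```

Keying the Hub on `customer_no` integrates the duplicate `C-200`, which keying on the source `id` could not do.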
Starting with Stars
• Begins to get complicated…
• Reaches a level of complexity and lack of agility…
[Figure: Star 1, Star 2, … Star n. Labels: Accounting, Finance, Logistics, Sales]
Adapting & Expanding the EDW
• With Data Vault, scale easily – without re-engineering!
• Easily adapts to changes…
[Figure: Star 1, Star 2, … Star n. Labels: EDW, DV EDW, Accounting, Finance, Logistics, Sales]
Fundamental Architecture
[Figure: Enterprise DWBI Solution architecture. Labels: Staging; EDW; Raw; BDW; Integrate, Align, Reconcile; Data Mart (Star Schema); Other Marts & Error Marts; Common Business Rules; Mart Specific Rules. Process steps shown: Extract, Load, D/T Stamp, Integrate, Validate, Profile, Cleanse, Convert, Calculate, Transform.]
Identifying relationships that are really Ensembles
• Rules and Guidelines
• Does the Link have its own Business Key?
• Does the Link represent its own Core Business Concept?
• Are there several Satellites on the Link?
• Are there many attributes to describe the Link?
• Are there relationships (Link to Link) with this Link?
If YES to any of these questions, then the Link is likely a Hub.
When a Link becomes a Hub
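The guideline questions can be written down as a simple checklist function. The `LinkProfile` fields and the thresholds for "several" Satellites and "many" attributes are assumptions; the slide gives no numbers.

```python
from dataclasses import dataclass


@dataclass
class LinkProfile:
    """Hypothetical description of a Link, used only for this checklist."""
    name: str
    has_own_business_key: bool = False
    is_core_business_concept: bool = False
    satellite_count: int = 0
    attribute_count: int = 0
    linked_to_other_links: bool = False


def hub_indicators(link, many_satellites=2, many_attributes=10):
    """Return the checklist questions that are answered YES for this Link."""
    checks = {
        "own business key": link.has_own_business_key,
        "own core business concept": link.is_core_business_concept,
        "several satellites": link.satellite_count >= many_satellites,
        "many attributes": link.attribute_count >= many_attributes,
        "link-to-link relationships": link.linked_to_other_links,
    }
    return [question for question, yes in checks.items() if yes]


# Hypothetical Links.
plain = LinkProfile("l_cust_product", satellite_count=1, attribute_count=3)
sale = LinkProfile("l_cust_store_sale", has_own_business_key=True, satellite_count=3)

for link in (plain, sale):
    reasons = hub_indicators(link)
    verdict = "likely a Hub" if reasons else "stays a Link"
    print(link.name, "->", verdict, reasons)
```

Returning the reasons rather than a bare yes/no keeps the result reviewable, since the slide frames these as guidelines and not hard rules.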
Applying the Data Vault Ensemble
• Mixing “color types of data” is not Data Vaulting but rather unvaulting
* A blended pattern has different dynamics
Thinking Differently
• Stay with the Ensemble Modeling Pattern. Continue practicing Unified Decomposition. Continue Vaulting. Be aware when you change patterns.
[Figure: Option 1, Option 2, Option 3]
Sourcing the Data Vault EDW
• Sourcing Data Vault requires more joins (Hub to Sats, 2 sides of Links)
• Sourcing Data Vault can be more efficient than sourcing other forms
• Primary path to efficient sourcing is thinking differently…
1. ETL team needs to understand the DV model to be efficient
2. Automation and templates for repeatable patterns make this easier
3. Pulling context from a subset of Satellites eases this join impact
4. Hubs and Links are thin and short tables with no redundancy (fast)
5. Data Marts should not be based on creating another copy of DW
6. Data Mart design should be agile, purpose-built, and business driven
7. Data Marts should pass the virtualization test
8. Tune with PITs, Bridges, other Mart Stage views (& materialized)
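Points 3 and 7 can be sketched with `sqlite3`: a mart-stage view joins the Hub to only the Satellite it needs and picks the latest row per key. Table, column, and view names are hypothetical, and reading the "virtualization test" as "the mart could be delivered as a view" is an interpretation, not something the slide spells out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE h_cust (cust_key TEXT PRIMARY KEY);
CREATE TABLE s_cust_name (
    cust_key TEXT, load_date TEXT, name TEXT,
    PRIMARY KEY (cust_key, load_date)
);
CREATE TABLE s_cust_address (
    cust_key TEXT, load_date TEXT, city TEXT,
    PRIMARY KEY (cust_key, load_date)
);
INSERT INTO h_cust VALUES ('C-100'), ('C-200');
INSERT INTO s_cust_name VALUES
    ('C-100', '2014-01-01', 'Acme'),
    ('C-100', '2014-06-01', 'Acme Corp'),
    ('C-200', '2014-03-01', 'Globex');
INSERT INTO s_cust_address VALUES
    ('C-100', '2014-01-01', 'Denver');

-- Mart-stage view: only the Satellite this mart needs, latest row per key.
CREATE VIEW v_dim_customer AS
SELECT h.cust_key, s.name
FROM h_cust AS h
JOIN s_cust_name AS s ON s.cust_key = h.cust_key
WHERE s.load_date = (
    SELECT MAX(load_date) FROM s_cust_name WHERE cust_key = h.cust_key
);
""")

rows = conn.execute(
    "SELECT cust_key, name FROM v_dim_customer ORDER BY cust_key"
).fetchall()
print(rows)
```

The address Satellite is never touched, so the join cost scales with the context the mart actually uses. If the view proves too slow, the same query could be materialized or backed by a PIT table.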
Link:Link:Link
• What does an L:L:L mean?
• Can a relationship have relationships to other relationships?
Whenever you see a Link:Link you should take a moment to find
the Hub you are missing. It is either already there or not yet modeled.
• Automation:
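The Link:Link check lends itself to a simple automated scan over model metadata. This is a sketch under assumed conventions: the `links` dictionary format and all table names are hypothetical, and the scan only flags candidates for review.

```python
# Hypothetical model metadata: each Link lists the constructs it connects.
links = {
    "l_cust_sale": ["h_cust", "h_sale"],
    "l_sale_store": ["h_sale", "h_store"],
    "l_sale_return": ["l_cust_sale", "h_store"],  # points at another Link
}


def find_link_to_link(link_defs):
    """Return {link: [referenced links]} for every Link that references a Link."""
    findings = {}
    for name, parents in link_defs.items():
        linked = [p for p in parents if p in link_defs]
        if linked:
            findings[name] = linked
    return findings


suspects = find_link_to_link(links)
for link, parents in suspects.items():
    print(f"{link} references {parents}: look for the missing Hub")
```

Each flagged Link is a prompt to ask whether the referenced relationship is really a Core Business Concept that deserves its own Hub.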
Benefits of Data Vault Modeling
Agility • Auditability • History • Scalability • Simplicity • Loadability
Responds Faster & Costs Less
Applying Data Vault
• Financial Institutions
• Telecommunications
• Retail
• Manufacturing
• Technology
• Energy & Utility
• Healthcare
• Consultancy
• Transportation
• Government
• Gaming
• Etc.
Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com
Hans@GeneseeAcademy.com
gohansgo
Book: DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
Online video-lesson training
DataVaultAcademy.com
DataVaultAcademy

2014 Data Vault ReConnect Event Then & Now DDVM
