2014 Data Vault ReConnect Event Then & Now DDVM1. © 2014 Genesee Academy, LLC
Data Modeling Data Vault Modeling Big Data Agile DW Ensemble Modeling Certification
CDVDM Recertification Event
Data Vault:
Then & Now
USA +1 303 526 0340
Sweden 072 736 8700
Hans@GeneseeAcademy.com
www.GeneseeAcademy.com
CDVDM ReConnect 2014
gohansgo
Then & Now Presentation Agenda
• Looking Back & Progress
• Colors and Reverse Engineering
• Business Oriented Modeling
• Effective Dates
• Architecture Revisited
• Link Unique Specific Natural
• Thinking Differently
• Modeling Address
• Sourcing the Data Vault
• The L:L:L constructs
• Automation
Mini-Topics for 5x5 Updates
• Ensemble Modeling
• Core Business Concepts
• The Business Key
• Unit of Work & Possessive
• Raw versus Business
• Link & Why it's not an Event
• Satellite & Why it's not MV
• Big Data & Unstructured
• Successful Agile DV DW
• Industry Reference Models
• Ensemble Forms
AGENDA ITEMS
Then and Now…
2007 * 2008 * 2009 * 2010 * 2011 * 2012 * 2013 * 2014
Genesee Academy Activities
Seminars
Advising
Online
Conferences
[Chart: GA Activities. Categories: Seminars, Advising, Online, Conferences. Values: 38%, 29%, 17%, 14%.]
Genesee Academy, LLC
– World Class Training
• Seminars
– 1-4 day, on-location & in-company courses.
– Certifications issued by GA.
– Blended (hybrid) Pedagogy.
• Advising
– DWBI Programs, Modeling Patterns, Enterprise
Architecture, Agility, etc.
– Reviews: Programs, Models, Architectures, etc.
• Online
– Classroom studio, online, on-demand video lessons.
– Multiple channels: DVA and TrainOvation.
• Conferences
– Speaking, Presenting, and sometimes coordinating
industry conferences around the globe.
Unified Decomposition™
• With the EDW, we seek to break things out into component parts for
flexibility, adaptability, agility, and generally to facilitate the capture of
things that are either interpreted in different ways or changing
independently of each other.
• At the same time a core premise of data warehousing is integration and
moving to a common standard view of unified concepts. So we also
want to tie things together – to Unify.
Ensemble Modeling™
All the parts of a thing taken together, so that
each part is considered only in relation to the whole.
• The constellation of component parts acts as a whole – an Ensemble.
• With Ensemble Modeling the Core Business Concepts that we define and
model are represented as a whole – an ensemble – including all of the
component parts. An Ensemble is based on all things defining a Core
Business Concept that can be uniquely and specifically said for one
instance of that Concept.
The Data Vault Ensemble
• The Data Vault Ensemble conforms to a single key – embodied in the Hub
construct.
• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History
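The three constructs can be sketched as tables. This is a minimal illustration using SQLite from Python, assuming a hypothetical Customer and Sale example. All table and column names (hub_customer, customer_bk, load_dts, record_source, and so on) are assumptions made for the sketch and do not come from the presentation.

```python
import sqlite3

def build_ensemble_schema(conn):
    """Create two Hubs, one Link and one Satellite (all names are hypothetical)."""
    conn.executescript("""
        -- Hub: one row per natural business key
        CREATE TABLE hub_customer (
            customer_hk   INTEGER PRIMARY KEY,
            customer_bk   TEXT NOT NULL UNIQUE,
            load_dts      TEXT NOT NULL,
            record_source TEXT NOT NULL
        );
        CREATE TABLE hub_sale (
            sale_hk       INTEGER PRIMARY KEY,
            sale_bk       TEXT NOT NULL UNIQUE,
            load_dts      TEXT NOT NULL,
            record_source TEXT NOT NULL
        );
        -- Link: one row per unique relationship between Hub keys
        CREATE TABLE link_customer_sale (
            customer_sale_hk INTEGER PRIMARY KEY,
            customer_hk   INTEGER NOT NULL REFERENCES hub_customer(customer_hk),
            sale_hk       INTEGER NOT NULL REFERENCES hub_sale(sale_hk),
            load_dts      TEXT NOT NULL,
            record_source TEXT NOT NULL,
            UNIQUE (customer_hk, sale_hk)
        );
        -- Satellite: context and history, keyed by Hub key plus load timestamp
        CREATE TABLE sat_customer_details (
            customer_hk   INTEGER NOT NULL REFERENCES hub_customer(customer_hk),
            load_dts      TEXT NOT NULL,
            name          TEXT,
            city          TEXT,
            record_source TEXT NOT NULL,
            PRIMARY KEY (customer_hk, load_dts)
        );
    """)

conn = sqlite3.connect(":memory:")
build_ensemble_schema(conn)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

Only the Satellite includes the load timestamp in its key, so it is the only table here that accumulates history for a given business key.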
Data Vault means thinking differently
Customer
Customer
• The minimal construct then for an “entity”
such as “Customer” is now a
Hub with a set of Satellites
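A small sketch of what "a Hub with a set of Satellites" can look like in practice, assuming SQLite and made-up sample data. The table names, columns, and the helper current_customer are hypothetical. The query reassembles the latest view of one Customer from its Hub and two Satellites while the Satellites keep the history.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (customer_hk INTEGER PRIMARY KEY, customer_bk TEXT UNIQUE);
    CREATE TABLE sat_customer_name (customer_hk INTEGER, load_dts TEXT, name TEXT,
                                    PRIMARY KEY (customer_hk, load_dts));
    CREATE TABLE sat_customer_address (customer_hk INTEGER, load_dts TEXT, city TEXT,
                                       PRIMARY KEY (customer_hk, load_dts));
    -- Made-up sample rows: the address changes once, the name never does.
    INSERT INTO hub_customer VALUES (1, 'CUST-001');
    INSERT INTO sat_customer_name VALUES (1, '2014-01-01', 'Customer A');
    INSERT INTO sat_customer_address VALUES (1, '2014-01-01', 'Denver');
    INSERT INTO sat_customer_address VALUES (1, '2014-06-01', 'Stockholm');
""")

def current_customer(conn, customer_bk):
    """Return the latest (business key, name, city) for one customer."""
    return conn.execute("""
        SELECT h.customer_bk, n.name, a.city
        FROM hub_customer h
        LEFT JOIN sat_customer_name n
               ON n.customer_hk = h.customer_hk
              AND n.load_dts = (SELECT MAX(load_dts) FROM sat_customer_name
                                WHERE customer_hk = h.customer_hk)
        LEFT JOIN sat_customer_address a
               ON a.customer_hk = h.customer_hk
              AND a.load_dts = (SELECT MAX(load_dts) FROM sat_customer_address
                                WHERE customer_hk = h.customer_hk)
        WHERE h.customer_bk = ?
    """, (customer_bk,)).fetchone()

print(current_customer(conn, "CUST-001"))
```

The older address row stays in sat_customer_address, so the history is still available to any query that does not filter on the latest load timestamp.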
DV versus 3NF
[Diagram labels: Operational, History, EDW; multiple Satellites (Sat)]
The Data Vault modeling approach
• As the scope of the EDW is expanded and new data sources added, the
Data Vault can adapt to these changes without impacting the existing
model. This is what allows the EDW to be built incrementally and to
adapt to change without the need for re-engineering.
New Area absorbed
H_Cust
H_Sale
H_Empl
H_Store
H_Car
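A minimal sketch of absorbing a new area, assuming SQLite. The Hub names follow the labels above (H_Cust, H_Sale, H_Store); the Link names and all columns are hypothetical. The check at the end confirms that adding the new Hub and Link leaves the definitions of the existing tables untouched.

```python
import sqlite3

def table_definitions(conn):
    """Return {table_name: DDL text} as stored by SQLite."""
    return dict(conn.execute("SELECT name, sql FROM sqlite_master WHERE type = 'table'"))

conn = sqlite3.connect(":memory:")

# Existing model (hypothetical columns): customers and sales are already in place.
conn.executescript("""
    CREATE TABLE h_cust (cust_hk INTEGER PRIMARY KEY, cust_bk TEXT UNIQUE);
    CREATE TABLE h_sale (sale_hk INTEGER PRIMARY KEY, sale_bk TEXT UNIQUE);
    CREATE TABLE l_cust_sale (cust_hk INTEGER, sale_hk INTEGER, UNIQUE (cust_hk, sale_hk));
""")
before = table_definitions(conn)

# New area absorbed: a store Hub plus a Link to the existing sale Hub.
conn.executescript("""
    CREATE TABLE h_store (store_hk INTEGER PRIMARY KEY, store_bk TEXT UNIQUE);
    CREATE TABLE l_sale_store (sale_hk INTEGER, store_hk INTEGER, UNIQUE (sale_hk, store_hk));
""")
after = table_definitions(conn)

# Every pre-existing table keeps exactly the same definition.
unchanged = all(after[name] == ddl for name, ddl in before.items())
added = sorted(set(after) - set(before))
print(unchanged, added)
```

The extension consists only of CREATE statements; no ALTER or reload of existing tables is involved in this sketch.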
Data Vault Modeling Process
• The Modeling Process for creating a Data Vault model includes
three primary steps:
1) Identify and Model your Core Business Concepts
• Business interviews are at the heart of this step
What do you do? What are the main things you work with?
• Also find best/target Natural Business Key
2) Identify and Model your Natural Business Relationships
• Specific Unique Relationships
• Be considerate of the Unit of Work and Grain
3) Analyze and Design your Context Satellites
• Consider Rate of Change, Type of Data, and also the Sources of your data during the design process
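One possible way to start step 3 is to group candidate attributes by source and rate of change. The sketch below is only a heuristic under assumed inputs: the function propose_satellites, the attribute inventory, and the naming scheme are all hypothetical, and a real design would also weigh the type of data.

```python
from collections import defaultdict

def propose_satellites(hub_name, attributes):
    """Group attributes into candidate Satellites by (source, rate of change)."""
    groups = defaultdict(list)
    for attr in attributes:
        groups[(attr["source"], attr["rate_of_change"])].append(attr["name"])
    return {
        f"sat_{hub_name}_{source}_{rate}": sorted(names)
        for (source, rate), names in groups.items()
    }

# Made-up attribute inventory for a Customer Hub.
customer_attributes = [
    {"name": "name",           "source": "crm",     "rate_of_change": "slow"},
    {"name": "birth_date",     "source": "crm",     "rate_of_change": "slow"},
    {"name": "loyalty_points", "source": "crm",     "rate_of_change": "fast"},
    {"name": "credit_limit",   "source": "billing", "rate_of_change": "slow"},
]

satellites = propose_satellites("customer", customer_attributes)
for sat_name, columns in satellites.items():
    print(sat_name, columns)
```

With this inventory the sketch proposes three Satellites, which keeps the fast-changing loyalty_points from generating new history rows for the slow-changing attributes.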
Sales DV Model - Backbone
Sample Model
Identifying the Core Business Concepts
Business Key?
• The Business Key that forms the basis of the Hub should be:
– Enterprise Wide Unique
– Central Business View Aligned
This means that:
– It is not a “Technical Key” but rather a “Business Key”
– It is not the source system primary key (id)
– It is not driven by any one source system
– Should be aligned with central business initiatives
In a data warehouse this means:
– Will have clashes
– Will have duplicates
Starting with Stars
• Begins to get complicated…
[Diagram labels: Star 1 through Star n…; Accounting, Finance, Logistics, Sales. Caption: Reach complexity and lack of agility level…]
Adapting & Expanding the EDW
• With Data Vault, scale easily – without re-engineering!
[Diagram labels: Star 1 through Star n…; EDW; DV EDW; Accounting, Finance, Logistics, Sales. Caption: Easily adapts to changes…]
Fundamental Architecture
[Diagram: Enterprise DWBI Solution. Layer labels: Staging; EDW; Raw; BDW; Data Mart (Star Schema); Other Marts & Error Marts. Step labels: Extract, Load, D/TStamp, Integrate; Extract, Validate, Profile, Cleanse, Convert, Calculate, Transform, Load. Starred labels: Integrate, Align, Reconcile. Rule labels: Common Business Rules; Mart Specific Rules.]
Identifying relationships that are really Ensembles
• Rules and Guidelines
• Does the Link have its own Business Key?
• Does the Link represent its own Core Business Concept?
• Are there several Satellites on the Link?
• Are there many attributes to describe the Link?
• Are there relationships (Link to Link) with this Link?
IF YES to any of these questions, then the Link is likely a Hub.
When a Link becomes a Hub
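The checklist above can be written down as a small review helper. This is a sketch only: the class LinkReview, its field names, and the function likely_a_hub are hypothetical, and the result is a prompt for a modeling discussion rather than a decision.

```python
from dataclasses import dataclass

@dataclass
class LinkReview:
    """Answers to the five review questions for one Link (field names are hypothetical)."""
    has_own_business_key: bool = False
    is_core_business_concept: bool = False
    has_several_satellites: bool = False
    has_many_attributes: bool = False
    has_link_to_link: bool = False

def likely_a_hub(review):
    """Flag the Link if ANY of the five questions is answered yes."""
    return any(vars(review).values())

# Example: a Link that carries its own business key (made-up case).
invoice_line = LinkReview(has_own_business_key=True)
plain_link = LinkReview()
print(likely_a_hub(invoice_line), likely_a_hub(plain_link))
```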
Applying the Data Vault Ensemble
• Mixing “color types of data” is not Data Vaulting but
rather unvaulting
* A blended pattern has different dynamics
Thinking Differently
• Stay with the Ensemble Modeling Pattern. Continue practicing Unified
Decomposition. Continue Vaulting. Be aware when you change patterns.
Option 1 Option 2 Option 3
Sourcing the Data Vault EDW
• Sourcing Data Vault requires more joins (Hub to Sats, 2 sides of Links)
• Sourcing Data Vault can be more efficient than sourcing other forms
• Primary path to efficient sourcing is thinking differently…
1. ETL team needs to understand the DV model to be efficient
2. Automation and templates for repeatable patterns make this easier
3. Pulling context from a subset of Satellites eases this join impact
4. Hubs and Links are thin and short tables with no redundancy (fast)
5. Data Marts should not be based on creating another copy of DW
6. Data Mart design should be agile, purpose-built, and business driven
7. Data Marts should pass the virtualization test
8. Tune with PITs, Bridges, and other Mart Stage views (also materialized)
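Points 3, 5 and 7 can be illustrated with a mart dimension defined as a view. This is a sketch assuming SQLite and made-up data; the tables, columns, and the view name dim_customer are hypothetical. It is not a PIT or Bridge table, only a plain view that reads the Hub and the one Satellite the mart needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (customer_hk INTEGER PRIMARY KEY, customer_bk TEXT UNIQUE);
    CREATE TABLE sat_customer_profile (customer_hk INTEGER, load_dts TEXT, segment TEXT,
                                       PRIMARY KEY (customer_hk, load_dts));
    CREATE TABLE sat_customer_web (customer_hk INTEGER, load_dts TEXT, last_page TEXT,
                                   PRIMARY KEY (customer_hk, load_dts));

    INSERT INTO hub_customer VALUES (1, 'CUST-001'), (2, 'CUST-002');
    INSERT INTO sat_customer_profile VALUES
        (1, '2014-01-01', 'Retail'),
        (1, '2014-05-01', 'Premium'),
        (2, '2014-02-01', 'Retail');
    INSERT INTO sat_customer_web VALUES (1, '2014-05-02', '/home');

    -- Virtual dimension: a view, not another copy of the warehouse.
    -- It joins the Hub to the profile Satellite only and ignores sat_customer_web.
    CREATE VIEW dim_customer AS
    SELECT h.customer_bk, s.segment, s.load_dts AS valid_from
    FROM hub_customer h
    JOIN sat_customer_profile s ON s.customer_hk = h.customer_hk
    WHERE s.load_dts = (SELECT MAX(load_dts) FROM sat_customer_profile
                        WHERE customer_hk = h.customer_hk);
""")

rows = conn.execute("SELECT * FROM dim_customer ORDER BY customer_bk").fetchall()
print(rows)
```

If a view like this performs well enough, the mart passes the virtualization test; if not, the same definition can be materialized as a tuning step.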
Link:Link:Link
• What does a L:L:L mean?
• Can a relationship have relationships to other relationships?
Whenever you see a Link:Link you should take a moment to find
the Hub you are missing. Either there or not yet modeled.
• Automation:
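The Link:Link check described above lends itself to a simple scan over model metadata. The sketch below assumes a hypothetical metadata format (a dict of table names with a type and a list of references) and a hypothetical function find_link_to_link; it only reports candidates where a missing Hub may be worth looking for.

```python
def find_link_to_link(model):
    """Return (link, referenced_link) pairs where a Link references another Link.

    `model` maps each table name to {"type": "hub" | "link", "references": [...]}.
    """
    findings = []
    for name, table in model.items():
        if table["type"] != "link":
            continue
        for ref in table.get("references", []):
            if model.get(ref, {}).get("type") == "link":
                findings.append((name, ref))
    return findings

# Made-up model: l_sale_return hangs off another Link instead of a Hub.
model = {
    "h_cust": {"type": "hub"},
    "h_sale": {"type": "hub"},
    "l_cust_sale": {"type": "link", "references": ["h_cust", "h_sale"]},
    "l_sale_return": {"type": "link", "references": ["l_cust_sale"]},
}

findings = find_link_to_link(model)
print(findings)
```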
Benefits of Data Vault Modeling
Agility Auditability History Scalability Simplicity Loadability
Responds Faster & Costs Less
• Financial Institutions
• Telecommunications
• Retail
• Manufacturing
• Technology
• Energy & Utility
• HealthCare
• Consultancy
• Transportation
• Government
• Gaming
• Etc.
Applying Data Vault
Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com
Hans@GeneseeAcademy.com
gohansgo
Book DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
Online video-lesson training
DataVaultAcademy.com
DataVaultAcademy