Data Vault Introduction

Data Vault Basics, Benefits and Challenges, presented by Erik Fransen (Centennium BI expertisehuis)

Published in: Data & Analytics

Data Vault Introduction

  1. Data Vault Fundamentals & Best Practices
     Erik Fransen, managing consultant
     +31 6 159 444 76
     @erikfransen
  2. Agenda
     • Introduction
     • Data Vault Basics
     • Benefits & Challenges
     • Best practices: Automation & Data Virtualization
     • Recommended reading
  3. • Founded in 1998, The Hague, NL
     • 40+ consultants
     • Business Intelligence, Data Vault, Data Warehousing, Data Warehouse Automation, Big Data, Data Virtualization
     • Business & technical consultancy, end-to-end implementation projects of Data Vault EDW, audits, training, certification
     • Wide range of customers (profit, non-profit) across various industries
     • Since 2009 Genesee Academy partner for Data Vault Day and Data Vault Certification in NL, B & D
     • Implementation partner of Cisco, MapR, Qlik & Tableau
  4. The Data Vault modeling approach
     Data Vault is a data modeling approach, so it fits into the family of modeling approaches:
     3rd Normal Form, Ensemble Modeling, Dimensional
     • 3rd Normal Form is optimal for operational systems, Dimensional is optimal for data marts, and Ensemble Modeling is optimal for the data warehouse
     • Data Vault is the leading form of Ensemble Modeling
  5. Forms of Ensemble Modeling
  6. Why do we use Data Vault for DWH?
     • When we need a DWH that supports:
       – Integration
       – Traceability
       – History
       – Incremental Build
       – Agility
     • Gracefully Adapts to New Sources
     • Full Auditability, Source to Mart
     • Enterprise View of Central Data
     • Ready for Automation
     Data Vault is specifically designed for modelling the EDW
  7. The Data Vault Ensemble
     • The Data Vault Ensemble conforms to a single key, embodied in the Hub construct
     • The Data Vault Ensemble consists of only three parts:
       – Hubs: the natural business keys
       – Links: the natural business relationships
       – Satellites: all context, descriptive data and history of Links and Hubs
     "Separating things that change from things that don't change"
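The three constructs above can be sketched as tables. The following is a minimal illustration using Python's built-in sqlite3 module, not a reference implementation: the table names (h_cust, h_store, l_cust_store, s_cust), the columns load_ts and record_source, and the helper load_customer are assumptions made for this example and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs: one row per natural business key (hypothetical names)
CREATE TABLE h_cust  (cust_key  TEXT PRIMARY KEY, load_ts TEXT, record_source TEXT);
CREATE TABLE h_store (store_key TEXT PRIMARY KEY, load_ts TEXT, record_source TEXT);

-- Link: a relationship between hubs, holding keys only
CREATE TABLE l_cust_store (
    cust_key  TEXT REFERENCES h_cust,
    store_key TEXT REFERENCES h_store,
    load_ts TEXT, record_source TEXT,
    PRIMARY KEY (cust_key, store_key)
);

-- Satellite: descriptive context, one row per detected change
CREATE TABLE s_cust (
    cust_key TEXT REFERENCES h_cust,
    load_ts  TEXT,
    name TEXT, city TEXT,
    PRIMARY KEY (cust_key, load_ts)
);
""")


def load_customer(key, name, city, load_ts, source="CRM"):
    """Register the key in the hub once; add a satellite row only on change."""
    conn.execute("INSERT OR IGNORE INTO h_cust VALUES (?, ?, ?)",
                 (key, load_ts, source))
    latest = conn.execute(
        "SELECT name, city FROM s_cust WHERE cust_key = ? "
        "ORDER BY load_ts DESC LIMIT 1", (key,)).fetchone()
    if latest != (name, city):
        conn.execute("INSERT INTO s_cust VALUES (?, ?, ?, ?)",
                     (key, load_ts, name, city))


load_customer("C001", "Alice", "The Hague", "2016-01-01")
load_customer("C001", "Alice", "The Hague", "2016-01-02")  # nothing changed
load_customer("C001", "Alice", "Amsterdam", "2016-01-03")  # city changed

hub_rows = conn.execute("SELECT COUNT(*) FROM h_cust").fetchone()[0]
sat_rows = conn.execute("SELECT COUNT(*) FROM s_cust").fetchone()[0]
print(hub_rows, sat_rows)
```

The hub keeps a single stable row for the business key, while the satellite accumulates history. This is one way to read "separating things that change from things that don't change".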
  8. The Data Vault modeling approach
     • As the scope of the EDW is expanded and new data sources added, the Data Vault can adapt to these changes without impacting the existing model
     • This is what allows the EDW to be built incrementally and to adapt to change without the need for re-engineering
     [Diagram: hubs H_Cust, H_Sale, H_Empl, H_Store, H_Car; label "New Area absorbed"]
     Tools for DWH Automation update the Data Vault EDW (model + data) in a fast, agile & consistent way
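A small sketch of what "without impacting the existing model" can look like in practice, again assuming sqlite3 and hypothetical table definitions. The hub names follow the diagram (H_Cust, H_Sale, H_Car); the link tables, columns and the helper schema are assumptions for illustration. A new subject area is added purely with new tables, and the definitions of the existing tables are compared before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Existing model: two hubs and one link (columns are illustrative)
conn.executescript("""
CREATE TABLE h_cust (cust_key TEXT PRIMARY KEY, load_ts TEXT, record_source TEXT);
CREATE TABLE h_sale (sale_key TEXT PRIMARY KEY, load_ts TEXT, record_source TEXT);
CREATE TABLE l_cust_sale (
    cust_key TEXT, sale_key TEXT, load_ts TEXT, record_source TEXT,
    PRIMARY KEY (cust_key, sale_key)
);
""")


def schema(connection):
    """Return {table name: CREATE statement} for all tables."""
    rows = connection.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'").fetchall()
    return dict(rows)


before = schema(conn)

# New area absorbed: a new hub and a new link, no ALTER on existing tables
conn.executescript("""
CREATE TABLE h_car (car_key TEXT PRIMARY KEY, load_ts TEXT, record_source TEXT);
CREATE TABLE l_sale_car (
    sale_key TEXT, car_key TEXT, load_ts TEXT, record_source TEXT,
    PRIMARY KEY (sale_key, car_key)
);
""")

after = schema(conn)
existing_unchanged = all(after[name] == sql for name, sql in before.items())
new_tables = sorted(set(after) - set(before))
print(existing_unchanged, new_tables)
```

Because the new relationship lives in its own link table, the existing hubs and links need no structural change, which is what makes incremental builds possible.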
  9. Data Vault Benefits
     • Business benefits
       – Ability to adapt quickly to new business needs
       – Data is traceable, allowing for a fully auditable, integrated data store
       – Allows the EDW to absorb all data all of the time
       – Easily adapts to new data sources and changing business rules without expensive re-engineering
       – Results in a Data Warehouse with lower total cost of ownership (TCO)
       – Automation: short time to market, consistent quality
     • Project/development benefits
       – Ideal for agile development techniques, resulting in lower project risk and more frequent deliverables
       – Can be built incrementally without compromising the core architecture
       – Automation: fast and incremental sprints, predictable costs
     • Architectural benefits
       – Parallel loading
       – Data architecture that supports future expanded scope
       – Can scale to virtually any size
       – Ready for Automation: forces standardization
  10. Data Vault Modeling Process
      The Modeling Process for creating a Data Vault model includes three primary steps:
      1) Identify and Model the Core Business Concepts
         • Business interviews are at the heart of this step: What do you do? What are the main things you work with?
         • Also find the best/target Natural Business Key
      2) Identify and Model the Natural Business Relationships
         • Specific Unique Relationships
      3) Analyze and Design the Context Satellites
         • Consider Rate of Change, Type of Data and also the Sources of your data during the design process
      Ideally the data vault is modelled based on business processes and business concepts
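Step 3 can be made concrete with a small sketch. The slide only says to consider rate of change, type of data and source when designing satellites. The rule used below (one satellite per combination of source and rate of change) is an assumption chosen for illustration, as are the attribute list, the naming pattern and the function design_satellites.

```python
from collections import defaultdict

# Hypothetical descriptive attributes of a Customer hub
attributes = [
    {"name": "name",           "source": "CRM", "rate_of_change": "slow"},
    {"name": "date_of_birth",  "source": "CRM", "rate_of_change": "slow"},
    {"name": "loyalty_points", "source": "CRM", "rate_of_change": "fast"},
    {"name": "credit_limit",   "source": "ERP", "rate_of_change": "slow"},
]


def design_satellites(hub, attrs):
    """Group attributes into satellites by (source, rate of change)."""
    groups = defaultdict(list)
    for attr in attrs:
        groups[(attr["source"], attr["rate_of_change"])].append(attr["name"])
    return {
        f"s_{hub}_{source.lower()}_{rate}": columns
        for (source, rate), columns in groups.items()
    }


satellites = design_satellites("cust", attributes)
for table, columns in satellites.items():
    print(table, columns)
```

Splitting fast-changing attributes from slow-changing ones keeps a frequent change in one attribute from creating new history rows for all the others.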
  11. Getting data out of the Data Vault
      • Problem:
        – The Data Vault EDW is about data decomposition, data registration and data integration
        – Data Vault is not intended, designed or optimized for data distribution and data consumption downstream of the EDW
        – This typically leads to many complex physical data marts (high maintenance, high cost)
      • Solution:
        – Start thinking differently: focus on creating functional data products for the business
        – Stop loading and replicating data physically, start using data virtualization
  12. Eliminate the need for physical data marts
      • No data replication needed
      • Real-time data refreshment
      • No redundant data storage
      • Simple updates of data models
      • Simple queries
      • Short Time to Market
      • Automatic updates
      • Lower storage costs
      • High performance
      • Ready for Big Data
      [Diagram labels: Data Vault EDW; CRM, ERP, Weblogs, …; Production Data; Data Copy; Steering information; SQL; Data Virtualization Tool + Data Abstraction Layers; No Data Copy at all]
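The idea of a virtual data mart can be sketched with a plain SQL view. In this illustration SQLite stands in for a data virtualization tool, which is an assumption: a real tool adds federation, caching and abstraction layers that are not shown here. The table and view names (h_cust, s_cust, dim_customer) and the helper current_city are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE h_cust (cust_key TEXT PRIMARY KEY);
CREATE TABLE s_cust (
    cust_key TEXT, load_ts TEXT, city TEXT,
    PRIMARY KEY (cust_key, load_ts)
);

-- Virtual "dimension": the current satellite row per hub key, no data copy
CREATE VIEW dim_customer AS
SELECT h.cust_key, s.city
FROM h_cust AS h
JOIN s_cust AS s ON s.cust_key = h.cust_key
WHERE s.load_ts = (
    SELECT MAX(s2.load_ts) FROM s_cust AS s2 WHERE s2.cust_key = h.cust_key
);

INSERT INTO h_cust VALUES ('C001');
INSERT INTO s_cust VALUES ('C001', '2016-01-01', 'The Hague');
""")


def current_city(key):
    """Query the view as a report would query a data mart."""
    return conn.execute(
        "SELECT city FROM dim_customer WHERE cust_key = ?", (key,)).fetchone()[0]


first = current_city("C001")

# A new satellite row arrives in the Data Vault; the view needs no reload
conn.execute("INSERT INTO s_cust VALUES ('C001', '2016-02-01', 'Amsterdam')")
second = current_city("C001")
print(first, second)
```

Because the view is evaluated at query time, the "mart" reflects the new satellite row immediately and stores no redundant data, at the cost of doing the join work on every query.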
  13. Data Layers for Data Virtualization
      [Diagram labels]
      • Layers: Virtual Application Layer, Virtual Business Layer, Virtual "Physical" Layer
      • Data models: SuperNova Data Model, Operational Data Model, Uniform Data Model, Data Virtualization "Physical" Model
      • Sources: Data Vault data warehouse, web services, views, any other source data
      • Callout: Automated step!
  14. Wrap up
      • Data Vault Basics:
        – Hubs, Links, Satellites
        – Integration, history, incremental modelling, agility
      • Benefits:
        – Business, project, architecture
        – Make use of automation tools for fast, agile and consistent delivery
      • Challenges:
        – Data downstream of the Data Vault EDW
        – Solution: use virtual data marts and automate SuperNova data models for reporting & analytics
  15. Recommended reading on SuperNova
      Free download: http://www.cisco.com/web/services/enterprise-it-services/data-virtualization/documents/whitepaper-cisco-datavaul.pdf
  16. Recommended reading on Data Vault
      Free downloads: http://hanshultgren.wordpress.com/
  17. Recommended reading on Ensemble & Data Vault
      Modeling the Agile Data Warehouse with Data Vault
      • Data Vault Modeling
      • Agile Data Warehousing BI
      • Enterprise Data Warehousing
      • Data Integration and DWBI Architecture
      • Unified Decomposition™
      • Ensemble Modeling™
      • A complete book on Data Vault
      • An Introduction, a Guide and a Reference
      • Modeling, Architecture & the Data Warehousing Program
      • Data & Semantic Integration for Enterprise Central Meaning
      • Applying Concepts to a successful Agile DWBI Program
  18. Recommended reading on Data Virtualization
      Data Virtualization in Business Intelligence Architectures
      • First independent book on data virtualization that explains in a product-independent way how data virtualization technology works.
      • Illustrates concepts using examples developed with commercially available products.
      • Shows you how to solve common data integration challenges such as data quality, system interference, and overall performance by following practical guidelines on using data virtualization.
      • Apply data virtualization right away with three chapters full of practical implementation guidance.
      • Understand the big picture of data virtualization and its relationship with data governance and information management.
  19. Data Vault Training & Certification
      • CDVDM: March 31 and April 1, 2016, Amsterdam
      • DVD: March 2, 2016, Diegem
      • www.centennium-opleidingen.nl
      • For all questions: opleidingen@centennium.nl
  20. A short history of Data Vault
      • 2002: First papers published by Dan Linstedt
      • 2006: Start of the CDVDM certification program by Genesee Academy
      • 2007: Start of Data Vault EDW implementations
        – Primarily in Europe (NL, S), some in USA
      • 2008-2015: Several books published on Data Vault by Dan Linstedt, Hans Hultgren and others
      • 2013: Data Vault on the radar in B, DACH, UK, USA, AUS, NZ, Asia
      • 2013: Data Vault EDW implementations going worldwide
      • 2015: Over 900 CDVDM professionals and 750+ Data Vault EDWs worldwide
