Generative AI for Social Good at Open Data Science East 2024
Experiences from a Data Vault Pilot Exploiting the Internet of Things
1. Business Analytics For All
BA4All – Insight Session April 29th 2014
Guy Van der Sande – Vincent Greslebin
Unlocking the Value within
the Data Vault
2. Fifthplay : Architecture
Smart Homes Platform Data Warehouse
Data
Vault
Data
mart
Gebruikers
Marketing &
SSC
Utility Portal
ETL
Dag - 1
- Controle data kwaliteit
- Toepassing business
rules
- Aggregatie
- Filtering
Facility Portal
3. Fifthplay : Why Data Vault ?
• Pattern based design which allows agility to take place
• Easy to add new data sources making it future proof. This allows
Fifthplay to stay innovative
• Large volume of data
• Build up history that is not available in the operational system
• Possibility of performing analysis on raw data (cfr quality checks)
• Development speed (Pilot : 37 working days)
6. Data Vault ?
The Data Vault is a detail oriented, historical tracking and
uniquely linked set of normalized tables that support one or
more functional areas of business. It is a hybrid approach
encompassing the best of breed between 3rd normal form
(3NF) and star schema.
The design is flexible, scalable, consistent and adaptable to
the needs of the enterprise.
7. Standard architecture
The centerpiece of the Enterprise Data Warehouse
History is build-up
Granularity as ‘detailed’ as possible
No use of business rules
Use of business keys that are horizontal in nature and
provide visibility across lines of business
A new layer which has the benefits of the RAW Data
Vault, but with the business data embedded
In the Business Data Vault the data has been altered,
cleansed and changed to meet the business rules
Downstream of the raw data vault
Starting point for Master Data Management
Metadata is absolutely vital
8. The Data Vault Model exists of 3 basic entity types
• Hubs : contains a unique list of business keys
• Links : associations across or between business keys
• Satellites : holds descriptive data (about the business key) over time
Component parts of the Data Vault model
9. • Represents a Core Business Concept
• Is formed around the Business Key of this concept
• Is established the first time a new instance of that
business key is introduced
• Must be 1:1 with a single instance
• Consists of the business key, a sequence id, a load
date/time stamp and a record source.
Component parts - Hub
10. • Represents a natural business relationship between business keys
• Is established the first time this new unique association is presented
• Can represent an association between several Hubs and sometimes other
Links.
• maintains a 1:1 relationship with the unique and specific business defined
association between that set of keys.
• Consists of the sequence ids from the Hubs and Links
• Contains sequence id, a load date/time stamp and a
• record source.
Component parts - Link
11. • The Satellite contains the descriptive information
(context) for a business key.
• A Satellite can only describe one key (Hub or a Link).
• The Satellite is the only construct that manages time
slice data (data warehouse historical tracking of
values over time).
Component parts - Satellite
21. Fifthplay : Data Vault – lessons learned
• Don’t stop with data vault; A combination with classic
dimensional Kimball-methodology is advised
• Be creative; get out of your comfort zone, dare to walk
the thine line
• While setting up the data vault, operational issues
where discovered early in the process
• ETL-development goes very quickly because of the
typical pattern design of the data vault;
23. 2013 : Dan Linstedt
releases Data Vault
2.0 specs
History and what’s next ?
Relational modeling
(E.F.Codd)
Bill Inmon began
discussing Data
Warehousing
• Barry Devlin and
Dr Kimball
release
“Business Data
Warehouse”
• Bill Inmon
popularizes Data
Warehousing
• Dr Kimball
popularizes Star
Schema
Dan Linstedt begins
R&D on Data Vault
Modeling
Dan Linstedt
releases first 5
articles on Data
Vault Modeling
2012 : Dan Linstedt
announces Data
Vault 2.0
1960 1970 1980 1990 2000 2010
24. Thank You
BI@USGICT.be
http://www.linkedin.com/company/usgprofessionalsbe
+32 3 231 94 84 www.usgict.be
https://www.facebook.com/usgictbe
“In the Data Warehousing/BI world, we
should store the data as it stands on the
source system and interpret it on the
way out to the data marts. This is
absolutely critical to remember.”
Dan Linstedt
@BICC_at_USG