
Data Warehousing, Data Vault as evolutionary step


Presentation held on May 6th, 2011 at the Advanced Data Vault seminar in the Netherlands:

Abstract: Data warehousing is still ridiculed in popular literature and by opportunistic vendors (and sometimes analysts) in the Business Intelligence domain; they tend to call it 'traditional' as opposed to their 'silver bullet technology'.

However, data warehousing has evolved, and Data Vault, although undervalued by many, is fueling this evolution. The Data Vault methodology enables architects to (finally) embed data warehousing into the Enterprise Architecture, something they have struggled with for the last 15 years. In the Netherlands, Data Vault has made innovation in the data warehousing scene skyrocket. Accelerators in the form of frameworks and software are being built by experts in the field, and several product vendors have picked up on it. Ronald Damhof's presentation briefly discusses the evolution of data warehousing as it stands today, the position of Data Vault in the Enterprise Architecture, the different species/forks that exist in Data Vault, and the automation that comes with it.


  1. 1. “It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change”<br /> Charles Darwin<br />Data Vault is an evolutionary step<br />Data Vault firmly positioned data warehousing in EA<br />Data Vault forks into different species<br />And yes, we can speed up evolution<br />
  2. 2.
  3. 3.
  4. 4. 3:1 mileage<br />Expensive to buy<br />Slow….<br />Inflexible<br />Carbon based, polluting!<br />Need specialists to drive<br />Need specialists to repair<br />
  5. 5. It can go anywhere<br />It can go anytime<br />It can be used by anyone<br />Very fast!!<br />Of course, extremely energy efficient<br />However….<br /> It won’t fly <br />
  6. 6. Sorry, not more than 100 miles<br />Oh, charging takes like 5 hours<br />Batteries are somewhat expensive<br />Prone to (electric) failure<br />No infrastructure yet<br />
  7. 7. Utilize new methods and technologies<br />Make it effective with today's legacy<br />Utilizing today's infrastructure<br />As well as adapting to new ones<br />Retain the good, get rid of the bad<br />Make it efficient, better mileage<br />Make it durable<br />Make it repeatable<br />Make it affordable<br />Make it Agile<br />Make it fit in the environment <br />
  8. 8. What products are asked for?<br />What are the quality characteristics?<br />How are these products made?<br />
  9. 9. What<br />
  10. 10. Product & Services<br />What<br />
  11. 11. What<br />Compliant<br />Adaptable<br />Sustainable<br />Decoupled<br />Centralized<br />Standardized<br />& Industrialized<br />Effective<br />
  12. 12. How<br />‘Calculating risk’<br />Source<br />‘Yield modules’<br />Source<br />‘Customer segmentation’<br />Semantic gap<br />
  13. 13. How<br />4. Generate (BI) products<br />3. Enrich and cleanse data<br />2. Register (& anchorize) data<br />1. Get the raw (uncut) data<br />Information Delivery Process<br />
  14. 14. Recipient<br />End-user (Local)<br />4<br />4<br />4<br />4<br />4<br />Data & function service<br />3<br />3<br />3<br />3<br />3<br />Information Delivery process<br />2<br />2<br />2<br />2<br />2<br />1<br />1<br />1<br />1<br />1<br />Generic BI process (Central)<br />Data sources (internal & external)<br />
  15. 15. Key Design Decisions<br />Adaptable<br />Sustainable<br />Compliant<br />Centralized<br />
  16. 16. Adaptable<br />Effectiveness<br />Sustainable<br />Decoupled<br />Compliant<br />Centralized<br />
  17. 17. Key Design Decisions<br />Compliant<br />Adaptable<br />
  18. 18. View: Component view<br />1<br />2<br />3<br />4<br />Company xxx data warehouse & Business Intelligence Domain<br />4<br />Sources<br />BI apps<br />Reports<br />3<br />2<br />Source store<br />1”, 2”<br />Business View, Data feeds<br />BI Apps<br />Analysis<br />1<br />Enterprise Data Warehouse<br />BI Apps<br />Ad-hoc<br />Function, ‘How’<br />External sources<br />Data, ‘What’<br />‘Where’, ‘Whom’<br />
  19. 19. Sourcestore to BV<br />Sourcestore to product<br />Source to product<br />EDW (DV)<br />Adaptable<br />Sustainable<br />Compliant<br />Decoupled<br />Effective<br />Standardized<br />Centralized<br />
  20. 20. View: Component view<br />1<br />2<br />3<br />4<br />Company xxx data warehouse & Business Intelligence Domain<br />4<br />Sources<br />BI apps<br />Reports<br />3<br />2<br />Source store<br />1”, 2”<br />Business View, Data feeds<br />BI Apps<br />Analysis<br />1<br />Enterprise Data Warehouse<br />BI Apps<br />Ad-hoc<br />Function, ‘How’<br />External sources<br />Data, ‘What’<br />‘Where’, ‘Whom’<br />
  21. 21. Administrative process<br />Information Delivery Process<br />Decision- & control<br />Data & Information recipients<br />Generate & Distribute<br />Enrich<br />Register (& anchorize)<br />Attain<br />Process<br />PDCA<br />DV-based Data Warehouse<br />Systems (internal & external)<br />Information products<br />Compliance reporting<br />Risk Management<br />Supply/Data<br />Demand/Function<br />Performance Management<br />Data products<br />Business rules<br />Supply chain optimization<br />Staging<br />Fraud detection<br />Market basket analysis<br />Control / Metadata<br />
  22. 22. Offensive Governance<br />Defensive Governance<br />Factory Mode<br /><ul><li>System failure means immediate business loss
  23. 23. Performance of the systems has direct effect on efficiency of users
  24. 24. Most core business activities are online
  25. 25. Systems work is mostly maintenance
  26. 26. Systems work provides little strategic differentiation or dramatic cost reduction</li></ul>Strategic Mode<br /><ul><li>System failure means immediate business loss
  27. 27. Performance of the systems has direct effect on efficiency of users
  28. 28. New systems promise major process and service improvement
  29. 29. New systems promise major cost reduction
  30. 30. New systems will close significant cost, service or process performance gaps with competitors</li></ul>Support Mode<br /><ul><li>Even with repeated service interruptions of up to 12 hours, there are no serious consequences
  31. 31. Performance of the system has no direct effect on efficiency of users
  32. 32. Company can quickly revert to manual procedures
  33. 33. Systems work is mostly maintenance</li></ul>Turnaround Mode<br /><ul><li>New systems introduce major process and service transformations
  34. 34. New systems promise major cost reductions
  35. 35. New systems will close significant cost, service or process performance gaps with competitors
  36. 36. IT constitutes more than 50% of capital spending
  37. 37. IT makes up more than 15% of total corporate expenses</li></ul>Data Vault & Governance Strategy<br />Need for reliability in IT<br />Need for innovation with IT<br />
  38. 38. Offensive Governance<br />Defensive Governance<br />Data Vault & Governance Strategy<br />Focus on:<br />Adaptability and Agility<br /><ul><li>Innovation
  39. 39. Externally focused
  40. 40. Competitive intelligence
  41. 41. Flexible (limited) architecture</li></ul>Focus on:<br />Control and compliance.<br /><ul><li> Compliance with regulations
  42. 42. Control of assets
  43. 43. Reliability and availability of:
  44. 44. Information
  45. 45. IT Services
  46. 46. Efficiency / Cost Control
  47. 47. Security
  48. 48. Avoid surprises
  49. 49. Manage risks
  50. 50. Project Portfolio
  51. 51. Stable Architecture</li></ul>Data<br /> Function<br />Need for reliability in IT<br />Need for innovation with IT<br />
  52. 52. Data Vault & Self Service: The development model<br />Central function development<br />Centrally coordinated infrastructure development<br />Delegated development<br />Local function development<br />Local function development<br />Self-service development<br />Delegated development<br />Self-service development<br />Delegated development<br />Function<br />(Opportunistic development)<br />Data<br />(Systematic development)<br /> Data<br />Centrally coordinated ICT development<br />
  53. 53. Data Vault Species<br />
  54. 54. 1 - Classic Data Vault<br />Business Transaction System <br />Data Vault<br />Data Marts<br />Staging Out<br />Business Transaction System <br />Generic Business Rules<br />Rule Vault<br />Structure transformation<br />Hub = business keys<br />Business rule execution<br />Structure and value transformation<br />Standardized<br />Centralized<br />Adaptable<br />Effectiveness<br />Sustainable<br />Decoupled<br />Compliant<br />?<br />?<br />
  55. 55. 2 - Source Data Vault<br />Business Data Vault<br />Staging Vault<br />Business Transaction System <br />Data Marts<br />Business Transaction System <br />Staging Vault<br />Structure transformation<br />No integration, Hub=surrogate keys<br />Persisting staging in DV format<br />Business rule execution<br />Integration<br />DV modelled<br />Structure transformation<br />Standardized<br />Centralized<br />Adaptable<br />Effectiveness<br />Sustainable<br />Decoupled<br />Compliant<br />?<br />?<br />?<br />
  56. 56. Source<br />Source<br /> 100% Semantic gap<br />Business DV<br />Source<br />Staging DV<br />Source<br />Staging DV<br />100% Semantic gap<br />Still the source<br />Integration, cleansing, consolidation<br />Business rule execution upstream ??<br />DV modelled<br />
  57. 57. Source<br />Source<br /> 100% Semantic gap<br />Data Warehouse<br />Business DV<br />Source<br />Source<br />Staging DV<br />Source<br />Source<br />Staging DV<br />100% Semantic gap<br />Still the source<br />Integration, cleansing, consolidation<br />Business rule execution upstream ??<br />DV modelled<br />
  58. 58. When to choose what<br /><ul><li>Factors that contribute to (1)
  59. 59. Information interdependence
  60. 60. Sponsorship level
  61. 61. Strategic View (decoupled)
  62. 62. Information maturity; conceptual business process knowledge present
  63. 63. Scale and complexity of data and/or organization
  64. 64. Relatively big semantic gap between Source and Requirement
  65. 65. Factors that contribute to (2)
  66. 66. Scale and complexity of data and/or organization
  67. 67. Bad data quality
  68. 68. Business keys hard to identify
  69. 69. Information maturity; no conceptual business process knowledge present
  70. 70. Urgency
  71. 71. Resource constraints
  72. 72. Small semantic gap between Source and Requirement</li></li></ul><li>Federated Data Vaults? Use operating models<br />1<br />1<br />1<br />1b<br />
  73. 73. 1b – Classic Data Vault<br />Business Transaction System <br />Data Vault<br />Data Marts<br />Staging Out<br />Business Transaction System <br />Generic Business Rules<br />Rule Vault<br />Business Transaction System <br />Data Vault<br />Data Marts<br />Staging Out<br />Business Transaction System <br />Generic Business Rules<br />Rule Vault<br />Structure transformation<br />Light integration on the business keys<br />Specific business rule execution<br />Structure and value transformation<br />Consolidation<br />
  74. 74. Speeding up Data Vault Evolution?<br />
  75. 75. Data Vault Innovation<br />
  76. 76.
  77. 77. My PoV<br /><ul><li>Generation is an aid, not a goal in itself
  78. 78. 80-20 rule
  79. 79. Not everything can be generated, be realistic
  80. 80. Truly understand the mechanics
  81. 81. Do not underestimate the complexity
  82. 82. It is a pretty steep learning curve
  83. 83. PoC, PoC, PoC
  84. 84. Make it yourself
  85. 85. Then “generate”
  86. 86. Generation software only in combo with modeling software and ETL software
  87. 87. From staging to DV: is an ETL tool overhead?</li></li></ul><li>Metamodel driven automation<br /><ul><li>Models (process, rules and data) determine the metadata, the metadata determines the automation artifacts
  88. 88. Aim is to be 100% declarative
  89. 89. Not everything can be generated; specific, tailored metadata will remain necessary</li></ul>Data Vault implementations<br />Metadata driven automation<br />- Inputs: Source model(s), target model, Template Design, Naming conventions<br />- Advanced inputs: Normalization preferences, Ontologies<br />Taken from Dan Linstedt’s blog post: http://danlinstedt.com/datavaultcat/code-generation-for-data-vault-not-as-easy-as-you-think/<br />Template driven automation<br /><ul><li>In its most basic form: documentation describing a pattern
  90. 90. More advanced: generating XML code for 2nd-generation ETL tooling
  91. 91. E.g. http://www.grundsatzlich-it.nl/bi-tools-templator.html</li></li></ul><li>Automation typology<br />Those that support species #2 (building a Source Vault)<br />Template driven or metadata driven<br />Often generates the model and the logistics<br />Those that support species #1 (building a Classic Vault)<br />Template driven or metadata driven<br />Generate (metadata of) the logistics<br />Modeling remains a craft – IDENTIFY THE BUSINESS KEYS<br />Those that go beyond<br />Metamodel driven<br />Based on the business process, the rules and the data<br />The data model (DV, AM, ..) is a consequence of the process<br />Support for ALM characteristics<br />
  92. 92. Next evolutionary step?<br /><ul><li>Both DV species will remain viable
  93. 93. Both DV species will live next to others
  94. 94. Technology will push the envelope
  95. 95. Automation typologies will be pushed even more towards metadata driven automation
  96. 96. WE – the people – need to evolve:
  97. 97. We need more talented engineers
  98. 98. We need more focused education
  99. 99. We need more knowledge sharing
  100. 100. We need more academic research</li></li></ul><li>Thank You<br />Ronald Damhof is an independent practitioner in the field of data management and decision support. He graduated in Economics in 1995. Since then he has worked as a practitioner in the field of Information Management with a focus on decision support and data management, trying hard to enhance the rigor and relevance in these fields by combining scientific research with the everyday challenges of the practitioner. Ronald is mainly hired by customers in the role of business/IT architect, auditor, coach and trainer. He blogs on B-Eye-Network.com, is a member of the prestigious BBBT, has written several articles on decision support architectures and is a researcher in the field of Information Management. Although Ronald likes to work with theoretically grounded research and proven practices, he is not a 'white paper' architect; 'put your money where your mouth is' is his motto. He likes to see architectures 'live' in enterprises, not just write about them. In most organizations his role often extends beyond architecture: he needs to be a missionary (selling the value of an information architecture), a project manager (getting it done) and a specialist (educating hardware peeps, data architects, data logistics etc.). But that’s the beauty of the trade... <br />
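
The difference between the two species in the transcript ("1 - Classic Data Vault", where the hub holds business keys, and "2 - Source Data Vault", where the hub holds surrogate keys and no integration takes place) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `Hub` class, the two source systems `crm` and `billing`, and the column names `id` and `customer_no` do not come from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """A hub keeps one entry per distinct key value (hypothetical structure)."""
    name: str
    keys: set = field(default_factory=set)

    def load(self, key: str) -> None:
        self.keys.add(key)


# Hypothetical rows from two transaction systems describing customers.
# "id" is the source system's surrogate key, "customer_no" the business key.
crm_rows = [{"id": 17, "customer_no": "C-1001"}, {"id": 18, "customer_no": "C-1002"}]
billing_rows = [{"id": 903, "customer_no": "C-1001"}]
sources = {"crm": crm_rows, "billing": billing_rows}


def load_classic(sources: dict) -> Hub:
    """Species 1: one hub keyed on the business key, so rows from
    different systems that share a business key land on the same entry."""
    hub = Hub("hub_customer")
    for rows in sources.values():
        for row in rows:
            hub.load(row["customer_no"])
    return hub


def load_source_vault(sources: dict) -> dict:
    """Species 2: one hub per source, keyed on that source's surrogate
    key. Nothing is integrated at this stage."""
    hubs = {}
    for system, rows in sources.items():
        hub = Hub(f"hub_{system}_customer")
        for row in rows:
            hub.load(str(row["id"]))
        hubs[system] = hub
    return hubs


classic = load_classic(sources)
staging = load_source_vault(sources)
print(sorted(classic.keys))
print({system: sorted(hub.keys) for system, hub in staging.items()})
```

In this sketch the classic hub ends up with two customers because `C-1001` from both systems collapses onto one entry, while the source vault keeps three unrelated keys. That reflects the trade-off the slides describe: the source vault is quicker to build when business keys are hard to identify, but integration is deferred to a later Business Data Vault step.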

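The transcript lists the inputs of metadata and template driven automation as source model(s), target model, template design and naming conventions. A minimal sketch of that idea, using only Python's standard `string.Template`, might look as follows. The template text, the `NAMING` dictionary, the `target_model` entries and the `load_date` and `record_source` columns are illustrative assumptions, not taken from the presentation or from any specific generation tool.

```python
from string import Template

# Template design (assumed): one pattern for every hub table.
HUB_TEMPLATE = Template(
    "CREATE TABLE ${table} (\n"
    "    ${key_col} VARCHAR(64) NOT NULL PRIMARY KEY,\n"
    "    ${bk_col} VARCHAR(100) NOT NULL,\n"
    "    load_date TIMESTAMP NOT NULL,\n"
    "    record_source VARCHAR(50) NOT NULL\n"
    ");"
)

# Naming conventions (assumed).
NAMING = {"hub_prefix": "hub_", "key_suffix": "_hk"}

# Target model metadata (assumed). Identifying the business keys is
# the part that stays manual; the generator only consumes the result.
target_model = [
    {"entity": "customer", "business_key": "customer_no"},
    {"entity": "product", "business_key": "product_code"},
]


def generate_hub_ddl(entity_meta: dict, naming: dict) -> str:
    """Fill the hub template from one metadata record."""
    entity = entity_meta["entity"]
    return HUB_TEMPLATE.substitute(
        table=naming["hub_prefix"] + entity,
        key_col=entity + naming["key_suffix"],
        bk_col=entity_meta["business_key"],
    )


ddl_statements = [generate_hub_ddl(meta, NAMING) for meta in target_model]
for statement in ddl_statements:
    print(statement)
```

Adding a hub then means adding one metadata record rather than writing another statement by hand, which is the sense in which generation is an aid. The modeling decision behind each record still has to be made by a person.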