Data Warehousing, Data Vault as evolutionary step

Presentation given on May 6th, 2011 at the Advanced Data Vault seminar in the Netherlands:

Abstract: Data Warehousing is still being ridiculed by popular literature and opportunistic vendors (and sometimes analysts) in the Business Intelligence domain: they tend to call it 'traditional' as opposed to their 'silver bullet technology'.

However, data warehousing has evolved, and Data Vault, although undervalued by many, is fueling this evolution. The Data Vault methodology enables architects to (finally) embed data warehousing into the Enterprise Architecture, something they have struggled with for the last 15 years. In the Netherlands, Data Vault has made innovation in the data warehousing scene skyrocket. Accelerators in terms of frameworks and software are being built by experts in the field, and several product vendors have picked up on it. Ronald Damhof's presentation will briefly discuss the evolution of data warehousing as it stands today, the position of Data Vault in the Enterprise Architecture, the different species/forks that exist in Data Vault and the automation that comes with it.

Published in: Technology, Business

    1. “It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change” (Charles Darwin)
        Data Vault is an evolutionary step
        Data Vault firmly positioned data warehousing in EA
        Data Vault forks into different species
        And yes, we can speed up evolution
    2.
    3.
    4. 3:1 mileage
        Expensive to buy
        Slow…
        Inflexible
        Carbon based, polluting!
        Need specialists to drive
        Need specialists to repair
    5. It can go anywhere
        It can go anytime
        It can be used by anyone
        Very fast!!
        Of course, extremely energy efficient
        However…
        It won’t fly
    6. Sorry, not more than 100 miles
        Oh, charging takes like 5 hours
        Batteries are somewhat expensive
        Prone to (electric) failure
        No infrastructure yet
    7. Utilize new methods and technologies
        Make it effective with today's legacy
        Utilize today's infrastructure
        As well as adapting to new ones
        Retain the good, get rid of the bad
        Make it efficient, better mileage
        Make it durable
        Make it repeatable
        Make it affordable
        Make it Agile
        Make it fit in the environment
    8. What products are asked for?
        What are the quality characteristics?
        How are these products made?
    9. What
    10. What: Product & Services
    11. What: Compliant; Adaptable; Sustainable; Decoupled; Centralized; Standardized & Industrialized; Effective
    12. How: ‘Calculating risk’; Source; ‘Yield modules’; Source; ‘Customer segmentation’; Semantic gap
    13. How: Information Delivery Process
        4. Generate (BI) products
        3. Enrich and cleanse data
        2. Register (& anchorize) data
        1. Get the raw (uncut) data
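The four steps of the information delivery process on the slide above can be read as a simple staged pipeline. The following is only a minimal sketch under assumptions: the function names, the record layout and the cleansing rule are hypothetical illustrations and do not come from the presentation.

```python
from datetime import datetime, timezone


def get_raw(source_rows):
    """Step 1: take the raw (uncut) data exactly as the source delivers it."""
    return [dict(row) for row in source_rows]


def register(raw_rows, record_source):
    """Step 2: register each row with load metadata, leaving its content untouched."""
    loaded_at = datetime.now(timezone.utc)
    return [
        {"payload": row, "record_source": record_source, "loaded_at": loaded_at}
        for row in raw_rows
    ]


def enrich_and_cleanse(registered_rows):
    """Step 3: apply a (hypothetical) cleansing rule downstream of registration."""
    cleansed = []
    for entry in registered_rows:
        payload = dict(entry["payload"])  # copy, so the registered row stays raw
        payload["customer_name"] = payload["customer_name"].strip().title()
        cleansed.append({**entry, "payload": payload})
    return cleansed


def generate_product(cleansed_rows):
    """Step 4: generate a (BI) product, here a customer count per country."""
    counts = {}
    for entry in cleansed_rows:
        country = entry["payload"]["country"]
        counts[country] = counts.get(country, 0) + 1
    return counts


# Hypothetical source delivery
source = [
    {"customer_name": "  alice jansen ", "country": "NL"},
    {"customer_name": "BOB DE VRIES", "country": "NL"},
    {"customer_name": "carol smith", "country": "UK"},
]

registered = register(get_raw(source), record_source="crm")
report = generate_product(enrich_and_cleanse(registered))
print(report)
```

The point of the ordering is that cleansing happens after registration, so the registered rows still hold the data as it was delivered.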
    14. [Diagram labels] Recipient; End-user (local); Data & function service; Information Delivery Process, steps 1 to 4; Generic BI process (central); Data sources (internal & external)
    15. Key Design Decisions: Adaptable; Sustainable; Compliant; Centralized
    16. Adaptable; Effectiveness; Sustainable; Decoupled; Compliant; Centralized
    17. Key Design Decisions: Compliant; Adaptable
    18. View: Component view
        Company xxx data warehouse & Business Intelligence domain
        [Diagram labels] 1; 2; 3; 4; Sources; External sources; Source store; 1”, 2”; Enterprise Data Warehouse; Business View, Data feeds; BI Apps: Reports; BI Apps: Analysis; BI Apps: Ad-hoc; Function, ‘How’; Data, ‘What’; ‘Where’, ‘Whom’
    19. Source store to BV; Source store to product; Source to product; EDW (DV)
        Adaptable; Sustainable; Compliant; Decoupled; Effective; Standardized; Centralized
    20. View: Component view
        Company xxx data warehouse & Business Intelligence domain
        [Diagram labels] 1; 2; 3; 4; Sources; External sources; Source store; 1”, 2”; Enterprise Data Warehouse; Business View, Data feeds; BI Apps: Reports; BI Apps: Analysis; BI Apps: Ad-hoc; Function, ‘How’; Data, ‘What’; ‘Where’, ‘Whom’
    21. [Diagram labels] Administrative process; Information Delivery Process; Decision & control; Data & information recipients; Generate & Distribute; Enrich; Register (& anchorize); Attain; Process; PDCA; DV based Data Warehouse; Systems (internal & external); Information products; Compliance reporting; Risk Management; Supply/Data; Demand/Function; Performance Management; Data products; Business rules; Supply chain optimization; Staging; Fraud detection; Market basket analysis; Control / Metadata
    22. Data Vault & Governance Strategy
        Offensive Governance; Defensive Governance
        Need for reliability in IT; Need for innovation with IT
        Factory Mode
            • System failure means immediate business loss
            • Performance of the systems has a direct effect on the efficiency of users
            • Most core business activities are online
            • Systems work is mostly maintenance
            • Systems work provides little strategic differentiation or dramatic cost reduction
        Strategic Mode
            • System failure means immediate business loss
            • Performance of the systems has a direct effect on the efficiency of users
            • New systems promise major process and service improvement
            • New systems promise major cost reduction
            • New systems will close significant cost, service or process performance gaps with competitors
        Support Mode
            • Even with repeated service interruptions of up to 12 hours, there are no serious consequences
            • Performance of the system has no direct effect on the efficiency of users
            • Company can quickly revert to manual procedures
            • Systems work is mostly maintenance
        Turnaround Mode
            • New systems introduce major process and service transformations
            • New systems promise major cost reductions
            • New systems will close significant cost, service or process performance gaps with competitors
            • IT constitutes more than 50% of capital spending
            • IT makes up more than 15% of total corporate expenses
    23. Data Vault & Governance Strategy
        Offensive Governance; Defensive Governance
        Focus on: Adaptability and agility
            • Innovation
            • Externally focused
            • Competitive intelligence
            • Flexible (limited) architecture
        Focus on: Control and compliance
            • Compliance with regulations
            • Control of assets
            • Reliability and availability of:
                • Information
                • IT Services
            • Efficiency / Cost Control
            • Security
            • Avoid surprises
            • Manage risks
            • Project Portfolio
            • Stable Architecture
        Data; Function
        Need for reliability in IT; Need for innovation with IT
    24. Data Vault & Self Service: The development model
        [Diagram labels] Central function development; Centrally coordinated infrastructure development; Delegated development; Local function development (×2); Self-service development (×2); Delegated development (×2); Function (opportunistic development); Data (systematic development); Data; Centrally coordinated ICT development
    25. Data Vault Species
    26. 1 - Classic Data Vault
        [Diagram labels] Business Transaction System (×2); Data Vault; Data Marts; Staging Out; Generic Business Rules; Rule Vault
        Structure transformation
        Hub = business keys
        Business rule execution
        Structure and value transformation
        Standardized; Centralized; Adaptable; Effectiveness; Sustainable; Decoupled; Compliant; ?; ?
    27. 2 - Source Data Vault
        [Diagram labels] Business Data Vault; Staging Vault (×2); Business Transaction System (×2); Data Marts
        Structure transformation
        No integration, Hub = surrogate keys
        Persisting staging in DV format
        Business rule execution
        Integration
        DV modelled
        Structure transformation
        Standardized; Centralized; Adaptable; Effectiveness; Sustainable; Decoupled; Compliant; ?; ?; ?
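The difference between the two species ("Hub = business keys" versus "No integration, Hub = surrogate keys") can be illustrated with a small sketch. It rests on assumptions: the `Hub` and `HubRow` classes, the sample deliveries and the reading of "surrogate keys" as the source system's own internal ids are hypothetical illustrations, not taken from the presentation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count


@dataclass(frozen=True)
class HubRow:
    hub_key: int          # sequence number assigned by the warehouse
    natural_key: str      # the key value the hub is built on
    record_source: str    # system that first delivered this key
    load_date: datetime


class Hub:
    """Insert-only hub: a key value is stored once, the first time it is seen."""

    def __init__(self):
        self._rows = {}
        self._sequence = count(1)

    def load(self, natural_key, record_source):
        if natural_key not in self._rows:
            self._rows[natural_key] = HubRow(
                next(self._sequence), natural_key, record_source,
                datetime.now(timezone.utc),
            )
        return self._rows[natural_key]

    def __len__(self):
        return len(self._rows)


# Hypothetical deliveries: two systems know the same customer under a shared
# business key ("C-1001") but under different system-internal ids.
deliveries = [
    {"source": "crm", "internal_id": "17", "customer_no": "C-1001"},
    {"source": "billing", "internal_id": "88", "customer_no": "C-1001"},
]

# Species 1 (classic): the hub is built on the business key, so both
# sources land on the same hub row and are integrated.
classic_hub = Hub()
for d in deliveries:
    classic_hub.load(d["customer_no"], d["source"])

# Species 2 (source): the hub is built on each source's own key, so the
# rows stay separate and integration is deferred to a later layer.
source_hub = Hub()
for d in deliveries:
    source_hub.load(f'{d["source"]}:{d["internal_id"]}', d["source"])

print(len(classic_hub), len(source_hub))
```

In this sketch the classic hub ends up with one row for the customer and the source hub with two, which is the trade-off the slides point at: the classic species needs identifiable business keys up front, while the source species postpones that work.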
    28. [Diagram labels] Source (×4); Staging DV (×2); Business DV; 100% semantic gap (×2)
        Still the source
        Integration, cleansing, consolidation
        Business rule execution upstream??
        DV modelled
    29. [Diagram labels] Source (×6); Staging DV (×2); Business DV; Data Warehouse; 100% semantic gap (×2)
        Still the source
        Integration, cleansing, consolidation
        Business rule execution upstream??
        DV modelled
    30. When to choose what
        Factors that contribute to (1)
            • Information interdependence
            • Sponsorship level
            • Strategic view (decoupled)
            • Information maturity; conceptual business process knowledge present
            • Scale and complexity of data and/or organization
            • Relatively big semantic gap between source and requirement
        Factors that contribute to (2)
            • Scale and complexity of data and/or organization
            • Bad data quality
            • Business keys hard to identify
            • Information maturity; no conceptual business process knowledge present
            • Urgency
            • Resource constraints
            • Small semantic gap between source and requirement
    31. Federated Data Vaults? Use operating models
        [Diagram labels] 1; 1; 1; 1b
    32. 1b – Classic Data Vault
        [Diagram labels, shown twice] Business Transaction System (×2); Data Vault; Data Marts; Staging Out; Generic Business Rules; Rule Vault
        Structure transformation
        Light integration on the business keys
        Specific business rule execution
        Structure and value transformation
        Consolidation
    33. Speeding up Data Vault Evolution?
    34. Data Vault Innovation
    35.
    36. My PoV
            • Generation is an aid, not a goal in itself
            • 80-20 rule
            • Not everything can be generated, be realistic
            • Truly understand the mechanics
            • Do not underestimate the complexity
            • It is a pretty steep learning curve
            • PoC, PoC, PoC
            • Make it yourself
            • Then “generate”
            • Generation software only in combination with modeling software and ETL software
            • From staging to DV: is the ETL tool overhead?
    37. Metamodel driven automation
            • Models (process, rules and data) determine the metadata; the metadata determines the automation artifacts
            • Aim is to be 100% declarative
            • Not everything can be generated; specifically tailored metadata will remain necessary
        Data Vault implementations
        Metadata driven automation
            - Inputs: source model(s), target model, template design, naming conventions
            - Advanced inputs: normalization preferences, ontologies
            Taken from Dan Linstedt’s blog post: http://danlinstedt.com/datavaultcat/code-generation-for-data-vault-not-as-easy-as-you-think/
        Template driven automation
            • In its most basic form: documentation describing a pattern
            • More advanced: generating XML code for 2nd gen. ETL tooling
            • Example: http://www.grundsatzlich-it.nl/bi-tools-templator.html
    38. Automation typology
        Those that support species #2 (building a Source Vault)
            • Template driven or metadata driven
            • Often generate the model and the logistics
        Those that support species #1 (building a Classic Vault)
            • Template driven or metadata driven
            • Generate (metadata of) the logistics
            • Modeling remains a craft: IDENTIFY THE BUSINESS KEYS
        Those that go beyond
            • Metamodel driven
            • Based on the business process, the rules and the data
            • The data model (DV, AM, ..) is a consequence of the process
            • Support for ALM characteristics
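To make the idea of template driven automation more concrete: a template captures the pattern once, and metadata plus naming conventions fill in the specifics. The sketch below is an illustration under assumptions only. The naming conventions, the metadata entries and the column layout of the generated hub table are hypothetical and are not taken from the presentation or from any of the tools it mentions.

```python
from string import Template

# Hypothetical naming conventions; a real project would define its own.
NAMING = {"hub_prefix": "hub_", "key_suffix": "_id"}

# The template describes the hub pattern once.
HUB_TEMPLATE = Template(
    "CREATE TABLE $table (\n"
    "    $surrogate_key INTEGER PRIMARY KEY,\n"
    "    $business_key $key_type NOT NULL UNIQUE,\n"
    "    load_date TIMESTAMP NOT NULL,\n"
    "    record_source VARCHAR(50) NOT NULL\n"
    ");"
)


def generate_hub_ddl(entity, business_key, key_type):
    """Fill the hub template from one metadata entry."""
    return HUB_TEMPLATE.substitute(
        table=NAMING["hub_prefix"] + entity,
        surrogate_key=entity + NAMING["key_suffix"],
        business_key=business_key,
        key_type=key_type,
    )


# Hypothetical metadata. Choosing the business keys is still done by a
# modeler; the generator only applies the pattern.
HUB_METADATA = [
    {"entity": "customer", "business_key": "customer_no", "key_type": "VARCHAR(20)"},
    {"entity": "product", "business_key": "product_code", "key_type": "VARCHAR(30)"},
]

ddl_statements = [generate_hub_ddl(**meta) for meta in HUB_METADATA]
print("\n\n".join(ddl_statements))
```

Adding a hub then means adding one metadata entry rather than writing new code, but the metadata itself, in particular the business key, still has to come from someone who understands the model.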
    39. Next evolutionary step?
            • Both DV species will remain viable
            • Both DV species will live alongside others
            • Technology will push the envelope
            • Automation typologies will be pushed even more towards metadata driven automation
            • WE, the people, need to evolve:
                • We need more talented engineers
                • We need more focused education
                • We need more knowledge sharing
                • We need more academic research
    40. Thank You
        Ronald Damhof is an independent practitioner in the field of data management and decision support. He graduated in Economics in 1995. Since then he has worked as a practitioner in the field of Information Management with a focus on decision support and data management, trying hard to enhance the rigor and relevance in these fields by combining scientific research with the everyday challenges of the practitioner. Ronald is mainly hired by customers in the role of business/IT architect, auditor, coach and trainer. He blogs on B-Eye-Network.com, is a member of the prestigious BBBT, has written several articles on decision support architectures and is a researcher in the field of Information Management. Although Ronald likes to work with theoretically grounded research and proven practices, he is not a 'white paper' architect; 'put your money where your mouth is' is his motto. He likes to see architectures 'live' in enterprises, not just write about them. In most organizations his role often extends beyond architecture: he needs to be a missionary (selling the value of an information architecture), a project manager (getting it done) and a specialist (educating hardware peeps, data architects, data logistics etc.). But that’s the beauty of the trade...
