The final frontier
Upcoming SlideShare
Loading in...5

The final frontier






Total Views
Views on SlideShare
Embed Views



0 Embeds 0

No embeds



Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
Post Comment
Edit your comment

    The final frontier The final frontier Presentation Transcript

    • Agile Data Warehouse The Final Frontier Terry Bunio
    • Agile Data Warehouse The Final Frontier
    • Terry
    • 1 Data Warehouse Architecture2 Spectre of the Agility3 The Prime Directive4 Captain, we need more visualization!5 The United Federation of Planets
    • Data Warehouse• Definition – “a database used for reporting and data analysis. It is a central repository of data which is created by integrating data from multiple disparate sources. Data warehouses store current as well as historical data and are commonly used for creating trending reports for senior management reporting such as annual and quarterly comparisons.” –
    • Data Warehouse• Can refer to: – Reporting Databases – Operational Data Stores – Data Marts – Enterprise Data Warehouse – Cubes – Excel? – Others
    • Relational• Relational Analysis – Third Normal Form – OLTP – Normalized tables optimized for modification
    • Dimensional• Dimensional Analysis – Star Schema – OLAP – Facts and Dimensions optimized for retrieval • Facts – Business events – Transactions • Dimensions – context for Transactions – Accounts – Products – Date
    • Relational
    • Dimensional
    • Kimball-lytes• Bottom-up - incremental – Operational systems feed the Data Warehouse – Data Warehouse is a corporate dimensional model that Data Marts are sourced from – Data Warehouse is the consolidation of Data Marts – Sometimes the Data Warehouse is generated from Subject area Data Marts
    • Inmon-ians• Top-down – Corporate Information Factory – Operational systems feed the Data Warehouse – Enterprise Data Warehouse is a corporate relational model that Data Marts are sourced from – Enterprise Data Warehouse is the source of Data Marts
    • The gist…• Kimball‟s approach is easier to implement as you are dealing with separate subject areas, but can be a nightmare to integrate• Inmon‟s approach has more upfront effort to avoid these consistency problems, but takes longer to implement.
    • Spectre of the Agility
    • Data Warehouse Project Incremental - KimballWaterfall - Inmon •In Segments•Detailed Analysis •Detailed Analysis•Large Development •Development•Large Deploy •Deploy•Long Feedback loop •Long Feedback loop•Extensive changes •Considerable changes•Many Defects •Rework •Defects
    • Popular Agile Data Warehouse Pattern • Analyze data requirements department by department • Create Reports and Facts and Dimensions for each • Integrate when you do subsequent departments
    • The problem• Conforming Dimensions – A Dimension conforms when it is in equivalent structure and content – Is an account defined by Marketing the same as Finance? • Probably not – If the Dimensions do not conform, this severly hampers the Data Warehouse
    • Where is she?
    • Where is the true Agility?• Iterations not Increments• Brutal Visibility/Visualization• Short Feedback loops• Just enough requirements• Working on enterprise priorities – not just for an individual department
    • Our Mission• “Data... the Final Frontier. These are the continuing voyages of the starship Agile. Her on-going mission: to explore strange new projects, to seek out new value and new clients, to iteratively go where no projects have gone before.”
    • The Prime Directive
    • The Prime Directive• Is a vision or philosophy that binds the actions of Starfleet• Can an Data Warehouse project truly be Agile without a Vision of either the Business Domain or Data Domain? – Essentially it is then just an Ad Hoc Data Warehouse. Separate components that may fit together. – How do we ensure we are working on the right priorities for the entire enterprise?
    • A new model
    • Agile Enterprise Data Model• Confirms the major entities and the relationships between them – 30-50 entities• Confirms the Business and Data Domains• Starts the definition of a Data Model that will be refined over time – Completed in 1 – 4 weeks
    • An Agile Enterprise Data Model• Is just enough to understand the domain so that the iterations can proceed• Is not mapping all the attributes• Is not BDUF• Is a User Story Map for the Data Domain• Contains placeholders for refinement
    • Agile Enterprise Data Model is a Data Map
    • Agile Enterprise Data Model• Is – Our vision – Our User Story Map for the Data Domain – Guides our solution – Our Prime Directive – A Data Model
    • Kimball or Inmon?
    • Spock• Hybrid approach – It is only logical – Needs of the many outweigh the needs of the few – or the one
    • Spock Approach Agile Enterprise Data Model
    • Spock Approach• Agile Enterprise Data Model• Operational Data Store• Dimensional Data Warehouse• Reporting can then be done from: – Operational Data Store – Dimensional Data Warehouse – New Data Marts
    • Benefits of Spock Approach• Agile Enterprise Data Model – Validates knowledge of Data Domain – Ensure later increments don’t uncover data that was previously unknown and hard to integrate • Minimizes rework – True iterations • Confirm at high level and then refine
    • Benefits of Spock Approach• Operational Data Store – Model data relationally to provide enterprise level operational reports – Consolidate and cleanse data before it is visible to end-users – Is used to refine the Agile Enterprise Data Model
    • Benefits of Spock Approach• Dimensional Data Warehouse – Model data dimensionally to validate domain – Able to provide data to analytical reports – Able to provide full historical data and context for reports – Able to provide clients with the ability to generate their own reports easily
    • How do we work iteratively ona Data Warehouse?
    • Increments versus iterations• Increments – Series by series – department by department• Iterations – Story by story – episode by episode• Enterprise prioritization – Work on the highest priority for the enterprise – Not just within each series/department
    • Captain, we need more Visualization!
    • His pattern indicates 2 dimensional thinking
    • Data visualization
    • Data Visualization• Is required to: – Provide a visual data backlog – Provide a visualization of the data requirements across dimensions – Plan iterations – Lead into the creation of FACT tables and Dimensions in a Data Warehouse
    • • For an Agile Data Warehouse we must think and visualize in more dimensions• We must create a User Story Map for Data Requirements that is both intuitive and informing – Primarily for clients! – They should be able to look at it and understand what is currently being worked on
    • We need a bigger metaphor
    • Payments Sales Data Hexes MasterTransactions Invoices Data Bills Operations
    • Bill Payment Hex• Can have up to six dimensions of how payments are sliced, diced, and aggregated• Concentric hexes allow for the planning of iterations
    • Cardassian Union
    • Be careful how you spell that…
    • Data Modeling Union• For too long the Data Modelers have not been integrated with Software Developers• Data Modelers have been like the Cardassian Union, not integrated with the Federation
    • Issues• This has led to: – Holy wars – Each side expecting the other to follow their schedule – Lack of communication and collaboration• Data Modelers need to join the „United Federation of Projects‟
    • Tools of the trade
    • Version Control
    • Version Control• If you don‟t control versions, they will control you• Data Models must become integrated with the source control of the project – In the same repository of project trunk and branches
    • Our Version Experience• We are using Subversion• We are using Oracle Data Modeler as our Modeling tool. – It has very good integration with Subversion – Our DBMS is SQL Server• Unlike other modeling tools, the data model was able to be integrated in Subversion with the rest of the project
    • Shameless plug• Free• Subversion Integration• Supports Logical, Relational, and Dimensional data models• Since it is free, the data models can be shared and refined by all 60+ members of the development team• Currently on version 889
    • Adaptability
    • Change Tolerant Data Model• Only add tables and columns when they are absolutely required• Leverage Data Domains so that attributes are created consistently and can be changed in unison
    • Change Tolerant Data Model• Don‟t model the data according to the application‟s Object Model• Don‟t model the data according to source systems• These items will change more frequently than the actual data structure• Your Data Model and Object Model should be different!
    • Re-Factoring – Read It
    • Create the plan for how youwill re-factor
    • Plan for:• Versioning – Major and minor• Adaptability – How the data model will adapt to major changes• Refinement – How will iterations be planned and executed• Re-Factoring – How will the data design be re-factored
    • Assimilate
    • Assimilate• Assimilate Version Control, Adaptability, Refinement, and Re-Factoring into core project activities – Stand ups – Continuous Integration – Check outs and Check Ins• Make them part of the standard activities – not something on the side
    • Our experience
    • Current Stardate• We are reaching the end of Operational Data Store ETL Development – ODS has been refined as we progress – no major changes – Data Warehouse Dimensional Model is also complete – Initial Reports analysis is complete and report backlog will soon be started on
    • Summary• Use an Agile Enterprise Data Model to provide the vision• Strive for Iterations over Increments – Spock Approach assists in this• Use Data Hexes to provide brutal visibility and Iteration planning• Plan and Integrate processes for Versioning, Adaptability, Refinement, and Re-Factoring
    • What doesn‟t change?
    • Leadership
    • Leadership• “If you want to build a ship, dont drum up people together to collect wood and dont assign them tasks and work, but rather teach them to long for the endless immensity of the sea.” ~ Antoine de Saint-Exupery
    • Leadership• “[A goalies] job is to stop pucks, ... Well, yeah, thats part of it. But you know what else it is? ... Youre trying to deliver a message to your team that things are OK back here. This end of the ice is pretty well cared for. You take it now and go. Go! Feel the freedom you need in order to be that dynamic, creative, offensive player and go out and score. ... That was my job. And it was to try to deliver a feeling.” ~ Ken Dryden
    • Agile Data Warehouse The Final Frontier Terry Bunio @tbunio