Designing a Data Warehouse with SQL Server 2008

The right reason to build a DW/BI system

  1. DESIGNING A DATA WAREHOUSE WITH SQL SERVER 2008
     Joy Mundy, joy@kimballgroup.com

     Introductions and Background
     • Presenter: Joy Mundy, Kimball Group
     • Kimball Group
       • The authors of the Data Warehouse Toolkit series of books, including the Microsoft Data Warehouse Toolkit
       • Kimball University DW / BI courses
       • DW / BI strategic consulting

     © 2005-2009 Kimball Group. All rights reserved.
  2. Agenda
     • The “Right” reason to build a DW/BI system
     • The Kimball Architecture
     • The Kimball Method and Lifecycle
     • Business Requirements
     • Technology Track
     • Data Track
     • BI Applications
     • Operations, Maintenance, and Growth

     Some Possible Reasons
     A. The CIO told us to
     B. It sounds like fun
     C. It’s a great opportunity for us to add significant value to the business
     D. We’re not building a DW/BI system, just an Executive Dashboard
     Which one of these is “Right”?
  3. Answer: C. It is a great opportunity to add business value
     It is also a great opportunity to:
     • Work with senior management
     • Advance your career
     • Play with fun, new technology
     However, there are a few risks…

     DW / BI System Risks
     • High profile
       • Success (and failure) is visible to senior management
     • Business driven – can be hard for technologists
       • Technology focus is rarely successful
       • “Build it and they will come” doesn't work
       • Dashboards are appropriate for mature DW/BI systems, but are not a starting point
     • Data quality and integration are hard problems, even if the technology works well
     • The project is complex and politically challenging
     • Follow a proven approach
  4. Architecture Principles
     • Business requirements determine architecture
       • Listen to business requirements and translate them into functional components
       • This means your DW/BI system architecture will not be the same as your neighbor’s
     • Do not build major DW/BI components because you are supposed to
  5. Architectural Approaches
     • Build reports directly from the transaction systems
     • Standalone marts
     • Normalized data warehouse feeding downstream marts
     • Kimball dimensional data warehouse

     Standalone Marts
     Pros
     • Marts reflect business requirements
     • Get business value this year
     Cons
     • Multiple extracts of the same data
     • Multiple transforms
     • Inconsistent versions of the same data
     • 10th mart takes as long to build as first
     [Diagram: the sources (HR, Projects, Siebel, Skills Dtb, CustSat files, Sales, SAP, FeedWrx, Business lists, many others) feed separate standalone marts. Mart labels in the diagram: Sales Mart, KPI View, Sales+, CSAT+, RoB, CSAT, CSAT old, EMR, Capacity Planning, PCD, DIM, ESRT, CFR, Others.]
  6. Normalized DW and Downstream Marts
     Pros
     • Data extracted and consolidated only once
     • Marts reflect business requirements
     Cons
     • Takes too long to build a new mart
     • Too many business rules between EDW & marts; we still get inconsistencies
     • EDW is by (and for) IT, using its language and structures
     • Marts are for the business
     [Diagram: the sources (HR, Projects, Siebel, Skills Dtb, CustSat files, Sales, SAP, FeedWrx, Business lists, many others) feed an Enterprise Data Warehouse (not dimensional; integrated; historical; design reflects source systems), which feeds the downstream marts: Sales+, KPI View, CSAT+, RoB, Capacity Planning, DIM, ESRT, Others.]

     Kimball Dimensional Data Warehouse
     Pros
     • Data extracted and consolidated only once
     • DW design meets business requirements
     • Data is structured to support easy analytic use with good perf
     • Data and terms are consistent
     • Once data is in the DW, building new KPIs or BI applications is much easier
     Cons
     • Takes longer to get biz value than simply throwing together a mart
     Kimball-style Dimensional Enterprise Data Warehouse
     • Integrated & historical
     • Design reflects analytic requirements
     • Built incrementally
     • Contains the most detailed data possible
     • Fact data hooks together via shared (conformed) dimensions
     • Presentation area is relational or OLAP
       • OLAP is recommended for Msft platform
       • (Still need relational DW)
     User applications
     • Most “marts” become views into the enterprise system
     • Ad hoc use is supported and encouraged
     [Diagram: the same sources feed the dimensional enterprise data warehouse, which serves user applications; Mart A and Mart B sit alongside it.]
     We may supplement the main DW/BI system with a handful of custom BI apps that meet specific needs. These are the exception.
  7. Summary of Architectures
     • Report directly from trxn systems
       • Trxn system burden: Very high
       • Ease of use: Very poor
       • Time to market: Poor
     • Departmental marts
       • Trxn system burden: Moderate
       • Ease of use: Good until you need something new. Navigation challenges
       • Time to market: “90 days”, no economies of scale
     • Normalized DW + marts
       • Trxn system burden: Low
       • Ease of use: DW = poor. Marts = good until you need something new. Navigation challenges
       • Time to market: Huge up-front investment. Marts are “60 days”
     • Kimball dimensional DW
       • Trxn system burden: Low
       • Ease of use: Very good
       • Time to market: Large up-front investment. Excellent economies of scale.

     The Microsoft DW/BI Technical Architecture
     [Diagram labels: Source Systems; Business/Extract Rules; Data Quality; Dimensionalization; Metadata; RDBMS; OLAP; Business Users (SharePoint, Report Builder, Performance Point).]
  8. Kimball Method Basic Principles
     • Business driven
     • Iterative Lifecycle
     • Dimensional model for data delivery
     • Enterprise data framework
       • Bus Matrix
       • Conformed dimensions
     • Full solution from extracts to business value
  9. The Kimball DW/BI Lifecycle
     [Diagram boxes: Project Planning; Business Requirements Definition; Technical Architecture Design; Product Selection & Installation; Dimensional Modeling; Physical Design; ETL Design & Development; BI Application Specification; BI Application Development; Deployment; Maintenance; Growth; Project Management.]
     Key Concepts:
     • Business centric
     • Dimensional delivery
     • Full solution
     • Iterative process
     • Enterprise aware
     • Incremental growth
  10. Business Requirements (1)
     • Interview key people across the org
       • Ask “What do you do?” not “What do you want?”
       • It is our job to design the solution, not theirs
     • Look for common analytic themes
       • Better promotion response rate
       • Improve sales performance
     • Break themes down into business processes that generate needed data
       • Promotions
       • Responses
       • Orders

     Business Requirements (2)
     • Design the data warehouse Enterprise Bus Matrix
     • Prioritize themes with senior management
     • Summarize findings in a Requirements Document
     • Identify and recruit good business sponsor(s)
       • Visionary
       • Influential
       • Reasonable
  11. Profile the Data
     • Early and often
     • Does the data exist to support the required analysis?
     • Where are the problems affecting ETL design?
       • Primary keys
       • Referential integrity
       • NULL values
       • Junk values
       • The dreaded “Notes” field
     • SSIS 2008 has useful data profiling functionality

     Requirements Prioritization Based on Value and Feasibility
     [Chart: business processes plotted by Business Value / Impact (vertical axis, Low to High) against Feasibility (horizontal axis, Low to High). Boxes: Customer Profitability, Orders, Promotions, Product Profitability, Orders Forecast, Shipping, Call Tracking, Returns, Manufacturing Costs, Exchange Rates. The position of each box was lost in extraction.]
     Key Concepts:
     • Created in a meeting with Senior Mgmt.
     • Relative value is a business decision
     • Boxes come from Bus. Requirements
     • Relative feasibility needs IT input
  12. Enterprise Bus Matrix
     Key Concepts:
     • Bus Matrix: the high level DW/BI data architecture
     • Rows = Business Processes
     • Columns = Conformed Dimensions
     • DW/BI system implemented row by row based on business priority

     Adventure Works Data Warehouse
     [Matrix: the column position of each x was lost in extraction, so each row below gives only the business priority and the number of conformed dimensions marked.]
     Conformed dimensions (columns, in extracted order): Internet Registered User, Date (Order, Start, Ship), End Customer, Promotion, Employee, Problem, Reseller, Product, Shipper, Vendor, Page, Part
     Business processes (rows):
     • Orders Forecasting: priority 2, 5 dimensions
     • Reseller Orders: priority 1, 5 dimensions
     • Internet Orders: priority 1, 6 dimensions
     • Purchasing: no priority shown, 7 dimensions
     • Parts Inventory: no priority shown, 5 dimensions
     • Manufacturing: priority 6, 3 dimensions
     • Finished Goods Inv.: no priority shown, 3 dimensions
     • Shipping: priority 3, 7 dimensions
     • Returns: priority 5, 6 dimensions
     • Customer Calls: priority 4, 8 dimensions
     • Web Support: priority 4, 8 dimensions
  13. Microsoft Technology for the DW Back Room
     • Integration Services is a competitive ETL tool
       • Great performance, solid toolbox
     • Relational Database is strong BI platform
       • Key BI-related features, including partitioning, compression, and star join optimization
     • Analysis Services is OLAP market leader
       • Dimensional design is flexible
       • More scalable and manageable
     • Data Mining – strong mining platform, leverages AS for speed; good integration

     Relational vs. OLAP (Why OLAP?)
     • Relational strengths
       • Data management
       • Flexibility
     • OLAP strengths
       • Analytic language
       • Ad hoc query performance
       • Metadata layer
       • Security, especially for ad hoc queries
  14. Microsoft Technology for the DW/BI Front Room
     • Reporting Services
       • Good enterprise platform
       • Programmer-oriented report designer
       • Limited ad hoc query
     • Data presentation
       • Office, SharePoint, [ProClarity]
     • Integrated development (VS) and management environments
     • Scale – technology can scale to multi-TBs
       • Plan to spend more time and $, including on significant consulting expertise.
     • Real-time features
  15. The Dimensional Model (the Target)
     • Based on top business priority data area
     • Fact table = measurement of business events
     • Dimension tables = objects that participate in business events (Customer, Product, Date, …)
     • Surrogate keys (meaningless integer)
     • Slowly changing dimensions
       • Type 1 = Overwrite old values with new
       • Type 2 = Add a new row when values change
     • Identify data quality issues now

     Relational Dimensional Model
     [Diagram: a central Sales Fact table (Product Key, Customer Key, Date Key, … other keys, Sales Amount, Sales Quantity, … other measures) joined to the Date, Product, and Customer dimension tables and to other dims…]
  16. Surrogate Keys
     • Dimension PKs should be surrogate (meaningless) keys
       • Managed by the DW
       • Usually an integer type
       • Usually populated via IDENTITY keyword in dimension table definition
     • Why?
       • Small (int) keys are vital for performance
       • The source system will re-use keys. They swear they won’t. But they will.
       • Enables dimension attribute change tracking

     Surrogate Keys and ETL
     • Dimensions
       • Carry source system key(s) as non-key attributes in the dimension
       • New rows automatically get a new surrogate key
     • Facts
       • Fact table usually does not contain source system keys
       • Final step of fact processing is to exchange the source system keys for DW surrogate keys
       • Lookup to dimension tables based on source key, returning surrogate key
  17. Conformed Dimensions
     • One master dimension table that all fact tables subscribe to
     • Get agreement organization-wide on:
       • What the dimensions are called
       • Which hierarchies you have
         • Similar-but-different attributes and hierarchies have different names
       • Which attributes are managed by restating history and which by tracking history
         • Create two sets of attributes if you need it both ways
     • Why?
       • Single version of the truth
       • Flexibility of basic design
  18. The Need for BI Applications
     • Approximately 10% of your user population will learn how to build ad hoc queries
       • They must learn the tool AND the data
     • This means you must build applications to provide access to the other 90%
       • Structured
       • Flexible (parameters, pick lists, formats)
       • Well organized

     BI Application Steps
     1. BI Application design and specs
       • Right after business requirements
       • Template, mock-ups, specs, navigation framework
     2. BI Application development
       • Can’t start until data and tools are available
       • Pull out your specs and get to work
       • Best to do this as part of Beta testing
  19. Standard Reports
     • We recommend going live with a modest number of reports (8-12)
     • Enlist business users in creating and QA-ing reports
       • Users don’t know what they want until you show them something
     • Lots of reports are “theme and variations” – parameterize them!
     • Build a BI portal to host the reports
       • Brand it with the DW/BI logo
       • Add useful info about operations, contents, and help

     Advanced BI Applications
     • Planning and forecasting applications
       • You need a decent history of fairly accurate data before you can plan / forecast
       • Planning and forecasting activities are highly analytic, with a little bit of writeback
       • Heavy emphasis on “what-if”
     • Data mining
       • Collection of statistical techniques to identify trends and correlations
       • Requires detailed (atomic) data
       • Can be the most valuable thing you do with your DW/BI system
     • Advanced BI apps are not Phase 1 projects
  20. Deployment, Maintenance, and Growth
     • Deployment has two major components
       • Software and data availability (dev, test, prod)
       • User preparedness (training, documentation, and support)
     • Maintenance
       • Monitor usage and performance
       • Anticipate problems
     • Growth
       • Iterate back through the Lifecycle with the next priority business process
  21. Session Summary
     • The DW/BI system can be high value, but it is definitely high risk
     • Reduce risk by using an approach based on
       • business requirements
       • a flexible data architecture
       • delivering the full solution
     • Microsoft SQL Server 2008 provides the full technology stack for DW/BI systems
     • SQL Server 2008 is well suited for the Kimball Method

     Next Steps
     • Learn about your business
       • Strategies, challenges, opportunities, terms
       • Industry, competition, trends
     • Learn the Kimball Method
       • Learn about adding business value
       • Learn the Lifecycle approach
     • Learn the Microsoft SQL Server 2008 DW/BI toolset
     • Get started!
       • Do a high level requirements definition and prioritization
  22. For More Information…
     • Kimball University
       • Next 4-day Microsoft class on 3/31 (Chicago). Stockholm in May.
       • Other classes in modeling, lifecycle, and ETL throughout the year
     • Websites
       • www.kimballgroup.com: articles and design tips
       • forum.kimballgroup.com: the Kimball Forum
     • Kimball Books
       • The Microsoft Data Warehouse Toolkit, Joy Mundy and Warren Thornthwaite with Ralph Kimball, Wiley, 2006 (the Microsoft book)
       • The Data Warehouse Toolkit 2nd Edition, Ralph Kimball and Margy Ross, Wiley, 2002 (the modeling book)
       • The Data Warehouse Lifecycle Toolkit 2nd Edition, Kimball, Ross and Thornthwaite, Wiley, 2008 (how to build a DW)
       • The Data Warehouse ETL Toolkit, Kimball and Caserta, Wiley, 2004 (ETL theory and practice)
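The two slowly changing dimension types named on slide 15 (Type 1 overwrites, Type 2 adds a row) can be illustrated with a short sketch. This is not the presenter's implementation: it assumes Python's built-in sqlite3 module as a stand-in for SQL Server, with AUTOINCREMENT playing the role of IDENTITY. The table, the `is_current` flag, and the function names are all hypothetical and chosen only for illustration.

```python
import sqlite3


def create_customer_dim(conn):
    # customer_key is the surrogate key; customer_id is the source system key.
    # is_current is an assumed helper flag marking the latest row per customer.
    conn.execute(
        """
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id  TEXT NOT NULL,
            email        TEXT,
            city         TEXT,
            is_current   INTEGER NOT NULL DEFAULT 1
        )
        """
    )


def apply_type1(conn, customer_id, email):
    # Type 1: overwrite the old value on every row for this customer.
    conn.execute(
        "UPDATE dim_customer SET email = ? WHERE customer_id = ?",
        (email, customer_id),
    )


def apply_type2(conn, customer_id, city):
    # Type 2: add a new row (with a new surrogate key) when the value changes.
    current = conn.execute(
        "SELECT customer_key, email, city FROM dim_customer "
        "WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()
    if current is not None and current[2] == city:
        return current[0]  # unchanged: keep the existing row
    email = None
    if current is not None:
        email = current[1]  # carry the Type 1 attribute forward
        conn.execute(
            "UPDATE dim_customer SET is_current = 0 WHERE customer_key = ?",
            (current[0],),
        )
    cursor = conn.execute(
        "INSERT INTO dim_customer (customer_id, email, city) VALUES (?, ?, ?)",
        (customer_id, email, city),
    )
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
create_customer_dim(conn)
first_key = apply_type2(conn, "C-100", "Chicago")     # initial load
apply_type1(conn, "C-100", "new@example.com")         # overwrite in place
second_key = apply_type2(conn, "C-100", "Stockholm")  # history-tracking change
for row in conn.execute(
    "SELECT customer_key, city, email, is_current FROM dim_customer "
    "ORDER BY customer_key"
):
    print(row)
```

After the run, the customer has two rows with different surrogate keys: the Chicago row is kept as history and the Stockholm row is current, while the overwritten email appears on both.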

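The surrogate key handling described on slide 16 (dimensions carry the source system key as an attribute, and the last step of fact processing swaps source keys for surrogate keys) might look roughly like the sketch below. It again assumes sqlite3 as a stand-in for SQL Server. All table, column, and function names are hypothetical, and setting aside unmatched rows in a `rejected` list is an assumption of this example, not something the slides prescribe.

```python
import sqlite3


def create_schema(conn):
    # product_key is the DW-managed surrogate key; product_id is the source key.
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY AUTOINCREMENT,
            product_id   TEXT NOT NULL UNIQUE,
            product_name TEXT
        );
        CREATE TABLE fact_sales (
            product_key    INTEGER NOT NULL REFERENCES dim_product (product_key),
            sales_amount   REAL,
            sales_quantity INTEGER
        );
        """
    )


def load_products(conn, products):
    # The source system key is stored as an ordinary attribute; the database
    # assigns a new surrogate key to each new row automatically.
    conn.executemany(
        "INSERT INTO dim_product (product_id, product_name) VALUES (?, ?)",
        products,
    )


def load_sales_facts(conn, staged_rows):
    # Final step of fact processing: look up each source key in the dimension
    # and store only the surrogate key in the fact table.
    lookup = dict(
        conn.execute("SELECT product_id, product_key FROM dim_product").fetchall()
    )
    rejected = []
    for product_id, amount, quantity in staged_rows:
        product_key = lookup.get(product_id)
        if product_key is None:
            # Assumed policy: set aside rows whose source key has no match.
            rejected.append((product_id, amount, quantity))
            continue
        conn.execute(
            "INSERT INTO fact_sales (product_key, sales_amount, sales_quantity) "
            "VALUES (?, ?, ?)",
            (product_key, amount, quantity),
        )
    return rejected


conn = sqlite3.connect(":memory:")
create_schema(conn)
load_products(conn, [("SKU-A", "Road bike"), ("SKU-B", "Helmet")])
rejected = load_sales_facts(
    conn,
    [("SKU-B", 49.5, 1), ("SKU-A", 1200.0, 2), ("SKU-Z", 10.0, 1)],
)
print(rejected)
print(conn.execute("SELECT * FROM fact_sales ORDER BY rowid").fetchall())
```

The fact table ends up holding only integer surrogate keys, so a source system that later re-uses "SKU-A" for a different product cannot silently change what the existing fact rows point to.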