The Data Warehouse Lifecycle
Bart Lowe
Decision Source Inc.

Agenda
- Discuss high-level concepts related to data warehousing

The Data Warehouse Must…
- Make information easily accessible
- Present information consistently
- Be adaptive & resilient to change
- Be secure
- Serve as the foundation for decision making

The Business Community Must…
- Accept and trust the data warehouse if it is to be successful

Data Warehouse Lifecycle

Data Warehouse Components

Source Systems
- Data schemas optimized for transactions, not queries
- Difficult to share data
- Typically do not maintain historical data

- This is typically the most difficult and labor-intensive component
- Data is cleansed & conformed
- Typically a normalized data schema
- No direct querying of this component is allowed

- Consists of a series of conformed dimensional data marts
- Each data mart represents a different business process
- Dimensional modeling emphasizes simplicity & query performance

- In general, only a small subset of users will need true ad hoc query capability
- 80-90% of users will use a parameterized analytic system

Fact Table
- This is the primary table in a dimensional model
- The measurements of the dimensional model are stored here
- Each measurement is tracked at the intersection of several dimensions
- This is the “grain” of the model
- Most useful facts are additive

Dimension Table
- Descriptors of each fact
- Tend to have many attributes but fewer rows
- Tend to be used as query constraints
- The better the attribute descriptions, the better the warehouse
- Typically highly denormalized

Star Schema
- This is a fact table joined to a set of dimensions
- Relates data in a manner that is familiar to business users
- Symmetrical nature allows for answering many different business questions
- One dimensional model will exist for each business process
- A single data warehouse can have dozens of these models
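
The star schema bullets above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration only: the retail sales example and every table and column name (fact_sales, dim_date, dim_product and so on) are assumptions, not part of the original deck.

```python
import sqlite3

# Hypothetical retail-sales star schema; all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, calendar_date TEXT, month_name TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    quantity     INTEGER,   -- additive fact
    sales_amount REAL       -- additive fact
);
INSERT INTO dim_date VALUES (1, '2024-01-15', 'January'), (2, '2024-02-03', 'February');
INSERT INTO dim_product VALUES (1, 'Espresso Beans', 'Coffee'), (2, 'Green Tea', 'Tea');
INSERT INTO fact_sales VALUES (1, 1, 2, 30.0), (1, 2, 1, 8.0), (2, 1, 3, 45.0);
""")

# Whitelist of dimension attributes that may be used as grouping columns.
GROUPABLE = {"category": "p.category", "product": "p.product_name", "month": "d.month_name"}

def sales_by(attribute):
    """Sum the additive sales_amount fact, grouped by one dimension attribute."""
    column = GROUPABLE[attribute]
    query = f"""
        SELECT {column}, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_date d    ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY {column}
        ORDER BY {column}
    """
    return conn.execute(query).fetchall()

print(sales_by("category"))
print(sales_by("month"))
```

The same join pattern serves every question; only the grouping attribute changes, which is one way to read the "symmetrical nature" point.
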

Dimensional Modeling Key Concepts

Store the most atomic data
- By storing the most detailed data possible, you can ensure that users can drill to the level they need.
- It's OK to provide aggregate facts as well to improve performance.

Conformed Dimensions
- By conforming your dimensions you can correlate performance across business processes.
- Can be very painful (but worth it) if combining data from disparate systems.
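
One way to picture correlating performance across business processes is the sketch below. It assumes two hypothetical fact tables (sales and returns) that share a single conformed product dimension; all names and numbers are invented for illustration.

```python
from collections import defaultdict

# Conformed product dimension shared by both business processes (hypothetical data).
dim_product = {
    1: {"name": "Espresso Beans", "category": "Coffee"},
    2: {"name": "Green Tea", "category": "Tea"},
    3: {"name": "Decaf Beans", "category": "Coffee"},
}

# Two fact tables from different business processes, both keyed by product_key.
fact_sales = [(1, 100.0), (2, 40.0), (3, 60.0)]   # (product_key, sales_amount)
fact_returns = [(1, 10.0), (3, 5.0)]              # (product_key, return_amount)

def total_by_category(fact_rows):
    """Aggregate one fact table by the conformed 'category' attribute."""
    totals = defaultdict(float)
    for product_key, amount in fact_rows:
        totals[dim_product[product_key]["category"]] += amount
    return dict(totals)

def drill_across():
    """Line up both processes on the shared attribute: category -> (sales, returns)."""
    sales = total_by_category(fact_sales)
    returns = total_by_category(fact_returns)
    categories = sorted(set(sales) | set(returns))
    return {c: (sales.get(c, 0.0), returns.get(c, 0.0)) for c in categories}

print(drill_across())
```

Because both fact tables use the same keys and the same category values, the two result sets can be merged without any translation step.
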

Use Surrogate Keys
- Always use an artificial key as the primary key
- Surrogate keys allow you to:
  - Protect your model from changes in the source system
  - Integrate data from multiple sources
  - Add rows that do not exist in the source system
  - Track changes to dimensions over time
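
A minimal sketch of surrogate key assignment follows. The SurrogateKeyMap class, the reserved key 0 for an "Unknown" row, and the source system names are assumptions made for illustration.

```python
import itertools

class SurrogateKeyMap:
    """Hands out warehouse-owned integer keys per (source system, natural key)."""

    def __init__(self):
        self._next_key = itertools.count(1)   # 0 is reserved for the unknown row
        self._keys = {}

    def key_for(self, source_system, natural_key):
        lookup = (source_system, natural_key)
        if lookup not in self._keys:
            self._keys[lookup] = next(self._next_key)
        return self._keys[lookup]

UNKNOWN_KEY = 0
# A row that exists only in the warehouse, not in any source system.
customer_dim = {UNKNOWN_KEY: {"name": "Unknown"}}

keys = SurrogateKeyMap()
source_rows = [
    ("crm", "C-17", "Acme"),       # hypothetical CRM customer
    ("billing", "17", "Globex"),   # a different system with its own key format
    ("crm", "C-17", "Acme"),       # repeat load of the same customer
]
for system, natural_key, name in source_rows:
    customer_dim[keys.key_for(system, natural_key)] = {"name": name}

print(customer_dim)
```

The repeat load maps back to the key it received the first time, so fact rows never need to know how either source system formats its identifiers.
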

Slowly Changing Dimensions
- A key design consideration is what to do when dimension values change.
- A change may or may not have business meaning.
- Three ways to handle changes

Slowly Changing Dimension Types
- Type I
  - Simply overwrite the old values.
  - Simplest case, used when you don’t care about changes to data.
- Type II
  - Create a new dimension row for new values
  - Existing facts still relate to old dimension value
  - Used when you do care about the historical changes.
- Type III
  - Add a new column to table to store the new value
  - Rarely used.
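
The three types can be sketched as small functions over a list of dimension rows. The valid_from, valid_to and is_current columns are assumed housekeeping columns, not something the slides specify, and the Type III sketch uses a common variant in which the added column holds the prior value while the main column takes the new one.

```python
from datetime import date

def apply_type1(rows, natural_key, attr, new_value):
    """Type I: overwrite in place; the old value is lost."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row[attr] = new_value

def apply_type2(rows, natural_key, attr, new_value, effective):
    """Type II: expire the current row and append a new row with a new surrogate key."""
    current = next(r for r in rows if r["natural_key"] == natural_key and r["is_current"])
    current["is_current"] = False
    current["valid_to"] = effective
    new_row = dict(current, key=max(r["key"] for r in rows) + 1,
                   valid_from=effective, valid_to=None, is_current=True)
    new_row[attr] = new_value
    rows.append(new_row)
    return new_row["key"]

def apply_type3(rows, natural_key, attr, new_value):
    """Type III: keep the prior value in an added 'previous_<attr>' column."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row["previous_" + attr] = row[attr]
            row[attr] = new_value

# Hypothetical customer who moves from Austin to Dallas.
customers = [{"key": 1, "natural_key": "C-17", "city": "Austin",
              "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True}]
new_key = apply_type2(customers, "C-17", "city", "Dallas", date(2024, 6, 1))
print(new_key, [(r["key"], r["city"], r["is_current"]) for r in customers])
```

Facts loaded before the move keep pointing at key 1 (Austin), while new facts pick up the new key, which is how Type II preserves history.
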

Date Dimensions
- Dates are a fundamental business concept, and nearly every DW has a date dimension
- The date dimension is the classic role-playing dimension.
- Allows rollups/filters on any date-related attribute such as month/quarter/year
- Date dimension records still use a surrogate key to handle unknown dates.
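
A date dimension is usually generated rather than loaded from a source system. The sketch below assumes a reserved surrogate key of 0 for unknown dates and a small set of attributes (year, quarter, month); the order fact at the end is a hypothetical example of the dimension playing two roles.

```python
from datetime import date, timedelta

UNKNOWN_DATE_KEY = 0   # assumed convention for "date not known"

def build_date_dimension(start, end):
    """Return one row per calendar day from start to end, plus an unknown row."""
    rows = [{"date_key": UNKNOWN_DATE_KEY, "date": None,
             "year": None, "quarter": None, "month": None}]
    day, key = start, 1
    while day <= end:
        rows.append({"date_key": key, "date": day, "year": day.year,
                     "quarter": (day.month - 1) // 3 + 1, "month": day.month})
        day += timedelta(days=1)
        key += 1
    return rows

date_dim = build_date_dimension(date(2024, 1, 1), date(2024, 12, 31))
by_key = {row["date_key"]: row for row in date_dim}

# Role playing: one fact row references the same dimension twice.
order_fact = {"order_date_key": 15, "ship_date_key": UNKNOWN_DATE_KEY}  # not shipped yet
order_month = by_key[order_fact["order_date_key"]]["month"]
ship_month = by_key[order_fact["ship_date_key"]]["month"]
print(order_month, ship_month)
```

Pointing the unshipped order at the unknown row keeps the foreign key populated, so joins to the date dimension never drop the fact.
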

Snowflaking
- Snowflaking is the process of hooking up lookup tables to a dimension.
- This is, in a way, re-normalizing the data.
- Snowflaking is in general discouraged since it adds complexity to the model.

Many to Many Relationships
- Most relationships are one-to-many. This is the simplest case.
- Real-world scenarios are often more complex.
- Many-to-many relationships between facts & dimensions are represented by creating a bridge table between the facts and the dimension
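
A bridge table can be sketched as follows. The scenario (a sale credited to a group of sales reps) and the weighting factor column are assumptions for illustration; weights within a group sum to 1 so that allocated amounts still add up to the fact total.

```python
# Fact rows reference a group key; the bridge expands each group to its members.
fact_sales = [
    {"sale_id": 1, "rep_group_key": 10, "amount": 100.0},
    {"sale_id": 2, "rep_group_key": 11, "amount": 50.0},
]
# Bridge rows: (group_key, rep_key, weight). Hypothetical data.
bridge_rep_group = [(10, 1, 0.5), (10, 2, 0.5), (11, 2, 1.0)]
dim_sales_rep = {1: "Alice", 2: "Bob"}

def amount_by_rep():
    """Allocate each sale to its reps through the bridge, using the weights."""
    totals = {}
    for fact in fact_sales:
        for group_key, rep_key, weight in bridge_rep_group:
            if group_key == fact["rep_group_key"]:
                name = dim_sales_rep[rep_key]
                totals[name] = totals.get(name, 0.0) + fact["amount"] * weight
    return totals

print(amount_by_rep())
```

Without the weights, sale 1 would be counted once for each rep and the per-rep totals would overstate the fact table total.
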

Hierarchies
- Hierarchies summarize or group the data within the dimension.
- Typically are denormalized into the dimension table
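
A denormalized hierarchy simply repeats every level on each dimension row. In the sketch below the product, category and department levels and all data values are hypothetical.

```python
from collections import defaultdict

# Each row carries every level of the hierarchy, so no lookup tables are needed.
dim_product = {
    1: {"product": "Espresso Beans", "category": "Coffee", "department": "Beverages"},
    2: {"product": "Green Tea", "category": "Tea", "department": "Beverages"},
    3: {"product": "Croissant", "category": "Pastry", "department": "Bakery"},
}
fact_sales = [(1, 30.0), (2, 8.0), (3, 12.0), (1, 45.0)]   # (product_key, sales_amount)

def rollup(level):
    """Summarize sales at any hierarchy level present on the dimension row."""
    totals = defaultdict(float)
    for product_key, amount in fact_sales:
        totals[dim_product[product_key][level]] += amount
    return dict(totals)

print(rollup("category"))
print(rollup("department"))
```

Moving up or down the hierarchy is a change of column name rather than an extra join, at the cost of repeating the department name on every product row.
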

Types of Fact Tables
- There are three types of fact tables
- Transaction
  - Tracks each transaction as it occurs.
- Periodic Snapshot
  - Captures cumulative performance over a specific period of time
  - Often used for periodic rollups
- Accumulating Snapshot
  - Updated over time
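
The three fact table types can be contrasted in a short sketch. The account transactions, the monthly period, and the order milestones (ordered, shipped, delivered) are all hypothetical examples.

```python
from datetime import date

# Transaction fact: one row per event, inserted as it occurs.
transactions = [
    {"account": "A", "date": date(2024, 1, 5), "amount": 100.0},
    {"account": "A", "date": date(2024, 1, 20), "amount": -30.0},
    {"account": "A", "date": date(2024, 2, 10), "amount": 50.0},
]

def periodic_snapshot(rows):
    """Periodic snapshot: one row per account per month with the period's net activity."""
    totals = {}
    for row in rows:
        period = (row["account"], row["date"].year, row["date"].month)
        totals[period] = totals.get(period, 0.0) + row["amount"]
    return totals

# Accumulating snapshot: one row per order, updated as milestones are reached.
order_pipeline = {"order_id": 1, "ordered": date(2024, 1, 5),
                  "shipped": None, "delivered": None}

def record_milestone(row, milestone, when):
    """Update the existing row in place instead of inserting a new one."""
    row[milestone] = when

monthly = periodic_snapshot(transactions)
record_milestone(order_pipeline, "shipped", date(2024, 1, 8))
print(monthly)
print(order_pipeline)
```

Transaction and periodic snapshot rows are only ever inserted, while the accumulating snapshot row is revisited each time the order moves to its next milestone.
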

Editor's Notes

  1. These may seem simple, but these principles are the foundation for the design methodology. For business users to be able to navigate the system, the tools and, most importantly, the data must be simple and easy to use. Consistency requires a thorough ETL process to cleanse & conform the data. Change is inevitable. We need a design that is resilient to change. Security … Must have the right data in order to support decisions; this means up-front analysis focuses on the business need.
  2. Ultimately, if any system doesn’t satisfy some business need, it is of no value and is a failure.
  3. Go through each component.
  4. Discuss each bullet point.
  5. Discuss each bullet point. Examples of cleansing activities: Misspellings; Formatting; Capitalization Conformance. Emphasize that users are forbidden from executing queries on these data. 3NF data is too complex for most users. 3NF is not optimized for query performance.
  6. Discuss each bullet point. Discuss what it means to be a conformed data mart. Point out that dimensional modeling will be discussed in detail later on.
  7. Discuss each bullet point. Specify the examples in this diagram and what role they play.
  8. Discuss why additive facts are most useful. Describe semi-additive facts. Note that the primary key is the combination of all the foreign keys. A ROWID adds little value, and the index probably would not be of any use either.
  9. Attribute descriptions should avoid cryptic abbreviations. Minimize the use of codes. Show the denormalized nature of one of these dimensions. Denormalized dimensions provide the following benefits: a simplified structure for non-technical users and better query performance. Since dimensions typically have relatively few rows, the impact of reduced storage efficiency is minimal.
  10. Walk through the SCD2 example using the dimensional model above.
  11. Walk through the date dimension in the POC example.
  12. Point out that the POCGLTransaction fact table is a transaction fact table and the budget table is a periodic snapshot.