SlideShare a Scribd company logo
1 of 20
Download to read offline
Introduction to Data
Warehouse
duantv@antsprogrammatic.com
• What is Data Warehouse (DW)?
• Purpose and Definition
• DW Architecture
• Multidimensional Model
• Schema
Discussion
What is Data Warehouse (DW)?
“Data warehousing is a collection of methods, techniques, and tools
used to support knowledge workers—senior managers, directors,
managers, and analysts—to conduct data analyses that help with
performing decision-making processes and improving information
resources.”
Purpose and Definition
• DW is a store of information organized in a unified data model
• Data collected from a number of different sources
• Finance, billing, website logs, personnel, …
• Purpose of a data warehouse (DW): support decision making
• Easy to perform advanced analysis
• Ad-hoc analysis and reports
• Data mining: discovery of hidden patterns and trends
DW
• Solution: new analysis environment (DW) where data are
Subject oriented (versus function oriented)
Integrated (logically and physically)
Time variant (data can always be related to time)
Stable (data not deleted, several versions)
• Data from the operational systems are
Extracted
Cleansed
Transformed
Aggregated
Loaded into the DW
DW Architecture
DW Architecture: Function vs. Subject Orientation
DW Architecture: Top-down vs. Bottom-up
DW Architecture
Single-layer Two-layer
Three-layer
DW Architecture: Additional Architecture
Independent data marts
Hub-and-spoke
Federated
Extraction, transformation and loading (ETL)
Multidimensional Model
• The multidimensional model
• Its only purpose: data analysis
• It is not suitable for OLTP systems
• More built in “meaning”
• What is important
• What describes the important
• What we want to optimize
• Easy for query operations
• Data is divided into:
• Facts
• Dimensions
• Facts are the important entity: a sale
• Facts have measures that can be aggregated
• Dimensions describe facts
• A sale has the dimensions Product, Store and Time
• Facts “live” in a multidimensional cube
Cubes
• Dimensions are the core of multidimensional databases
• Dimensions are used for
• Selection of data
• Grouping of data at the right level of detail
• Dimensions consist of dimension values
• Dimension values may have an ordering
• Used for comparing cube data across values
• Especially used for Time dimension
• Dimensions have hierarchies with levels
• Typically 3-5 levels (of detail)
• Dimension values are organized in a tree structure
• Dimensions have a bottom level and a top level
• Dimensions should contain much information
Dimensions
Facts
• Facts represent the subject of the desired analysis
• The ”important” in the business that should be analyzed
• A fact is identified via its dimension values
• A fact is a non-empty cell
• The fact table stores facts
• One column for each measure
• One column for each dimension (foreign key to dimension table)
• Dimensions keys make up composite primary key
Measures
• Measures represent the fact property that the users want to study
and optimize
• A measure has two components
• Numerical value: (revenue)
• Aggregation formula (SUM): used for aggregating/combining a number of
measure values into one
• Measure value determined by dimension value combination
• Measure value is meaningful for all aggregation levels
Schema
Schema: Star Schema
• One fact table
• De-normalized dimension tables
• A dimension table stores a
dimension
• Optimized for querying large data
sets to support to support OLAP
cubes / analytical applications
Schema: Snowflake Schema
• Dimensions are normalized
• One dimension table per level
Q&A

More Related Content

What's hot

Enterprise Architecture
Enterprise Architecture Enterprise Architecture
Enterprise Architecture gdavie
 
Data Preprocessing&tools
Data Preprocessing&toolsData Preprocessing&tools
Data Preprocessing&toolsAmandeep Gill
 
Data pre processing
Data pre processingData pre processing
Data pre processingpommurajopt
 
SAG_Indexing and Query Optimization
SAG_Indexing and Query OptimizationSAG_Indexing and Query Optimization
SAG_Indexing and Query OptimizationVaibhav Jain
 
Data Mining Scoring Engine development process
Data Mining Scoring Engine development processData Mining Scoring Engine development process
Data Mining Scoring Engine development processDylan Wan
 
Data Science and Data Visualization (All about Data Analysis) by Pooja Ajmera
Data Science and Data Visualization (All about Data Analysis) by Pooja AjmeraData Science and Data Visualization (All about Data Analysis) by Pooja Ajmera
Data Science and Data Visualization (All about Data Analysis) by Pooja AjmeraPooja Ajmera
 
Preprocessing
PreprocessingPreprocessing
Preprocessingmmuthuraj
 
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...Big Data Value Association
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data MiningIffat Firozy
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data MiningDHIVYADEVAKI
 
Systems Development and Documentation Techniques
Systems Development and Documentation TechniquesSystems Development and Documentation Techniques
Systems Development and Documentation TechniquesHamse abdirahmaan
 
Cognos datawarehouse
Cognos datawarehouseCognos datawarehouse
Cognos datawarehousessuser7fc7eb
 

What's hot (19)

Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Enterprise Architecture
Enterprise Architecture Enterprise Architecture
Enterprise Architecture
 
Data
DataData
Data
 
Data Preprocessing&tools
Data Preprocessing&toolsData Preprocessing&tools
Data Preprocessing&tools
 
Data pre processing
Data pre processingData pre processing
Data pre processing
 
SAG_Indexing and Query Optimization
SAG_Indexing and Query OptimizationSAG_Indexing and Query Optimization
SAG_Indexing and Query Optimization
 
Data Mining Scoring Engine development process
Data Mining Scoring Engine development processData Mining Scoring Engine development process
Data Mining Scoring Engine development process
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
Data Science and Data Visualization (All about Data Analysis) by Pooja Ajmera
Data Science and Data Visualization (All about Data Analysis) by Pooja AjmeraData Science and Data Visualization (All about Data Analysis) by Pooja Ajmera
Data Science and Data Visualization (All about Data Analysis) by Pooja Ajmera
 
Preprocessing
PreprocessingPreprocessing
Preprocessing
 
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Mining: Data processing
Data Mining: Data processingData Mining: Data processing
Data Mining: Data processing
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data Mining
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
Systems Development and Documentation Techniques
Systems Development and Documentation TechniquesSystems Development and Documentation Techniques
Systems Development and Documentation Techniques
 
Data PreProcessing
Data PreProcessingData PreProcessing
Data PreProcessing
 
Cognos datawarehouse
Cognos datawarehouseCognos datawarehouse
Cognos datawarehouse
 

Similar to Introduction to Data Warehouse

Data warehouse 17 dimensional data model
Data warehouse 17 dimensional data modelData warehouse 17 dimensional data model
Data warehouse 17 dimensional data modelVaibhav Khanna
 
Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!
Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!
Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!Caserta
 
Data warehouse introduction
Data warehouse introductionData warehouse introduction
Data warehouse introductionMurli Jha
 
Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptRafiulHasan19
 
Business Intelligence and Multidimensional Database
Business Intelligence and Multidimensional DatabaseBusiness Intelligence and Multidimensional Database
Business Intelligence and Multidimensional DatabaseRussel Chowdhury
 
Data warehouse 18 logical dimensional model for data warehouse
Data warehouse 18 logical dimensional model for data warehouseData warehouse 18 logical dimensional model for data warehouse
Data warehouse 18 logical dimensional model for data warehouseVaibhav Khanna
 
Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015
Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015
Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015Terry Bunio
 
Asper database presentation - Data Modeling Topics
Asper database presentation - Data Modeling TopicsAsper database presentation - Data Modeling Topics
Asper database presentation - Data Modeling TopicsTerry Bunio
 
An introduction to data warehousing
An introduction to data warehousingAn introduction to data warehousing
An introduction to data warehousingShahed Khalili
 
Business Intelligence Data Warehouse System
Business Intelligence Data Warehouse SystemBusiness Intelligence Data Warehouse System
Business Intelligence Data Warehouse SystemKiran kumar
 
Day 1 (Lecture 1): Data Management- The Foundation of all Analytics
Day 1 (Lecture 1): Data Management- The Foundation of all AnalyticsDay 1 (Lecture 1): Data Management- The Foundation of all Analytics
Day 1 (Lecture 1): Data Management- The Foundation of all AnalyticsAseda Owusua Addai-Deseh
 

Similar to Introduction to Data Warehouse (20)

Data warehouse 17 dimensional data model
Data warehouse 17 dimensional data modelData warehouse 17 dimensional data model
Data warehouse 17 dimensional data model
 
DW (1).ppt
DW (1).pptDW (1).ppt
DW (1).ppt
 
Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!
Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!
Big Data Warehousing Meetup: Dimensional Modeling Still Matters!!!
 
Data warehouse introduction
Data warehouse introductionData warehouse introduction
Data warehouse introduction
 
Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.ppt
 
RowanDay4.pptx
RowanDay4.pptxRowanDay4.pptx
RowanDay4.pptx
 
Business Intelligence and Multidimensional Database
Business Intelligence and Multidimensional DatabaseBusiness Intelligence and Multidimensional Database
Business Intelligence and Multidimensional Database
 
Data warehouse 18 logical dimensional model for data warehouse
Data warehouse 18 logical dimensional model for data warehouseData warehouse 18 logical dimensional model for data warehouse
Data warehouse 18 logical dimensional model for data warehouse
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
kalyani.ppt
kalyani.pptkalyani.ppt
kalyani.ppt
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
kalyani.ppt
kalyani.pptkalyani.ppt
kalyani.ppt
 
Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015
Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015
Dimensional modeling primer - SQL Saturday Madison - April 11th, 2015
 
Asper database presentation - Data Modeling Topics
Asper database presentation - Data Modeling TopicsAsper database presentation - Data Modeling Topics
Asper database presentation - Data Modeling Topics
 
Datawarehouse
DatawarehouseDatawarehouse
Datawarehouse
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
data mining and data warehousing
data mining and data warehousingdata mining and data warehousing
data mining and data warehousing
 
An introduction to data warehousing
An introduction to data warehousingAn introduction to data warehousing
An introduction to data warehousing
 
Business Intelligence Data Warehouse System
Business Intelligence Data Warehouse SystemBusiness Intelligence Data Warehouse System
Business Intelligence Data Warehouse System
 
Day 1 (Lecture 1): Data Management- The Foundation of all Analytics
Day 1 (Lecture 1): Data Management- The Foundation of all AnalyticsDay 1 (Lecture 1): Data Management- The Foundation of all Analytics
Day 1 (Lecture 1): Data Management- The Foundation of all Analytics
 

Recently uploaded

April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 

Recently uploaded (20)

April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 

Introduction to Data Warehouse

  • 2. • What is Data Warehouse (DW)? • Purpose and Definition • DW Architecture • Multidimensional Model • Schema Discussion
  • 3. What is Data Warehouse (DW)? “Data warehousing is a collection of methods, techniques, and tools used to support knowledge workers—senior managers, directors, managers, and analysts—to conduct data analyses that help with performing decision-making processes and improving information resources.”
  • 4. Purpose and Definition • DW is a store of information organized in a unified data model • Data collected from a number of different sources • Finance, billing, website logs, personnel, … • Purpose of a data warehouse (DW): support decision making • Easy to perform advanced analysis • Ad-hoc analysis and reports • Data mining: discovery of hidden patterns and trends
  • 5. DW • Solution: new analysis environment (DW) where data are Subject oriented (versus function oriented) Integrated (logically and physically) Time variant (data can always be related to time) Stable (data not deleted, several versions) • Data from the operational systems are Extracted Cleansed Transformed Aggregated Loaded into the DW
  • 7. DW Architecture: Function vs. Subject Orientation
  • 10. DW Architecture: Additional Architecture Independent data marts Hub-and-spoke Federated
  • 12. Multidimensional Model • The multidimensional model • Its only purpose: data analysis • It is not suitable for OLTP systems • More built in “meaning” • What is important • What describes the important • What we want to optimize • Easy for query operations • Data is divided into: • Facts • Dimensions • Facts are the important entity: a sale • Facts have measures that can be aggregated • Dimensions describe facts • A sale has the dimensions Product, Store and Time • Facts “live” in a multidimensional cube
  • 13. Cubes
  • 14. • Dimensions are the core of multidimensional databases • Dimensions are used for • Selection of data • Grouping of data at the right level of detail • Dimensions consist of dimension values • Dimension values may have an ordering • Used for comparing cube data across values • Especially used for Time dimension • Dimensions have hierarchies with levels • Typically 3-5 levels (of detail) • Dimension values are organized in a tree structure • Dimensions have a bottom level and a top level • Dimensions should contain much information Dimensions
  • 15. Facts • Facts represent the subject of the desired analysis • The ”important” in the business that should be analyzed • A fact is identified via its dimension values • A fact is a non-empty cell • The fact table stores facts • One column for each measure • One column for each dimension (foreign key to dimension table) • Dimensions keys make up composite primary key
  • 16. Measures • Measures represent the fact property that the users want to study and optimize • A measure has two components • Numerical value: (revenue) • Aggregation formula (SUM): used for aggregating/combining a number of measure values into one • Measure value determined by dimension value combination • Measure value is meaningful for all aggregation levels
  • 18. Schema: Star Schema • One fact table • De-normalized dimension tables • A dimension table stores a dimension • Optimized for querying large data sets to support to support OLAP cubes / analytical applications
  • 19. Schema: Snowflake Schema • Dimensions are normalized • One dimension table per level
  • 20. Q&A