Data Warehousing
Building a database to
support the decision
making activities of a
department or business
unit
Data Warehouse
A read-only database for decision analysis
- Subject oriented
- Integrated
- Time variant
- Nonvolatile
Consists of time-stamped operational
and external data.
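The "time variant" and "nonvolatile" properties can be pictured with a small sketch. It assumes a hypothetical `SnapshotStore` class (not from the deck) in which rows are only ever appended with a timestamp, so any past state can still be queried.

```python
from datetime import date

class SnapshotStore:
    """Append-only history: rows are added with a timestamp, never updated."""

    def __init__(self):
        self._rows = []  # each entry is (as_of, key, value)

    def load(self, as_of, key, value):
        # Nonvolatile: new facts are appended, old ones are left in place.
        self._rows.append((as_of, key, value))

    def value_as_of(self, key, when):
        """Return the latest value recorded for `key` on or before `when`."""
        candidates = [(d, v) for d, k, v in self._rows if k == key and d <= when]
        if not candidates:
            return None
        return max(candidates, key=lambda c: c[0])[1]
```

Because nothing is overwritten, a question such as "what was the price in March?" stays answerable after later loads.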
Data Warehouse Purpose
Identify problems in time to avoid them
Locate opportunities you might otherwise
miss
Data Warehouse:
New Approach
An old idea with a new interest because
of:
Cheap Computing Power
Special Purpose Hardware
New Data Structures
Intelligent Software
Warehousing Problems
Business Issues
Data Quantity
Data Accuracy
Maintenance
Ownership
Cost
Warehousing Problems
Business Issues
Database Issues
DBMS Software
Technology
Complexity
Warehousing Problems
Business Issues
Data Issues
Analysis Issues
User Interface
Intelligent Processing
EUC Data Architecture
Enterprise Data Warehouse
Data Marts
Business Packages
Needed
Environment
Understand DSS technology
Understand relational databases
Know company data resources
Possess transformation and integration tools
Possess DBA skills
Install network and telecommunication
hardware and software
Possess front end software
Possess meta-data navigation tools
Two Approaches
Classical Enterprise Database
Typically contains operational data that
integrates information from all areas of
the organization.
Data Mart
Extracted and managerial support data
designed for departmental or EUC
applications
Data Warehouse vs
Operational Databases
Operational databases:
Highly tuned
Real-time data
Detailed records
Current values
Accesses small amounts of data in a predictable manner
Data warehouse:
Flexible access
Consistent timing
Summarized as appropriate
Historical
Accesses large amounts of data in unexpected ways
Separate Read Only
Database
Operational and decision support
processing are fundamentally different.
Flexible access
Consistent timing
Historical data
Environmental references
Common reference
Front End Access
Tailored access programs in user form,
usually client-server
General purpose GUI products (e.g.
Access, PowerBuilder)
Custom access routines
Development Cycles
Enterprise: traditional SDLC, requirements driven
Warehouse: iterative development, data driven
Data Warehouse
A database for departmental decision
support. Contains detail data brought in
through a feed and clean process.
Data extracted from source files
Data is integrated and transformed
Data resides in a read-only database
User access via a front end tool or
application
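The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source rows, the `sales` table, and the function names are hypothetical, and SQLite's `PRAGMA query_only` stands in for a read-only warehouse.

```python
import sqlite3

# Hypothetical source extract with inconsistent formatting.
SOURCE_ROWS = [
    {"region": " east ", "amount": "100.50", "sold_on": "1995-07-01"},
    {"region": "WEST", "amount": "75", "sold_on": "1995-07-02"},
]

def transform(row):
    """Integrate and transform: normalize text case and numeric types."""
    return (row["region"].strip().upper(), float(row["amount"]), row["sold_on"])

def load(conn, rows):
    """Feed the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL, sold_on TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def build_warehouse():
    conn = sqlite3.connect(":memory:")
    load(conn, [transform(r) for r in SOURCE_ROWS])
    # After loading, block further writes on this connection.
    conn.execute("PRAGMA query_only = ON")
    return conn

def total_by_region(conn):
    """Front-end access: a read-only query against the loaded data."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
```

Calling `total_by_region(build_warehouse())` returns one total per region; an attempted insert on the same connection is rejected by SQLite.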
Data Warehouse
Design Basics
Must provide flexible access, “what if”
processing, and extensive reporting on
data subsets.
Can tables be scanned in a reasonable
time?
Can indexes be added as needed?
If not, consider partitioning or summary
tables.
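As one way to picture the summary-table option, the sketch below pre-aggregates a detail table so that reports read a few summary rows instead of scanning every detail record. The table and column names (`sales_detail`, `sales_monthly`) are invented for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

def build(conn):
    conn.execute("CREATE TABLE sales_detail (sold_on TEXT, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales_detail VALUES (?, ?, ?)",
        [
            ("1995-06-03", "EAST", 10.0),
            ("1995-06-17", "EAST", 15.0),
            ("1995-07-01", "EAST", 20.0),
            ("1995-07-09", "WEST", 5.0),
        ],
    )
    # Summary table: one row per month and region.
    conn.execute(
        """
        CREATE TABLE sales_monthly AS
        SELECT substr(sold_on, 1, 7) AS month, region,
               SUM(amount) AS total, COUNT(*) AS n_rows
        FROM sales_detail
        GROUP BY substr(sold_on, 1, 7), region
        """
    )
    # An index added as needed for the common lookup.
    conn.execute("CREATE INDEX idx_monthly ON sales_monthly (month, region)")

def monthly_total(conn, month, region):
    """Answer a report query from the summary table only."""
    row = conn.execute(
        "SELECT total FROM sales_monthly WHERE month = ? AND region = ?",
        (month, region),
    ).fetchone()
    return row[0] if row else 0.0
```

The trade-off is that the summary must be rebuilt or refreshed whenever new detail is loaded.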
Data Warehouse
Design Basics
Iterative Development with only part of
the warehouse built at one time
Used and revised with frequent user
feedback
1. Implement; test bias &
completeness
2. Create and execute applications
3. Identify requirements
4. Iterate
Data Development Roles
Data Analyst:
Works with user groups to understand
business and technical requirements.
Data Architect:
Manages company data. Must understand
business needs as well as data
implications.
Analyst-Architect
Communication
Requirements
Formal lines of communication
Data warehouse council that meets monthly
Analyst and architect retreat
Formally identify analysts and architects
Ensure comparable levels of personnel
Both understand business requirements
Architect goals available to analysts
Analyst strategy and issues available to
architects
Data Warehouse
Design Issues
Small systems are easier
to manage than large ones.
Adding attributes is easier than changing
or deleting them.
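A small sketch of why additive changes are the easy case, assuming a hypothetical `customer` table in SQLite: adding a column leaves existing queries and rows untouched, whereas renaming or dropping one would force every dependent query to change.

```python
import sqlite3

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
    before = conn.execute("SELECT id, name FROM customer").fetchall()
    # Additive change: existing rows get NULL for the new attribute.
    conn.execute("ALTER TABLE customer ADD COLUMN segment TEXT")
    after = conn.execute("SELECT id, name FROM customer").fetchall()
    new_col = conn.execute("SELECT segment FROM customer").fetchall()
    return before, after, new_col
```

The query written before the change returns the same rows afterwards, and the new attribute is simply empty until it is populated.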
Data Warehouse Design:
Proposed Changes
Will the change disrupt the current system?
Will the change add significantly more detail?
Will the change disregard existing data?
Will the change add a new table or change an
existing one?
What granularity?
Does it fit the current data model?
How much programming is required?
How many resources will it consume?
Data Warehouses
Extracted Data
Clean and feed.
Internal data
Summarized
Historical
Accuracy
Integration
External data
Generated data
Data Warehouse
Extraction Issues
Internal data often comes from separate
operational databases.
Reconcile formats
Remove intelligent codes
Verify accuracy
Unify time stamps
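These clean-up tasks might look like the following in Python. The sketch takes "intelligent code" to mean a key that packs several facts into one string; the `REGION-YEAR-SERIAL` layout, the date formats, and the field names are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical date layouts used by different source systems.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")

def unify_timestamp(text):
    """Parse any known layout and return an ISO 8601 date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

def split_intelligent_code(code):
    """Break a hypothetical 'REGION-YEAR-SERIAL' code into plain attributes."""
    region, year, serial = code.split("-")
    return {"region": region, "year": int(year), "serial": int(serial)}

def clean(record):
    """Reconcile one source record; reject it if a basic accuracy check fails."""
    amount = float(record["amount"])
    if amount < 0:
        raise ValueError("negative amount")
    out = split_intelligent_code(record["code"])
    out["amount"] = amount
    out["sold_on"] = unify_timestamp(record["sold_on"])
    return out
```

Storing the decoded attributes as separate columns means later queries can filter on region or year without knowing how the original code was built.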
OLAP and MDBMS
Online Analytic Processing
Primarily relational models from which targeted
extractions can be developed
Complex for users
Multidimensional DBMS
Pre-modeled data cubes for efficient access
Complex for design
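To make "pre-modeled data cube" concrete, the sketch below precomputes totals for every combination of two dimensions, including roll-ups, so that a query becomes a dictionary lookup. The fact rows, the dimension names, and the `ALL` marker are illustrative assumptions; a real multidimensional DBMS stores and indexes cubes very differently.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # marker meaning "rolled up across this dimension"

# Hypothetical facts: (region, quarter, amount).
FACTS = [
    ("EAST", "1995-Q2", 25.0),
    ("EAST", "1995-Q3", 20.0),
    ("WEST", "1995-Q3", 5.0),
]

def build_cube(facts):
    """Precompute totals for each cell and each roll-up of the two dimensions."""
    cube = defaultdict(float)
    for region, quarter, amount in facts:
        # Each fact contributes to its own cell and to every roll-up above it.
        for r, q in product((region, ALL), (quarter, ALL)):
            cube[(r, q)] += amount
    return dict(cube)
```

The design work goes into choosing dimensions up front, which matches the slide's point: access is efficient, but the modeling is the hard part.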


Building a Data Warehouse for Decision Support
