DATA WAREHOUSE
Data Warehouse 
• Pool of data to support decision making. 
• Structured to be available in a ready-to-use form 
• Subject Oriented 
• Integrated 
• Time-variant 
• Nonvolatile 
• Additional characteristics: 
1. Web based 
2. Relational/multidimensional 
3. Client/server 
4. Real time 
5. Includes metadata
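As a minimal sketch of what "time-variant" and "nonvolatile" can mean in practice: every row carries a time stamp, and rows are only ever appended, never updated or deleted. The names `PriceSnapshot` and `SnapshotStore` and the sample prices are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row cannot be modified in place
class PriceSnapshot:
    product: str   # the subject the row describes
    price: float
    as_of: date    # time stamp that makes the row time-variant


class SnapshotStore:
    """Append-only store: rows are added, never updated or deleted."""

    def __init__(self) -> None:
        self._rows: list[PriceSnapshot] = []

    def load(self, row: PriceSnapshot) -> None:
        self._rows.append(row)

    def history(self, product: str) -> list[PriceSnapshot]:
        # Return every stored version of the product, oldest first.
        matches = (r for r in self._rows if r.product == product)
        return sorted(matches, key=lambda r: r.as_of)


store = SnapshotStore()
store.load(PriceSnapshot("widget", 9.99, date(2024, 1, 1)))
store.load(PriceSnapshot("widget", 10.49, date(2024, 2, 1)))
history = store.history("widget")
```

A price change arrives as a new row rather than an overwrite, so `history` keeps both versions and trends over time remain queryable.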
Types of Data Warehouse 
Data Mart 
• Dependent 
  – Created from warehouse 
  – Replicated 
    • Functional subset of warehouse 
• Independent 
  – Scaled down, less expensive version of data warehouse 
  – Designed for a department or strategic business unit (SBU) 
  – Organization may have multiple data marts 
    • Difficult to integrate
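A dependent data mart can be pictured as a functional subset copied out of the warehouse. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table names `warehouse_orders` and `sales_mart` and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical warehouse table holding orders for every department.
CREATE TABLE warehouse_orders (order_id INTEGER, department TEXT, amount REAL);
INSERT INTO warehouse_orders VALUES
    (1, 'Sales', 100.0), (2, 'HR', 40.0), (3, 'Sales', 60.0);

-- Dependent data mart: a replicated subset built from the warehouse.
CREATE TABLE sales_mart AS
    SELECT order_id, amount FROM warehouse_orders WHERE department = 'Sales';
""")

mart_rows = conn.execute(
    "SELECT order_id, amount FROM sales_mart ORDER BY order_id"
).fetchall()
```

Because the mart is derived from the warehouse, its definitions stay consistent with it; an independent mart built straight from source systems lacks that shared base, which is one reason multiple independent marts are difficult to integrate.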
• Operational Data Stores (ODS): Provide a fairly 
recent form of customer information file (CIF) 
• Enterprise Data Warehouses (EDW): Used across the 
enterprise for decision support 
• Metadata: Describe the structure and 
meaning of data, contributing to their 
effective use.
Data warehousing process overview 
Major components 
• Data sources 
• Data extraction 
• Data loading 
• Comprehensive database 
• Metadata 
• Middleware tools
Data Warehousing Architectures 
• May have one or more tiers 
– Determined by warehouse, data acquisition (back 
end), and client (front end) 
• One tier, where all components run on the same platform, is rare 
• Two tier usually combines the DSS engine (client) with the 
warehouse 
– More economical 
• Three tier separates these functional parts
Architecture considerations 
• Which DBMS to use? 
• Parallel processing 
• Partitioning 
• Which data migration tools should be used? 
• Which tools support data retrieval and analysis?
Alternative Architectures for data 
warehousing
Architecture Selection Factors 
• Information interdependence 
• Senior management's information needs 
• Urgency for a DW 
• Nature of end user tasks 
• Constraints on resources 
• Strategic view 
• Compatibility with existing systems 
• Ability of in-house IT staff 
• Technical and Political factors
Enterprise Data Warehouse
Data Integration, Extraction, and Load 
Process 
1. Data Integration 
Comprises three major processes: 
• Data access: The ability to access and extract data 
from any data source 
• Data federation: Integration of business views 
across multiple data stores 
• Change capture: Based on the identification, 
capture, and delivery of the changes made to 
enterprise data sources.
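One simple way to picture change capture is to compare two snapshots of a source table and report what was inserted, updated, or deleted. This is only a sketch under that assumption; production systems often read database logs or triggers instead. The function name `capture_changes` and the sample records are hypothetical.

```python
def capture_changes(previous: dict, current: dict) -> dict:
    """Identify the changes between two snapshots keyed by record ID."""
    inserted = {k: current[k] for k in current.keys() - previous.keys()}
    deleted = {k: previous[k] for k in previous.keys() - current.keys()}
    updated = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical customer snapshots taken at two points in time.
yesterday = {1: "Alice, Boston", 2: "Bob, Denver", 3: "Carol, Miami"}
today = {1: "Alice, Boston", 2: "Bob, Austin", 4: "Dan, Seattle"}

changes = capture_changes(yesterday, today)
```

Only the three changed records would need to be delivered downstream, rather than the full source table.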
2. Extraction, Transformation, and Load (ETL) 
• An integral component of any data-centric 
project. 
• ETL consists of: 
Extraction: Reading data from all relevant sources 
Transformation: Converting the extracted data into a 
form that can be placed in the data warehouse or 
another database 
Load: Inserting the data into the data warehouse.
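The three ETL steps can be sketched as three small functions. This is an illustrative outline, not a real ETL tool: the source rows, the `customer` table, and the cleansing rules are all assumptions, and the target is an in-memory SQLite database from Python's standard library.

```python
import sqlite3

# Hypothetical raw rows from two source systems (all values arrive as text).
crm_rows = [{"id": "1", "name": " Alice ", "spend": "120.50"}]
legacy_rows = [
    {"id": "2", "name": "BOB", "spend": "80"},
    {"id": "3", "name": "", "spend": "n/a"},  # fails cleansing
]


def extract():
    """Extract: gather raw records from every relevant source."""
    return crm_rows + legacy_rows


def transform(rows):
    """Transform: convert types, normalize names, drop rows that fail cleansing."""
    clean = []
    for row in rows:
        name = row["name"].strip().title()
        try:
            spend = float(row["spend"])
        except ValueError:
            continue  # unparseable amount: skip the row
        if name:
            clean.append((int(row["id"]), name, spend))
    return clean


def load(conn, rows):
    """Load: insert the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, spend REAL)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
loaded = conn.execute("SELECT id, name, spend FROM customer ORDER BY id").fetchall()
```

Keeping the steps as separate functions mirrors the separation in the list above: each stage can be tested or replaced without touching the others.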
ETL Process 
[Figure: ETL process. Sources: packaged application, legacy system, other internal applications. Steps: Extract, Transform, Cleanse, Load, with a transient data source. Targets: data warehouse, data mart.]
Benefits of Data Warehouse 
• Allows extensive analysis in numerous ways. 
• A consolidated view of corporate data. 
• Better and more timely information. 
• Enhances system performance. 
• Simplifies data access. 
• Enhances business knowledge, enhances 
customer service and satisfaction, and facilitates 
decision making.
Assignment 
• Data warehousing vendors? 
• Data warehousing case study found on the 
internet.
Data Warehouse Development 
Approaches 
The Inmon Model: The EDW Approach 
• Emphasizes top-down development 
• Employs established database development 
methodologies and tools 
The Kimball Model: The Data Mart Approach 
• Plan big, build small 
• Subject oriented or department oriented 
• Focus on the requests of a specific department.
Data Warehouse Structure 
(The Star Schema)
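A star schema places one central fact table of measures at the middle, with foreign keys pointing out to surrounding dimension tables. The sketch below builds a tiny one in an in-memory SQLite database; the table names (`fact_sales`, `dim_date`, `dim_product`), columns, and sample rows are hypothetical and not taken from the original slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes used to slice the facts.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Fact table: numeric measures plus one foreign key per dimension.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    units INTEGER,
    revenue REAL
);

INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO fact_sales VALUES
    (20240101, 1, 3, 30.0), (20240101, 2, 1, 15.0), (20240201, 1, 2, 20.0);
""")

# A typical star-schema query: join the fact table to a dimension and aggregate.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

Analytical queries follow the same pattern for any dimension: join the fact table to the dimension of interest, group by one of its attributes, and aggregate a measure.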
Successful Implementation of a Data 
Warehouse 
• Establishment of service-level agreements and data-refresh 
requirements. 
• Identification of data sources and their governance 
policies. 
• Data quality planning & model designing. 
• ETL tool selection. 
• Relational database software and platform selection. 
• Data transport and data conversion. 
• Reconciliation process 
• End-user support
Issues in Implementation of a Data 
Warehouse 
• Starting with the wrong sponsorship chain. 
• Setting expectations that you cannot meet and 
frustrating executives at the moment of truth. 
• Engaging in politically naive behavior. 
• Loading the warehouse with information just 
because it is available. 
• Believing that data warehouse database design 
is the same as transactional database design. 
• Choosing a data warehouse manager who is 
technology oriented rather than user oriented. 
• Focusing on traditional internal record-oriented 
data and ignoring the value of external data and of 
text, images, and perhaps sound and video. 
• Delivering data with overlapping and confusing 
definitions. 
• Believing promises of performance, capacity, and 
scalability. 
• Believing that your problems are over when the 
data warehouse is up and running.
Risks in Data Warehouse Projects 
• No mission or objective 
• Quality of source data unknown 
• Skills not in place 
• Inadequate budget 
• Lack of supporting software 
• Source data not understood 
• Weak sponsor 
• Users not computer literate 
• Geographically distributed environment 
• Unrealistic user expectations 
• Architectural and design risks 
• Scope creep and changing requirements 
• Vendors out of control 
• Multiple platforms 
• Key people leaving project 
• Loss of the sponsor 
• Too much new technology 
• Having to fix an operational system 
• Team geography and language culture
Massive Data Warehouses and 
Scalability 
• A data warehouse needs to be scalable. 
• Good scalability means that queries and other 
data access functions grow (ideally) linearly with the 
size of the warehouse. 
• Specialized methods have been developed to 
create scalable data warehouses. 
• Scalability is difficult when managing hundreds of 
terabytes.
Issues Pertaining to Scalability 
• The amount of data in the warehouse. 
• How quickly the warehouse is expected to 
grow. 
• The number of concurrent users. 
• The complexity of user queries.
Real-Time Data Warehousing 
• Also known as active data warehousing. 
• The process of loading and providing data via the 
data warehouse as they become available. 
• Evolved from the enterprise data warehouse (EDW) 
concept. 
• Puts information-based decision making at 
users' fingertips. 
• Positively affects almost all aspects of customer 
service, supply chain management (SCM), and logistics.
Comparison between Traditional and 
Active Data Warehousing Environments 
Traditional Data Warehouse Environment 
• Strategic decisions only 
• Results sometimes hard to measure 
• Moderate user concurrency 
• Highly restrictive reporting used to confirm or check existing processes and patterns 
• Power users, knowledge workers, internal users 
Active Data Warehouse Environment 
• Strategic and tactical decisions 
• Results measured with operations 
• High number of users accessing simultaneously 
• Flexible ad hoc reporting, as well as machine-assisted modeling to discover new hypotheses 
• Operational staff, call centers, external users
Data Warehouse Administration 
• Because of its huge size, a data warehouse requires 
strong monitoring. 
• A data warehouse administrator (DWA) should 
possess the following: 
1. Should be familiar with high-performance software, 
hardware, and networking technologies. 
2. Should be familiar with the decision-making process. 
3. Should maintain the existing requirements and 
capabilities of the data warehouse. 
4. Must possess excellent communication skills.
Data Warehouse Security Issues 
• Security and privacy of information are a significant 
concern. 
• Companies must create effective and flexible 
security procedures. 
• Effective security in a data warehouse focuses on: 
1. Establishing effective corporate and security policies and 
procedures. 
2. Implementing logical security procedures and techniques to 
restrict access. 
3. Limiting physical access to the data center environment. 
4. Establishing an effective internal control review process with 
an emphasis on security and privacy.


Data warehouseold

  • 2. Data Warehouse • Pool of data to support decision making. • Structured to be available in ready to use form • Subject Oriented • Integrated • Time-variant • Nonvolatile • Additional characteristics like 1.Web based 2.Relational/multidimensional 3.Client/Server 4.Real time 5.Include metadata
  • 3. Types of Data warehouse DATA Mart • Dependent – Created from warehouse – Replicated • Functional subset of warehouse • Independent – Scaled down, less expensive version of data warehouse – Designed for a department or SBU – Organization may have multiple data marts • Difficult to integrate
  • 4. • Operational DATA Stores: Provide a fairly recent form of customer information file (CIF) • Enterprise DATA Warehouses: Used across the enterprise for decision support • METADATA: Describes the structure and meaning of data, contributing to their effective use.
  • 5. Data warehousing process overview Major components • Data sources • Data extraction • Data loading • Comprehensive database • Metadata • Middleware tools
  • 6.
  • 7. Data Warehousing Architectures • May have one or more tiers – Determined by warehouse, data acquisition (back end), and client (front end) • One tier, where all run on same platform, is rare • Two tier usually combines DSS engine (client) with warehouse – More economical • Three tier separates these functional parts
  • 8.
  • 9. Architecture considerations • Which DBMS to use? • Parallel processing • Partitioning • Which data migration tools should be used? • What tools for data retrieval and analysis?
  • 10. Alternative Architectures for data warehousing
  • 11. Architecture Selection Factors • Information interdependence • Senior management Info needs • Urgency for a DW • Nature of end user tasks • Constraints on resources • Strategic view • Compatibility with existing systems • Ability of in-house IT staff • Technical and Political factors
  • 13. Data Integration, Extraction And Load process 1. DATA INTEGRATION Comprises three major processes: • Data access: the ability to access and extract data from any data source • Data federation: integration of business views across multiple data stores • Change capture: based on the identification, capture, and delivery of the changes made to enterprise data sources.
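The change capture idea on slide 13 can be illustrated with a minimal sketch. It assumes change capture is done by comparing two snapshots of a source table keyed by primary key; the slides do not name a technique, and production tools often read the database transaction log instead. The function name `capture_changes` and the sample records are hypothetical.

```python
def capture_changes(previous, current):
    """Compare two snapshots (dicts keyed by primary key) and return the delta."""
    # Rows present now but not before were inserted at the source.
    inserts = {k: v for k, v in current.items() if k not in previous}
    # Rows present before but not now were deleted at the source.
    deletes = {k: v for k, v in previous.items() if k not in current}
    # Rows present in both snapshots with different values were updated.
    updates = {k: v for k, v in current.items()
               if k in previous and previous[k] != v}
    return {"insert": inserts, "update": updates, "delete": deletes}


# Hypothetical customer snapshots taken at two points in time.
previous = {1: {"name": "Ann", "city": "Pune"},
            2: {"name": "Raj", "city": "Delhi"}}
current = {1: {"name": "Ann", "city": "Mumbai"},
           3: {"name": "Lee", "city": "Goa"}}

changes = capture_changes(previous, current)
```

Only the delta in `changes` would then need to be delivered to the warehouse, rather than a full reload of the source.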
  • 14. 2. Extraction, Transformation And Load (ETL) • An integral component of any data-centric project. • ETL consists of: Extraction: reading data from all relevant sources. Transformation: converting the extracted data into a form that can be placed in a data warehouse or another database. Load: inserting the data into the data warehouse.
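The three ETL steps on slide 14 can be sketched in a few lines. This is an illustrative example under stated assumptions, not a description of any particular ETL tool: the source is an in-memory list standing in for a real system, the warehouse is an in-memory SQLite database, and the table name `sales` and the cleansing rules are hypothetical.

```python
import sqlite3


def extract(rows):
    # Extraction: read records from a source (here, a list standing in for one).
    return list(rows)


def transform(rows):
    # Transformation: cleanse and convert records into the target form.
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:
            continue  # drop records with no customer name
        cleaned.append((name, round(float(row["amount"]), 2)))
    return cleaned


def load(conn, records):
    # Load: insert the transformed records into the warehouse table.
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()


# Hypothetical source data with inconsistent formatting.
source_rows = [
    {"customer": "  alice smith ", "amount": "120.50"},
    {"customer": "", "amount": "10"},
    {"customer": "BOB JONES", "amount": "75"},
]

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(source_rows)))
loaded = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
```

Keeping the three steps as separate functions mirrors the slide's breakdown and lets each stage be tested or replaced on its own.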
  • 15. ETL Process [Figure: data flows from the sources (packaged application, legacy system, other internal applications) through Extract, Transform, Cleanse, and Load into the data warehouse and data mart; a transient data source is also shown.]
  • 16. Benefits of Data Warehouse • Allows extensive analysis in numerous ways. • A consolidated view of corporate data. • Better and more timely information. • Enhance system performance. • Simplification of data access. • Enhance business knowledge, enhance customer service and satisfaction, facilitate decision making.
  • 17. Assignment • Data warehousing vendors? • Data warehousing case study found on the internet.
  • 18. Data Warehouse development Approaches The Inmon Model: The EDW Approach • Emphasizes top-down development • Employs established database development methodologies and tools The Kimball Model: The Data Mart Approach • Plan big, build small • Subject oriented or department oriented • Focuses on the requests of a specific department.
  • 19. Data Warehouse Structure (The Star Schema)
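Since the star schema diagram from slide 19 did not survive in the transcript, a small sketch may help. A star schema places one central fact table of measures at the center and links it by keys to surrounding dimension tables. The tables `fact_sales`, `dim_date`, and `dim_product` and all sample values below are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes.
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- The fact table holds measures plus a foreign key to each dimension.
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    units INTEGER,
    revenue REAL
);

INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_product VALUES (10, 'Laptop', 'Hardware'), (11, 'Antivirus', 'Software');
INSERT INTO fact_sales VALUES (1, 10, 2, 2000.0), (1, 11, 5, 250.0), (2, 10, 1, 1000.0);
""")

# A typical analytical query: join the fact table to a dimension and aggregate.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

Each analytical question becomes a join from the fact table to one or more dimensions followed by an aggregation, which is the access pattern the star layout is designed around.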
  • 20. Successful Implementation of Data warehouse • Establishment of service-level agreements and data-refresh requirements. • Identification of data sources and their governance policies. • Data quality planning & model designing. • ETL tool selection. • Relational database software and platform selection. • Data transport and data conversion. • Reconciliation process • End-user support
  • 21. Issues in implementation of data warehouse • Starting with the wrong sponsorship chain. • Setting expectations that you cannot meet and frustrating executives at the moment of truth. • Engaging in politically naive behavior. • Loading the warehouse with information just because it is available. • Believing that data warehousing database design is the same as transactional database design. (Continued on the next slide.)
  • 22. • Choosing a data warehouse manager who is technology oriented rather than user oriented. • Focusing on traditional internal record-oriented data and ignoring the value of external data and of text, images, and perhaps sound and video. • Delivering data with overlapping and confusing definitions. • Believing promises of performance, capacity, and scalability. • Believing that your problems are over when the data warehouse is up and running.
  • 23. Risks in Data Warehouse Projects • No mission or objective • Quality of source data unknown • Skills not in place • Inadequate budget • Lack of supporting software • Source data not understood • Weak sponsor • Users not computer literate • Geographically distributed environment • Unrealistic user expectations • Architectural and design risks • Scope creep and changing requirements • Vendors out of control • Multiple platforms • Key people leaving project • Loss of the sponsor • Too much new technology • Having to fix an operational system • Team geography and language culture
  • 24. Massive Data Warehouse And Scalability • A data warehouse needs to be scalable. • Good scalability means that queries and other data access functions grow, ideally linearly, with the size of the warehouse. • Specialized methods have been developed to create scalable data warehouses. • Scalability is difficult when managing hundreds of terabytes.
  • 25. Issues pertaining to scalability • The amount of data in warehouse. • How quickly the warehouse is expected to grow. • The number of concurrent users. • The complexity of user queries.
  • 26. Real-Time Data warehousing • Also known as active data warehousing. • The process of loading and providing data via the data warehouse as they become available. • Evolved from the EDW (Enterprise Data Warehousing) concept. • Puts information-based decision making at users' fingertips. • Positively affects almost all aspects of customer service, SCM, and logistics.
  • 27. Comparison between Traditional And Active Data Warehousing Environment. Traditional Data Warehouse Environment: • Strategic decisions only • Results sometimes hard to measure • Moderate user concurrency • Highly restrictive reporting used to confirm or check existing processes and patterns • Power users, knowledge workers, internal users. Active Data Warehouse Environment: • Strategic and tactical decisions • Results measured with operations • High number of users accessing simultaneously • Flexible ad hoc reporting, as well as machine-assisted modeling to discover new hypotheses • Operational staff, call centers, external users.
  • 28. Data Warehouse Administration • Because of its huge size, a data warehouse requires strong monitoring. • A data warehouse administrator (DWA) should have the following qualities: 1. Should be familiar with high-performance software, hardware, and networking technologies. 2. Should be familiar with the decision-making process. 3. Should maintain the existing requirements and capabilities of the data warehouse. 4. Must possess excellent communication skills.
  • 29. Data Warehouse Security issues • Security and privacy of information is a significant concern. • Companies must create effective and flexible security procedures. • Effective security in a data warehouse focuses on: 1. Establishing effective corporate and security policies and procedures. 2. Implementing logical security procedures and techniques to restrict access. 3. Limiting physical access to the data center environment. 4. Establishing an effective internal control review process with an emphasis on security and privacy.
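As one illustration of point 2 on slide 29 (logical security procedures that restrict access), the sketch below filters the columns of a row by the caller's role. The slides do not prescribe a mechanism, so this is an assumption; the roles, column names, and the `restrict` helper are hypothetical. Real warehouses would usually enforce such rules in the database through grants and views rather than in application code.

```python
# Hypothetical mapping of roles to the columns each role may see.
ROLE_COLUMNS = {
    "analyst": {"region", "revenue"},
    "auditor": {"region", "revenue", "customer_id"},
}


def restrict(row, role):
    """Return only the columns of `row` that `role` is allowed to read."""
    # Unknown roles get an empty allow-list, so access is denied by default.
    allowed = ROLE_COLUMNS.get(role, set())
    return {col: val for col, val in row.items() if col in allowed}


row = {"customer_id": 42, "region": "West", "revenue": 1250.0}

analyst_view = restrict(row, "analyst")
guest_view = restrict(row, "guest")
```

Denying by default for unknown roles means a missing configuration entry hides data instead of exposing it.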