Data Warehousing
Hu Yan
huy@cs.tut.fi
 Outline
• What is data warehousing
• The benefit of data warehousing
• Differences between OLTP and data warehousing
• The architecture of data warehouse
• The main components
• Data flows
• Tools and technologies
• Integration
• The importance of managing meta-data
• Data marts
 What is data warehousing?
• A data warehouse is a subject-oriented, integrated,
time-variant, and non-volatile collection of data in
support of management’s decision-making
process
• Data warehousing involves both data management and data
analysis
• A data webhouse is a distributed data warehouse
that is implemented over the web with no central
data repository
• Goal: to integrate enterprise-wide corporate data
into a single repository from which users can easily
run queries
 What is data warehousing?
• Subject-oriented: the warehouse is organized around the major subjects of the
enterprise rather than the major application areas. This is reflected in the
need to store decision-support data rather than application-oriented data
• Integrated: the source data come together from different enterprise-wide
application systems. The source data is often inconsistent, so the
integrated data source must be made consistent to present a unified view of the
data to the users
• Time-variant: data in the warehouse is only accurate and valid at some
point in time or over some time interval. The time-variance of the data
warehouse is also shown in the extended time that the data is held, the implicit
or explicit association of time with all data, and the fact that the data
represents a series of snapshots
• Non-volatile: data is not updated in real time but is refreshed from operational
systems on a regular basis. New data is always added as a supplement to the
database, rather than as a replacement. The database continually absorbs this
new data, incrementally integrating it with previous data
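A minimal sketch of the time-variant and non-volatile properties, using Python's built-in sqlite3 module. The `customer_snapshot` table, its columns, and the sample values are hypothetical and chosen only for illustration. Each refresh appends rows stamped with a load date instead of overwriting earlier rows, so the series of snapshots stays queryable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_snapshot ("
    " customer_id INTEGER, city TEXT, load_date TEXT,"
    " PRIMARY KEY (customer_id, load_date))"
)

def refresh(conn, load_date, source_rows):
    """Append one snapshot of the source rows; old rows are never changed."""
    conn.executemany(
        "INSERT INTO customer_snapshot (customer_id, city, load_date)"
        " VALUES (?, ?, ?)",
        [(cid, city, load_date) for cid, city in source_rows],
    )
    conn.commit()

# Two periodic refreshes from a hypothetical operational system.
refresh(conn, "2024-01-31", [(1, "Tampere"), (2, "Helsinki")])
refresh(conn, "2024-02-29", [(1, "Turku"), (2, "Helsinki")])  # customer 1 moved

def city_as_of(conn, customer_id, as_of_date):
    """Return the city from the latest snapshot on or before as_of_date."""
    row = conn.execute(
        "SELECT city FROM customer_snapshot"
        " WHERE customer_id = ? AND load_date <= ?"
        " ORDER BY load_date DESC LIMIT 1",
        (customer_id, as_of_date),
    ).fetchone()
    return row[0] if row else None

# The full history of customer 1 remains available after both refreshes.
history = conn.execute(
    "SELECT load_date, city FROM customer_snapshot"
    " WHERE customer_id = 1 ORDER BY load_date"
).fetchall()
```

Because dates are stored as ISO strings, comparing them as text gives the correct chronological order in this sketch.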
 The benefits of data
warehousing
• The potential benefits of data warehousing
include:
• high returns on investment
• substantial competitive advantage
• increased productivity of corporate
decision-makers
 The difference between OLTP
and data warehousing
• A DBMS built for online transaction
processing (OLTP) is generally regarded as
unsuitable for data warehousing because
each system is designed with a different set
of requirements in mind
• Example: OLTP systems are designed to maximize transaction
processing capacity, while data warehouses are designed to support ad
hoc query processing
Comparison of OLTP systems and data warehousing systems

OLTP systems                                  Data warehousing systems
Holds current data                            Holds historical data
Stores detailed data                          Stores detailed, lightly, and highly summarized data
Data is dynamic                               Data is largely static
Repetitive processing                         Ad hoc, unstructured, and heuristic processing
High level of transaction throughput          Medium to low level of transaction throughput
Predictable pattern of usage                  Unpredictable pattern of usage
Transaction-driven                            Analysis-driven
Application-oriented                          Subject-oriented
Supports day-to-day decisions                 Supports strategic decisions
Serves large number of clerical/operational   Serves relatively low number of managerial
users                                         users
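To make the contrast concrete, here is a small sketch using Python's built-in sqlite3 module. The `orders` table and its values are hypothetical, and a single SQLite database stands in for both kinds of system purely for illustration. The first function is a short, predictable transaction of the kind an OLTP system repeats many times; the second is an aggregation over historical rows of the kind a warehouse is designed to serve.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY,"
    " region TEXT, order_year INTEGER, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "north", 2022, 100.0, "shipped"),
        (2, "north", 2023, 150.0, "shipped"),
        (3, "south", 2023, 80.0, "open"),
        (4, "south", 2023, 120.0, "shipped"),
    ],
)

def ship_order(conn, order_id):
    """OLTP-style work: change one current row inside a short transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE orders SET status = 'shipped' WHERE order_id = ?",
            (order_id,),
        )
    return cur.rowcount

def revenue_by_region_and_year(conn):
    """Warehouse-style work: scan and summarize many historical rows."""
    return conn.execute(
        "SELECT region, order_year, SUM(amount) FROM orders"
        " GROUP BY region, order_year ORDER BY region, order_year"
    ).fetchall()

updated = ship_order(conn, 3)
summary = revenue_by_region_and_year(conn)
```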
 Problems
• Underestimation of resources for data loading
• Hidden problems with source systems
• Required data not captured
• Increased end-user demands
• Data homogenization
• High demand for resources
• Data ownership
• High maintenance
• Long-duration projects
• Complexity of integration
 The architecture
[Figure: Typical architecture of a data warehouse. Components shown: operational data sources 1 to n; operational data store (ODS); load manager; warehouse manager; query manager; DBMS holding meta-data, detailed data, lightly summarized data, highly summarized data, and archive/backup data; end-user access tools (reporting, query, application development, and EIS (executive information system) tools; OLAP (online analytical processing) tools; data mining tools)]
 The main components
• Operational data sources: data for the warehouse is supplied from
mainframe operational data held in first-generation hierarchical and
network databases, departmental data held in proprietary file systems,
private data held on workstations and private servers, and external
systems such as the Internet, commercially available databases, or
databases associated with an organization’s suppliers or customers
• Operational data store (ODS): a repository of current
and integrated operational data used for analysis. It is often structured
and supplied with data in the same way as the data warehouse, but
may in fact simply act as a staging area for data to be moved into the
warehouse
 The main components
• Load manager: also called the front-end component, it performs
all the operations associated with the extraction and loading of data
into the warehouse. These operations include simple transformations
of the data to prepare the data for entry into the warehouse
• Warehouse manager: performs all the operations associated with
the management of the data in the warehouse. The operations
performed by this component include analysis of data to ensure
consistency, transformation and merging of source data, creation of
indexes and views, generation of denormalizations and aggregations,
and archiving and backing-up data
 The main components
• Query manager: also called the back-end component, it performs all
the operations associated with the management of user queries. The
operations performed by this component include directing queries to
the appropriate tables and scheduling the execution of queries
• Detailed, lightly and highly summarized
data, archive/backup data
• Meta-data
• End-user access tools: can be categorized into five main groups:
data reporting and query tools, application development tools,
executive information system (EIS) tools, online analytical processing
(OLAP) tools, and data mining tools
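The division of work among the load, warehouse, and query managers can be sketched as three small functions. This is an illustrative toy under stated assumptions, not how any particular warehouse product is built: the function names, the `sales_detail` and `sales_by_region` tables, and the routing rule are all hypothetical.

```python
import sqlite3

def load_manager(conn, source_rows):
    """Extract rows and apply simple transformations before loading."""
    cleaned = [
        (region.strip().lower(), float(amount))
        for region, amount in source_rows
    ]
    conn.executemany(
        "INSERT INTO sales_detail (region, amount) VALUES (?, ?)", cleaned
    )

def warehouse_manager(conn):
    """Manage stored data: rebuild an aggregation and create an index."""
    conn.execute("DROP TABLE IF EXISTS sales_by_region")
    conn.execute(
        "CREATE TABLE sales_by_region AS"
        " SELECT region, SUM(amount) AS total"
        " FROM sales_detail GROUP BY region"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_detail_region"
        " ON sales_detail (region)"
    )

def query_manager(conn, region, need_detail=False):
    """Direct the query to the detailed or the summarized table."""
    if need_detail:
        sql = "SELECT amount FROM sales_detail WHERE region = ? ORDER BY amount"
    else:
        sql = "SELECT total FROM sales_by_region WHERE region = ?"
    return conn.execute(sql, (region,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_detail (region TEXT, amount REAL)")
load_manager(conn, [(" North", "10.5"), ("north ", "4.5"), ("South", "7")])
warehouse_manager(conn)

north_total = query_manager(conn, "north")
north_rows = query_manager(conn, "north", need_detail=True)
```

Answering the summary question from the pre-aggregated table avoids scanning the detailed rows, which is the point of keeping summarized data next to detailed data.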
 Data flows
• Inflow: the processes associated with the extraction, cleansing, and loading of
the data from the source systems into the data warehouse
• Upflow: the processes associated with adding value to the data in the warehouse
through summarizing, packaging, and distribution of the data
• Downflow: the processes associated with archiving and backing-up of data in
the warehouse
• Outflow: the processes associated with making the data available to the end-users
• Meta-flow: the processes associated with the management of the meta-data
[Figure: Information flows of a data warehouse. Flows shown: inflow, upflow, downflow, outflow, and meta-flow, drawn over the operational data sources, operational data store (ODS), load manager, warehouse manager, query manager, DBMS, and end-user access tools]
 Tools and Technologies
• The critical steps in the construction of a data
warehouse:
a. Extraction
b. Cleansing
c. Transformation
• After the critical steps, the results are loaded into the
target system. The steps can be carried out either by separate
products or by a single product, in one of these categories:
• code generators
• database data replication tools
• dynamic transformation engines
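The three critical steps can be sketched in plain Python. The CSV layout, field names, and cleansing rules below are hypothetical examples chosen for illustration; products in the categories listed apply far richer rules.

```python
import csv
import io

# Hypothetical raw extract with stray whitespace and a missing amount.
RAW_CSV = """customer,country,amount
Alice ,fi,10.50
bob,FI,
Carol,se,7.25
"""

def extract(text):
    """Extraction: read raw records from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def cleanse(records):
    """Cleansing: trim whitespace and drop records with a missing amount."""
    cleaned = []
    for rec in records:
        rec = {key: (value or "").strip() for key, value in rec.items()}
        if rec["amount"]:
            cleaned.append(rec)
    return cleaned

def transform(records):
    """Transformation: convert types and standardize formats for the target."""
    return [
        {
            "customer": rec["customer"].title(),
            "country": rec["country"].upper(),
            "amount_cents": round(float(rec["amount"]) * 100),
        }
        for rec in records
    ]

rows = transform(cleanse(extract(RAW_CSV)))
```

Keeping each step as its own function mirrors the option of using separate products per step; a single product would simply wrap all three.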
 Data Warehouse
DBMS (integration)
• Due to the maturity of such products, most
relational databases will integrate predictably with
other types of software
• The requirements for a data warehouse RDBMS:
• Load performance
• Load processing
• Data quality management
• Query performance
• Terabyte scalability
• Mass user scalability
• Networked data warehouse
• Warehouse administration
• Integrated dimensional analysis
• Advanced query functionality
 The importance of managing
meta-data (integration)
• Meta-data is "data about data", and it must be integrated across the warehouse
• Meta-data is used for a variety of purposes and the management of it is
a critical issue in achieving a fully integrated data warehouse
• The major purpose of meta-data is to show the pathway back to where
the data began, so that the warehouse administrators know the history
of any item in the warehouse
• The meta-data associated with data transformation and loading must
describe the source data and any changes that were made to the data
• The meta-data associated with data management describes the data as
it is stored in the warehouse
• Meta-data is required by the query manager to generate
appropriate queries, and is also associated with the users of queries
• The major integration issue is how to synchronize the various types of
meta-data used throughout the data warehouse. The challenge is to
synchronize meta-data between different products from different
vendors using different meta-data stores
• Two major standards for meta-data and modeling in the areas of data
warehousing and component-based development: MDC (Meta Data
Coalition) and OMG (Object Management Group)
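One way to picture the "pathway back to where the data began" is a lineage record kept for each transformation. The sketch below is hypothetical: the `LineageEntry` fields and the example item names are assumptions for illustration and do not follow the MDC or OMG specifications.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LineageEntry:
    target: str          # item as stored in the warehouse or staging area
    source: str          # item it was derived from
    transformation: str  # change applied on the way in

# Hypothetical lineage for one warehouse column.
LINEAGE = [
    LineageEntry("warehouse.sales.amount_eur", "staging.sales.amount",
                 "convert currency to EUR"),
    LineageEntry("staging.sales.amount", "orders_db.order_line.price",
                 "multiply price by quantity"),
]

def trace_back(item, lineage):
    """Follow lineage entries from an item back to its original source."""
    by_target = {entry.target: entry for entry in lineage}
    path = []
    seen = set()
    while item in by_target and item not in seen:
        seen.add(item)  # guard against accidental cycles
        entry = by_target[item]
        path.append(entry)
        item = entry.source
    return path

history = trace_back("warehouse.sales.amount_eur", LINEAGE)
origin = history[-1].source if history else None
```

An administrator reading `history` sees every change applied to the item, and `origin` names the operational source where it began.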
 Administration and
Management Tools
• A data warehouse requires tools to support the
administration and management of such a complex
environment
• For the various types of meta-data and the day-to-day
operations of the data warehouse, the administration and
management tools must be capable of supporting the following
tasks:
• monitoring data loading from multiple sources
• data quality and integrity checks
• managing and updating meta-data
• monitoring database performance to ensure efficient query response
times and resource utilization
• auditing data warehouse usage to provide user chargeback information
• replicating, subsetting, and distributing data
• maintaining efficient data storage management
• purging data
• archiving and backing-up data
• implementing recovery following failure
• security management
 Data mart
• Data mart: a subset of a data warehouse
that supports the requirements of a particular
department or business function
• The characteristics that differentiate data
marts and data warehouses include:
• a data mart focuses on only the requirements of users associated with
one department or business function
• data marts do not normally contain detailed operational data, unlike
data warehouses
• as data marts contain less data compared with data warehouses, data
marts are more easily understood and navigated
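As a rough illustration, a data mart can be derived from warehouse data by keeping only one department's rows and storing them in summarized form. The `warehouse_sales` table, the `marketing_mart` name, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales"
    " (department TEXT, product TEXT, month TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        ("marketing", "widget", "2024-01", 40.0),
        ("marketing", "widget", "2024-01", 60.0),
        ("marketing", "gadget", "2024-02", 25.0),
        ("finance", "widget", "2024-01", 999.0),
    ],
)

def build_data_mart(conn, department, mart_table):
    """Create a summarized subset of the warehouse for one department."""
    # mart_table is interpolated into SQL, so accept only plain identifiers.
    if not mart_table.isidentifier():
        raise ValueError("invalid table name")
    conn.execute(f"DROP TABLE IF EXISTS {mart_table}")
    conn.execute(
        f"CREATE TABLE {mart_table} (product TEXT, month TEXT, total REAL)"
    )
    conn.execute(
        f"INSERT INTO {mart_table}"
        " SELECT product, month, SUM(amount) FROM warehouse_sales"
        " WHERE department = ? GROUP BY product, month",
        (department,),
    )

build_data_mart(conn, "marketing", "marketing_mart")
mart_rows = conn.execute(
    "SELECT product, month, total FROM marketing_mart ORDER BY product, month"
).fetchall()
```

The mart holds two summarized rows where the warehouse holds four detailed ones, which reflects both the narrower focus and the smaller volume described above.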
[Figure: Typical data warehouse and data mart architecture. Components shown: operational data sources 1 to n; operational data store (ODS); load manager; warehouse manager; query manager; DBMS holding meta-data, detailed data, lightly and highly summarized data, and archive/backup data; end-user access tools; data marts holding summarized data in a relational database or a multi-dimensional database. The diagram labels a first, second, and third tier]
Reasons for creating a data mart
• To give users access to the data they need to analyze most often
• To provide data in a form that matches the collective view of the data
by a group of users in a department or business function
• To improve end-user response time due to the reduction in the volume
of data to be accessed
• To provide appropriately structured data as dictated by the requirements
of end-user access tools
• Data marts normally use less data, so tasks such as data cleansing, loading,
transformation, and integration are far easier, and hence implementing
and setting up a data mart is simpler than establishing a corporate data
warehouse
• The cost of implementing data marts is normally less than that
required to establish a data warehouse
• The potential users of a data mart are more clearly defined and can be
more easily targeted to obtain support for a data mart project rather
than a corporate data warehouse project
Data mart issues
• Data mart functionality: the capabilities of data marts
have increased with the growth in their popularity
• Data mart size: performance deteriorates as data
marts grow in size, so the size of data marts needs to be reduced
to gain improvements in performance
• Data mart load performance: two critical
components are end-user response time and data loading
performance. Incremental database updating helps, so that only cells
affected by the change are updated and not the entire
MDDB (multi-dimensional database) structure
• Users’ access to data in multiple marts: one approach
is to replicate data between different data marts or, alternatively, to build
virtual data marts, which are views of several physical data marts or the
corporate data warehouse tailored to meet the requirements of
specific groups of users
• Data mart Internet/intranet access: these products sit
between a web server and the data analysis product. Internet/intranet access
offers users low-cost access to data marts and the data warehouse using web
browsers
• Data mart administration: an organization cannot easily perform
administration of multiple data marts, giving rise to issues such as data mart
versioning, data and meta-data consistency and integrity, enterprise-wide
security, and performance tuning. Data mart administration tools are
commercially available
• Data mart installation: data marts are becoming increasingly
complex to build. Vendors are offering products referred to as "data
mart in a box" that provide a low-cost source of data mart tools
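The idea of updating only the cells affected by a change, rather than rebuilding the whole structure, can be sketched with a dictionary standing in for a multi-dimensional store. The cell keys `(region, month)` and the sample values are hypothetical, and a real MDDB would manage this internally.

```python
from collections import defaultdict

def full_rebuild(facts):
    """Recompute every cell from all detailed facts (the costly approach)."""
    cube = defaultdict(float)
    for region, month, amount in facts:
        cube[(region, month)] += amount
    return dict(cube)

def incremental_update(cube, new_facts):
    """Apply only the new facts; return the set of cells that were touched."""
    touched = set()
    for region, month, amount in new_facts:
        key = (region, month)
        cube[key] = cube.get(key, 0.0) + amount
        touched.add(key)
    return touched

existing = [("north", "2024-01", 10.0), ("south", "2024-01", 5.0)]
arrivals = [("north", "2024-01", 2.5), ("north", "2024-02", 4.0)]

cube = full_rebuild(existing)
touched = incremental_update(cube, arrivals)
```

The incremental path leaves the untouched `("south", "2024-01")` cell alone yet ends in the same state as a full rebuild over all facts, which is why it shortens loading as the mart grows.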

More Related Content

What's hot

Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Vibrant Technologies & Computers
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data WarehousingEyad Manna
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingVibrant Event
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingwork
 
Manish tripathi-ea-dw-bi
Manish tripathi-ea-dw-biManish tripathi-ea-dw-bi
Manish tripathi-ea-dw-biA P
 
Datawarehouse Overview
Datawarehouse OverviewDatawarehouse Overview
Datawarehouse Overviewashok kumar
 
Dwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousingDwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousingDhilsath Fathima
 
Data warehouseconceptsandarchitecture
Data warehouseconceptsandarchitectureData warehouseconceptsandarchitecture
Data warehouseconceptsandarchitecturesamaksh1982
 

What's hot (20)

Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.
 
data warehousing
data warehousingdata warehousing
data warehousing
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testing
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
Data warehousing ppt
Data warehousing pptData warehousing ppt
Data warehousing ppt
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
Data-ware Housing
Data-ware HousingData-ware Housing
Data-ware Housing
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
Manish tripathi-ea-dw-bi
Manish tripathi-ea-dw-biManish tripathi-ea-dw-bi
Manish tripathi-ea-dw-bi
 
OLAP technology
OLAP technologyOLAP technology
OLAP technology
 
Datawarehouse Overview
Datawarehouse OverviewDatawarehouse Overview
Datawarehouse Overview
 
Dwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousingDwdm unit 1-2016-Data ingarehousing
Dwdm unit 1-2016-Data ingarehousing
 
Data warehouseconceptsandarchitecture
Data warehouseconceptsandarchitectureData warehouseconceptsandarchitecture
Data warehouseconceptsandarchitecture
 
Data Warehouse
Data Warehouse Data Warehouse
Data Warehouse
 
Data ware house
Data ware houseData ware house
Data ware house
 

Viewers also liked

Cornell Clubs and Groups Presentation
Cornell Clubs and Groups PresentationCornell Clubs and Groups Presentation
Cornell Clubs and Groups PresentationHoward Greenstein
 
Bonamassa New York Lookbook FW 2011
Bonamassa New York Lookbook FW 2011Bonamassa New York Lookbook FW 2011
Bonamassa New York Lookbook FW 2011Steven Bonamassa
 
How effective is the combination of your main
How effective is the combination of your mainHow effective is the combination of your main
How effective is the combination of your mainMadzipan
 
LaMarque Collection lookbook-
LaMarque Collection lookbook-LaMarque Collection lookbook-
LaMarque Collection lookbook-Steven Bonamassa
 
LaMarque Collection Mens Lookbook SS 2014
LaMarque Collection Mens Lookbook SS 2014LaMarque Collection Mens Lookbook SS 2014
LaMarque Collection Mens Lookbook SS 2014Steven Bonamassa
 
Data Encryption Standard (DES)
Data Encryption Standard (DES)Data Encryption Standard (DES)
Data Encryption Standard (DES)Amir Masinaei
 
El segundo franquismo (1959-1975)
El segundo franquismo (1959-1975)El segundo franquismo (1959-1975)
El segundo franquismo (1959-1975)Madelman68
 
MONSTRUOS MITOLÓGICOS.
MONSTRUOS MITOLÓGICOS.MONSTRUOS MITOLÓGICOS.
MONSTRUOS MITOLÓGICOS.Josegomez15
 
Host modulation therapy
Host modulation therapyHost modulation therapy
Host modulation therapyVijay Apparaju
 
Criteris geo juliol2014
Criteris geo juliol2014Criteris geo juliol2014
Criteris geo juliol2014Txema Gs
 
Criteris geo pau_juny2014
Criteris geo pau_juny2014Criteris geo pau_juny2014
Criteris geo pau_juny2014Txema Gs
 

Viewers also liked (15)

Cornell Clubs and Groups Presentation
Cornell Clubs and Groups PresentationCornell Clubs and Groups Presentation
Cornell Clubs and Groups Presentation
 
Tralukya Hazarika
Tralukya HazarikaTralukya Hazarika
Tralukya Hazarika
 
Bonamassa New York Lookbook FW 2011
Bonamassa New York Lookbook FW 2011Bonamassa New York Lookbook FW 2011
Bonamassa New York Lookbook FW 2011
 
Payroll
PayrollPayroll
Payroll
 
How effective is the combination of your main
How effective is the combination of your mainHow effective is the combination of your main
How effective is the combination of your main
 
resume.saurabh
resume.saurabhresume.saurabh
resume.saurabh
 
LaMarque Collection lookbook-
LaMarque Collection lookbook-LaMarque Collection lookbook-
LaMarque Collection lookbook-
 
UP-PR-MarCom-Anilkumar R
UP-PR-MarCom-Anilkumar RUP-PR-MarCom-Anilkumar R
UP-PR-MarCom-Anilkumar R
 
LaMarque Collection Mens Lookbook SS 2014
LaMarque Collection Mens Lookbook SS 2014LaMarque Collection Mens Lookbook SS 2014
LaMarque Collection Mens Lookbook SS 2014
 
Data Encryption Standard (DES)
Data Encryption Standard (DES)Data Encryption Standard (DES)
Data Encryption Standard (DES)
 
El segundo franquismo (1959-1975)
El segundo franquismo (1959-1975)El segundo franquismo (1959-1975)
El segundo franquismo (1959-1975)
 
MONSTRUOS MITOLÓGICOS.
MONSTRUOS MITOLÓGICOS.MONSTRUOS MITOLÓGICOS.
MONSTRUOS MITOLÓGICOS.
 
Host modulation therapy
Host modulation therapyHost modulation therapy
Host modulation therapy
 
Criteris geo juliol2014
Criteris geo juliol2014Criteris geo juliol2014
Criteris geo juliol2014
 
Criteris geo pau_juny2014
Criteris geo pau_juny2014Criteris geo pau_juny2014
Criteris geo pau_juny2014
 

Similar to Datawarehousing

Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptRafiulHasan19
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingsumit621
 
Cognos datawarehouse
Cognos datawarehouseCognos datawarehouse
Cognos datawarehousessuser7fc7eb
 
ETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxParnalSatle
 
data warehousing need and characteristics. types of data w data warehouse arc...
data warehousing need and characteristics. types of data w data warehouse arc...data warehousing need and characteristics. types of data w data warehouse arc...
data warehousing need and characteristics. types of data w data warehouse arc...aasifkuchey85
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data martAmit Sarkar
 
ETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL TestingETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL TestingVibrant Event
 
Data Mining & Data Warehousing
Data Mining & Data WarehousingData Mining & Data Warehousing
Data Mining & Data WarehousingAAKANKSHA JAIN
 
presentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptxpresentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptxvipush1
 
Data warehousing.pptx
Data warehousing.pptxData warehousing.pptx
Data warehousing.pptxAnusuya123
 
Management information system database management
Management information system database managementManagement information system database management
Management information system database managementOnline
 
20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.pptPalaniKumarR2
 
Data Mart Lake Ware.pptx
Data Mart Lake Ware.pptxData Mart Lake Ware.pptx
Data Mart Lake Ware.pptxBalasundaramSr
 

Similar to Datawarehousing (20)

Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.ppt
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
DW (1).ppt
DW (1).pptDW (1).ppt
DW (1).ppt
 
Datawarehouse org
Datawarehouse orgDatawarehouse org
Datawarehouse org
 
DATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptxDATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptx
 
Cognos datawarehouse
Cognos datawarehouseCognos datawarehouse
Cognos datawarehouse
 
ETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptx
 
data warehousing need and characteristics. types of data w data warehouse arc...
data warehousing need and characteristics. types of data w data warehouse arc...data warehousing need and characteristics. types of data w data warehouse arc...
data warehousing need and characteristics. types of data w data warehouse arc...
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data mart
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testing
 
ETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL TestingETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL Testing
 
Datastage Introduction To Data Warehousing
Datastage Introduction To Data Warehousing Datastage Introduction To Data Warehousing
Datastage Introduction To Data Warehousing
 
ETL-Datawarehousing.ppt.pptx
ETL-Datawarehousing.ppt.pptxETL-Datawarehousing.ppt.pptx
ETL-Datawarehousing.ppt.pptx
 
Data Mining & Data Warehousing
Data Mining & Data WarehousingData Mining & Data Warehousing
Data Mining & Data Warehousing
 
presentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptxpresentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptx
 
Data warehousing.pptx
Data warehousing.pptxData warehousing.pptx
Data warehousing.pptx
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Management information system database management
Management information system database managementManagement information system database management
Management information system database management
 
20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt
 
Data Mart Lake Ware.pptx
Data Mart Lake Ware.pptxData Mart Lake Ware.pptx
Data Mart Lake Ware.pptx
 

Recently uploaded

Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 

Recently uploaded (20)

E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Smarteg dropshipping via API with DroFx.pptx

Datawarehousing

  • 2. Outline
      • What is data warehousing
      • The benefits of data warehousing
      • Differences between OLTP and data warehousing
      • The architecture of a data warehouse
      • The main components
      • Data flows
      • Tools and technologies
      • Integration
      • The importance of managing meta-data
      • Data marts
  • 3. What is data warehousing?
      • A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management’s decision-making process.
      • Data warehousing combines data management and data analysis.
      • A data webhouse is a distributed data warehouse that is implemented over the web with no central data repository.
      • Goal: to integrate enterprise-wide corporate data into a single repository from which users can easily run queries.
  • 4. What is data warehousing?
      • Subject-oriented: the warehouse (WH) is organized around the major subjects of the enterprise ... rather than the major application areas ... This is reflected in the need to store decision-support data rather than application-oriented data.
      • Integrated: the source data come together from different enterprise-wide application systems. The source data is often inconsistent, using ... The integrated data source must be made consistent to present a unified view of the data to the users.
      • Time-variant: the source data in the WH is only accurate and valid at some point in time or over some time interval. The time-variance of the data warehouse is also shown in the extended time that the data is held, the implicit or explicit association of time with all data, and the fact that the data represents a series of snapshots.
      • Non-volatile: data is not updated in real time but is refreshed from the operational systems (OS) on a regular basis. New data is always added as a supplement to the database (DB), rather than as a replacement. The DB continually absorbs this new data, incrementally integrating it with previous data.
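The time-variant and non-volatile properties can be illustrated with a minimal sketch. The `Snapshot` and `Warehouse` names, the customer-balance fields, and the dates are assumptions made up for this illustration; they do not come from the slides. Each refresh appends time-stamped rows, and a query picks the snapshot that was valid on a given day.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """One time-stamped row; frozen so a loaded row is never modified."""
    customer_id: int
    balance: float
    valid_on: date  # explicit association of time with the data


class Warehouse:
    """Hypothetical in-memory stand-in for a warehouse table."""

    def __init__(self) -> None:
        self._rows: List[Snapshot] = []

    def refresh(self, batch: List[Snapshot]) -> None:
        # Non-volatile: new data supplements existing rows, never replaces them.
        self._rows.extend(batch)

    def as_of(self, customer_id: int, day: date) -> Optional[Snapshot]:
        # Time-variant: return the latest snapshot valid on or before `day`.
        candidates = [
            r for r in self._rows
            if r.customer_id == customer_id and r.valid_on <= day
        ]
        return max(candidates, key=lambda r: r.valid_on, default=None)


wh = Warehouse()
wh.refresh([Snapshot(1, 100.0, date(2024, 1, 31))])  # first periodic refresh
wh.refresh([Snapshot(1, 250.0, date(2024, 2, 29))])  # later refresh adds a row
```

Asking `wh.as_of(1, date(2024, 2, 15))` returns the January snapshot, because the February row did not yet exist on that day and the January row was never overwritten.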
  • 5. The benefits of data warehousing
      • The potential benefits of data warehousing are:
          • high returns on investment ...
          • substantial competitive advantage ...
          • increased productivity of corporate decision-makers ...
  • 6. The difference between OLTP and data warehousing
      • A DBMS built for online transaction processing (OLTP) is generally regarded as unsuitable for data warehousing because each system is designed with a differing set of requirements in mind.
      • Example: OLTP systems are designed to maximize transaction processing capacity, while data warehouses are designed to support ad hoc query processing.
  • 7. Comparison of OLTP systems and data warehousing systems
      OLTP systems | Data warehousing systems
      Holds current data | Holds historical data
      Stores detailed data | Stores detailed, lightly, and highly summarized data
      Data is dynamic | Data is largely static
      Repetitive processing | Ad hoc, unstructured, and heuristic processing
      High level of transaction throughput | Medium to low level of transaction throughput
      Predictable pattern of usage | Unpredictable pattern of usage
      Transaction-driven | Analysis-driven
      Application-oriented | Subject-oriented
      Supports day-to-day decisions | Supports strategic decisions
      Serves large number of clerical/operational users | Serves relatively low number of managerial users
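The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module. The `sales` table, its columns, and the figures are invented for this illustration; a real OLTP system and a real warehouse would normally be separate databases, which this single in-memory database does not attempt to model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
    [
        ("north", 2023, 120.0),
        ("north", 2024, 150.0),
        ("south", 2023, 80.0),
        ("south", 2024, 95.0),
    ],
)

# OLTP-style work: a short, predictable transaction touching one current row.
with conn:
    conn.execute("UPDATE sales SET amount = amount + 10 WHERE id = ?", (1,))

# Warehouse-style work: an ad hoc query that scans and summarizes history.
totals_by_region = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
)
```

The update is keyed on a single row and follows a fixed pattern, while the aggregate reads every row and could just as easily be grouped by year or any other column the analyst thinks of.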
  • 8. Problems
      • Underestimation of resources for data loading
      • Hidden problems with source systems
      • Required data not captured
      • Increased end-user demands
      • Data homogenization
      • High demand for resources
      • Data ownership
      • High maintenance
      • Long-duration projects
      • Complexity of integration
  • 9. The architecture
      [Figure: Typical architecture of a data warehouse. Components shown: operational data sources 1 to n; operational data store (ODS); load manager; warehouse manager; query manager; DBMS holding meta-data, detailed data, lightly summarized data, highly summarized data, and archive/backup data; end-user access tools (reporting, query, application development, and EIS (executive information system) tools; OLAP (online analytical processing) tools; data mining tools).]
  • 10. The main components
      • Operational data sources: data for the DW is supplied from mainframe operational data held in first-generation hierarchical and network databases, departmental data held in proprietary file systems, private data held on workstations and private servers, and external systems such as the Internet, commercially available DBs, or DBs associated with an organization’s suppliers or customers.
      • Operational data store (ODS): a repository of current and integrated operational data used for analysis. It is often structured and supplied with data in the same way as the data warehouse, but may in fact simply act as a staging area for data to be moved into the warehouse.
  • 11. The main components
      • Load manager: also called the frontend component, it performs all the operations associated with the extraction and loading of data into the warehouse. These operations include simple transformations of the data to prepare it for entry into the warehouse.
      • Warehouse manager: performs all the operations associated with the management of the data in the warehouse. These operations include analysis of data to ensure consistency, transformation and merging of source data, creation of indexes and views, generation of denormalizations and aggregations, and archiving and backing up data.
  • 12. The main components
      • Query manager: also called the backend component, it performs all the operations associated with the management of user queries. These operations include directing queries to the appropriate tables and scheduling the execution of queries.
      • Detailed, lightly and highly summarized data; archive/backup data
      • Meta-data
      • End-user access tools: can be categorized into five main groups: data reporting and query tools, application development tools, executive information system (EIS) tools, online analytical processing (OLAP) tools, and data mining tools.
  • 13. Data flows
      • Inflow: the processes associated with the extraction, cleansing, and loading of the data from the source systems into the data warehouse.
      • Upflow: the processes associated with adding value to the data in the warehouse through summarizing, packaging, and distribution of the data.
      • Downflow: the processes associated with archiving and backing up of data in the warehouse.
      • Outflow: the processes associated with making the data available to the end-users.
      • Meta-flow: the processes associated with the management of the meta-data.
  • 14. [Figure: Information flows of a data warehouse. Components shown: operational data sources 1 to n; operational data store (ODS); load manager; warehouse manager; query manager; DBMS holding meta-data, detailed data, lightly summarized data, highly summarized data, and archive/backup data; end-user access tools (reporting, query, application development, and EIS (executive information system) tools; OLAP (online analytical processing) tools; data mining tools). Flows labelled: inflow, upflow, downflow, outflow, meta-flow.]
  • 15. Tools and technologies
      • The critical steps in the construction of a data warehouse:
          a. Extraction
          b. Cleansing
          c. Transformation
      • After the critical steps, loading the results into the target system can be carried out either by separate products or by a single product. Tool categories:
          • code generators
          • database data replication tools
          • dynamic transformation engines
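The extraction, cleansing, and transformation steps, followed by loading, can be sketched as a small pipeline. This is an illustrative outline only: the function names, the `customer` and `amount` fields, and the cleansing rules are assumptions, and a commercial tool from any of the categories above would do far more.

```python
from typing import Dict, Iterable, List

RawRow = Dict[str, str]
CleanRow = Dict[str, object]


def extract(source: Iterable[RawRow]) -> List[RawRow]:
    """Extraction: copy rows out of a (hypothetical) operational source."""
    return [dict(row) for row in source]


def cleanse(rows: List[RawRow]) -> List[RawRow]:
    """Cleansing: trim whitespace and drop rows missing a required field."""
    cleaned = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        if row.get("customer") and row.get("amount"):
            cleaned.append(row)
    return cleaned


def transform(rows: List[RawRow]) -> List[CleanRow]:
    """Transformation: convert to one consistent representation."""
    return [
        {"customer": r["customer"].upper(), "amount": float(r["amount"])}
        for r in rows
    ]


def load(rows: List[CleanRow], target: List[CleanRow]) -> None:
    """Loading: append the prepared rows to the target table."""
    target.extend(rows)


source = [
    {"customer": " acme ", "amount": "10.50"},
    {"customer": "", "amount": "3.00"},        # rejected: no customer
    {"customer": "Globex", "amount": " 7 "},
]
warehouse_table: List[CleanRow] = []
load(transform(cleanse(extract(source))), warehouse_table)
```

Keeping each step as a separate function mirrors the "separate products" option; a single integrated product would bundle the same stages behind one interface.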
  • 16. Data warehouse DBMS (integration)
      • Due to the maturity of such products, most relational databases will integrate predictably with other types of software.
      • The requirements for a data warehouse RDBMS:
          • Load performance
          • Load processing
          • Data quality management
          • Query performance
          • Terabyte scalability
          • Mass user scalability
          • Networked data warehouse
          • Warehouse administration
          • Integrated dimensional analysis
          • Advanced query functionality
  • 17. The importance of managing meta-data (integration)
      • The integration of meta-data, that is, "data about data".
      • Meta-data is used for a variety of purposes, and managing it is a critical issue in achieving a fully integrated data warehouse.
      • The major purpose of meta-data is to show the pathway back to where the data began, so that the warehouse administrators know the history of any item in the warehouse.
      • The meta-data associated with data transformation and loading must describe the source data and any changes that were made to the data.
      • The meta-data associated with data management describes the data as it is stored in the warehouse.
      • Meta-data is also required by the query manager to generate appropriate queries, and is associated with the users of queries.
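The "pathway back to where the data began" can be pictured as a small lineage record kept for each warehouse item. The sketch below is hypothetical: `Lineage`, `MetaDataStore`, and the item and source names are invented for illustration and do not correspond to any vendor's meta-data store or to the MDC or OMG standards.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lineage:
    """Meta-data for one warehouse item: its source and every change made."""
    source: str
    steps: List[str] = field(default_factory=list)


class MetaDataStore:
    """Hypothetical store mapping warehouse items to their lineage."""

    def __init__(self) -> None:
        self._lineage: Dict[str, Lineage] = {}

    def register(self, item: str, source: str) -> None:
        # Describe the source data when the item is first loaded.
        self._lineage[item] = Lineage(source)

    def record_step(self, item: str, description: str) -> None:
        # Describe each change made during transformation and loading.
        self._lineage[item].steps.append(description)

    def pathway(self, item: str) -> List[str]:
        # The pathway back to where the data began, oldest entry first.
        entry = self._lineage[item]
        return [f"source: {entry.source}"] + entry.steps


meta = MetaDataStore()
meta.register("sales.amount_eur", source="orders_db.orders.amount")
meta.record_step("sales.amount_eur", "converted currency to EUR")
meta.record_step("sales.amount_eur", "summed per day")
```

Calling `meta.pathway("sales.amount_eur")` gives an administrator the history of that item in load order.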
  • 18.
      • The major integration issue is how to synchronize the various types of meta-data used throughout the data warehouse. The challenge is to synchronize meta-data between different products from different vendors using different meta-data stores.
      • There are two major standards for meta-data and modeling in the areas of data warehousing and component-based development: MDC (Meta Data Coalition) and OMG (Object Management Group).
  • 19. Administration and management tools
      • A data warehouse requires tools to support the administration and management of such a complex environment.
      • For the various types of meta-data and the day-to-day operations of the data warehouse, the administration and management tools must be capable of supporting these tasks:
          • monitoring data loading from multiple sources
          • data quality and integrity checks
          • managing and updating meta-data
          • monitoring database performance to ensure efficient query response times and resource utilization
  • 20.
          • auditing data warehouse usage to provide user chargeback information
          • replicating, subsetting, and distributing data
          • maintaining efficient data storage management
          • purging data
          • archiving and backing up data
          • implementing recovery following failure
          • security management
  • 21. Data mart
      • Data mart: a subset of a data warehouse that supports the requirements of a particular department or business function.
      • The characteristics that differentiate data marts and data warehouses include:
          • a data mart focuses on only the requirements of users associated with one department or business function
  • 22.
          • data marts do not normally contain detailed operational data, unlike data warehouses
          • as data marts contain less data compared with data warehouses, data marts are more easily understood and navigated
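The idea of a data mart as a smaller, summarized subset for one department can be sketched as follows. The row layout, the department names, and the `build_mart` helper are assumptions made up for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical detailed warehouse rows: (department, product, month, amount).
Row = Tuple[str, str, str, float]

warehouse: List[Row] = [
    ("sales", "widget", "2024-01", 100.0),
    ("sales", "widget", "2024-01", 50.0),
    ("sales", "gadget", "2024-02", 75.0),
    ("hr", "training", "2024-01", 20.0),
]


def build_mart(rows: Iterable[Row], department: str) -> Dict[Tuple[str, str], float]:
    """Keep one department's rows and summarize them by product and month."""
    mart: Dict[Tuple[str, str], float] = defaultdict(float)
    for dept, product, month, amount in rows:
        if dept == department:
            # Detailed rows are folded into one summarized cell per key.
            mart[(product, month)] += amount
    return dict(mart)


sales_mart = build_mart(warehouse, "sales")
```

The resulting mart holds two summarized entries instead of four detailed rows, which is the sense in which a mart contains less data and is easier to navigate.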
  • 23. [Figure: Typical data warehouse and data mart architecture. Components shown: operational data sources 1 to n; operational data store (ODS); load manager; warehouse manager; query manager; DBMS holding meta-data, detailed data, lightly summarized data, highly summarized data, and archive/backup data; data marts holding summarized data in a relational database or in a multi-dimensional database; end-user access tools (reporting, query, application development, and EIS (executive information system) tools; OLAP (online analytical processing) tools; data mining tools). The diagram is divided into a first, second, and third tier.]
  • 24. Reasons for creating a data mart
      • To give users access to the data they need to analyze most often
      • To provide data in a form that matches the collective view of the data by a group of users in a department or business function
      • To improve end-user response time due to the reduction in the volume of data to be accessed
      • To provide appropriately structured data as dictated by the requirements of end-user access tools
      • Data marts normally use less data, so tasks such as data cleansing, loading, transformation, and integration are far easier; hence implementing and setting up a data mart is simpler than establishing a corporate data warehouse
  • 25.
      • The cost of implementing data marts is normally less than that required to establish a data warehouse
      • The potential users of a data mart are more clearly defined and can be more easily targeted to obtain support for a data mart project rather than a corporate data warehouse project
  • 26. Data mart issues
      • Data mart functionality: the capabilities of data marts have increased with the growth in their popularity.
      • Data mart size: performance deteriorates as data marts grow, so their size needs to be reduced to improve performance.
      • Data mart load performance: the two critical components are end-user response time and data loading performance. Loading can use incremental DB updating, so that only the cells affected by a change are updated and not the entire multi-dimensional database (MDDB) structure.
  • 27.
      • Users’ access to data in multiple marts: one approach is to replicate data between different data marts; alternatively, build virtual data marts, which are views of several physical data marts or of the corporate data warehouse, tailored to meet the requirements of specific groups of users.
      • Data mart Internet/intranet access: these products sit between a web server and the data analysis product. The Internet/intranet offers users low-cost access to data marts and the data warehouse using web browsers.
      • Data mart administration: an organization cannot easily administer multiple data marts, which gives rise to issues such as data mart versioning, data and meta-data consistency and integrity, enterprise-wide security, and performance tuning. Data mart administrative tools are commercially available.
      • Data mart installation: data marts are becoming increasingly complex to build. Vendors are offering products referred to as "data mart in a box" that provide a low-cost source of data mart tools.
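The incremental updating mentioned under data mart load performance can be sketched as follows. This is a simplified illustration, assuming the multi-dimensional structure is a dictionary keyed by (region, month) cells and that every change is an additive amount; the names `Cell` and `apply_increment` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Cell = Tuple[str, str]          # (region, month) coordinates of one cell
Fact = Tuple[str, str, float]   # (region, month, amount) of one new fact


def apply_increment(cube: Dict[Cell, float], new_facts: Iterable[Fact]) -> Set[Cell]:
    """Update only the cells touched by the new facts; return those cells."""
    touched: Set[Cell] = set()
    for region, month, amount in new_facts:
        cell = (region, month)
        # Add to the existing cell value, creating the cell if it is new.
        cube[cell] = cube.get(cell, 0.0) + amount
        touched.add(cell)
    return touched


cube: Dict[Cell, float] = {
    ("north", "2024-01"): 100.0,
    ("south", "2024-01"): 40.0,
}
touched = apply_increment(
    cube,
    [("north", "2024-01", 25.0), ("north", "2024-02", 10.0)],
)
```

Only the two north cells are written; the south cell is never read or recomputed, which is the saving compared with rebuilding the whole structure on every load.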
