Chapter-1
Data Warehousing
What is a Data Warehouse?
• Data warehousing provides architectures and
tools for business executives to systematically
organize, understand, and use their data to
make strategic decisions.
• A Data Warehouse refers to a database that is
maintained separately from an organization’s
operational databases.
• Data warehouse systems allow for the integration
of a variety of application systems. They
support information processing by providing a solid
platform of consolidated historical data
for analysis.
• According to William H. Inmon, a leading architect in the
construction of data warehouse systems,
“A data warehouse is a subject-oriented, integrated,
time-variant, and nonvolatile collection of data in
support of management’s decision making process”.
This short but comprehensive definition presents the
major features of a data warehouse.
• The four keywords, subject-oriented, integrated,
time-variant, and nonvolatile, distinguish data warehouses
from other data repository systems, such as relational
database systems, transaction processing systems,
and file systems.
Subject-oriented:
• A data warehouse is organized around major subjects,
such as customer, supplier, product and sales.
• Rather than concentrating on the day-to-day
operations and transaction processing of an
organization, a data warehouse focuses on the
modeling and analysis of data for decision
makers.
• Hence, data warehouses typically provide a simple and
concise view around particular subject issues by
excluding data that are not useful in the decision
support process.
• For example, to learn more about your company's
sales data, you can build a warehouse that
concentrates on sales.
• Using this warehouse, you can answer questions
like "Who was our best customer for this item last
year?"
• This ability to define a data warehouse by
subject matter, sales in this case, makes the
data warehouse subject-oriented.
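The "best customer" question above can be sketched as a single aggregate query. This is only an illustration: the `sales` table, its columns, the sample rows, and the `best_customer` helper are hypothetical, and SQLite stands in for a real warehouse engine.

```python
import sqlite3


def best_customer(conn, item, year):
    """Return (customer, total) with the highest sales of `item` in `year`, or None."""
    return conn.execute(
        """
        SELECT customer, SUM(amount) AS total
        FROM sales
        WHERE item = ? AND strftime('%Y', sale_date) = ?
        GROUP BY customer
        ORDER BY total DESC, customer
        LIMIT 1
        """,
        (item, str(year)),
    ).fetchone()


# Hypothetical sales-subject table, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, item TEXT, sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Asha", "laptop", "2023-02-10", 1200.0),
        ("Asha", "laptop", "2023-09-01", 800.0),
        ("Bikram", "laptop", "2023-05-20", 1500.0),
        ("Bikram", "laptop", "2024-01-05", 9000.0),  # other year, not counted
        ("Chen", "phone", "2023-03-03", 5000.0),     # other item, not counted
    ],
)

result = best_customer(conn, "laptop", 2023)
print(result)
```

Because the table holds only sales facts, the query needs no knowledge of how orders, payments, or shipments are processed in the operational systems.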
Integrated:
• A data warehouse is usually constructed by integrating
multiple heterogeneous sources, such as relational
databases, flat files, and on-line transaction records.
• Data cleaning and data integration techniques are
applied to ensure consistency in naming conventions,
encoding structures, attribute measures, and so on.
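As a minimal sketch of what such cleaning can look like, the functions below map records from two sources onto one naming convention, one gender encoding, and one unit of measure. The source names (`crm`, `billing`), their field names, and the code table are assumptions made up for the example.

```python
# Hypothetical code table: every source encoding maps to one warehouse encoding.
GENDER_CODES = {"m": "M", "male": "M", "1": "M", "f": "F", "female": "F", "0": "F"}


def from_crm(rec):
    """Assumed CRM layout: cust_name, sex ('male'/'female'), amount in dollars."""
    return {
        "customer": rec["cust_name"].strip(),
        "gender": GENDER_CODES[str(rec["sex"]).lower()],
        "amount_usd": float(rec["amount_usd"]),
    }


def from_billing(rec):
    """Assumed billing layout: CustomerName, gender (1/0), amount in cents."""
    return {
        "customer": rec["CustomerName"].strip(),
        "gender": GENDER_CODES[str(rec["gender"]).lower()],
        "amount_usd": rec["amount_cents"] / 100,  # convert to the common unit
    }


integrated = [
    from_crm({"cust_name": "Asha ", "sex": "Female", "amount_usd": "19.99"}),
    from_billing({"CustomerName": " Bikram", "gender": 1, "amount_cents": 2550}),
]
print(integrated)
```

After this step both records share the same field names, gender codes, and currency unit, whichever system they came from.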
Time-variant:
• Data are stored to provide information from a historical
perspective (e.g., the past 5-10 years).
• Every key structure in the data warehouse contains,
either implicitly or explicitly, an element of time.
Nonvolatile:
• A data warehouse is always a physically separate
store of data transformed from the application data
found in the operational environment.
• Due to this separation, a data warehouse does not
require transaction processing, recovery, and
concurrency control mechanisms.
• It usually requires only two operations in data
access: initial loading of data and access of
data.
• Nonvolatile means that, once entered into the
warehouse, data should not change.
• This is logical because the purpose of a warehouse is
to enable you to analyze what has occurred.
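The two operations named above, loading and access, can be illustrated with a small append-only store. This is a sketch under assumptions: the `AppendOnlyStore` class and its method names are hypothetical, and each loaded row is tagged with a snapshot date to show the explicit time element described under "Time-variant".

```python
from datetime import date


class AppendOnlyStore:
    """Hypothetical store exposing only load and read; no update or delete."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot_date, rows):
        # Every row is stored together with the date of the snapshot it came from.
        for row in rows:
            self._rows.append((snapshot_date, dict(row)))

    def query(self, predicate=lambda snapshot_date, row: True):
        # Return copies so callers cannot modify what is stored.
        return [(d, dict(r)) for d, r in self._rows if predicate(d, r)]


store = AppendOnlyStore()
store.load(date(2023, 1, 31), [{"customer": "Asha", "balance": 100}])
store.load(date(2023, 2, 28), [{"customer": "Asha", "balance": 250}])

history = store.query(lambda d, r: r["customer"] == "Asha")
balances = [row["balance"] for _, row in history]
print(balances)
```

A later snapshot is added next to the earlier one instead of overwriting it, so the history of what has occurred stays available for analysis.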
• In summary, a data warehouse is a
semantically consistent data store that serves
as a physical implementation of a decision
support data model and stores the
information on which an enterprise needs to
make strategic decisions.
• A data warehouse is often viewed as an
architecture, constructed by integrating data
from multiple heterogeneous sources to
support structured and/or ad hoc queries,
analytical reporting, and decision making.
• Based on this information, we view data warehousing
as the process of constructing and using data
warehouses.
• The construction of a data warehouse requires data
cleaning, data integration, and data consolidation.
• The utilization of a data warehouse often necessitates
a collection of decision support technologies.
• This allows “knowledge workers” (e.g. managers,
analysts, and executives) to use the warehouse to
quickly and conveniently obtain an overview of the
data, and to make sound decisions based on
information in the warehouse.
Differences between Operational Database
Systems and Data Warehouses
• The major task of on-line operational database systems is to
perform on-line transaction and query processing. These
systems are called on-line transaction processing (OLTP)
systems. They cover most of the day-to-day operations of an
organization, such as purchasing, inventory, manufacturing,
banking, payroll, registration, and accounting.
• Data warehouse systems, on the other hand, serve users or
knowledge workers in the role of data analysis and decision
making. Such systems can organize and present data in
various formats in order to accommodate the diverse needs of
the different users. These systems are known as on-line
analytical processing (OLAP) systems.
The major distinguishing features between
OLTP and OLAP
• Users and system orientation: An OLTP system is
customer-oriented and is used for transaction and query
processing by clerks, clients, and information technology
professionals. An OLAP system is market-oriented and is
used for data analysis by knowledge workers, including
managers, executives, and analysts.
• Data contents: An OLTP system manages current data that,
typically, are too detailed to be easily used for decision making.
An OLAP system manages large amounts of historical data,
provides facilities for summarization and aggregation, and
stores and manages information at different levels of
granularity. These features make the data easier to use in
informed decision making.
• Database design: An OLTP system usually adopts an entity-
relationship (ER) data model and application-oriented
database design. An OLAP system typically adopts either a
star or snowflake model and a subject-oriented database
design.
• View: An OLTP system focuses mainly on the current data
within an enterprise or department, without referring to
historical data or data in different organizations. In contrast, an
OLAP system often spans multiple versions of a
database schema, due to the evolutionary process of an
organization. OLAP systems also deal with information that
originates from different organizations, integrating information
from many data stores.
• Access patterns: The access patterns of an OLTP system
consist mainly of short, atomic transactions. Such a system
requires concurrency control and recovery mechanisms.
However, accesses to OLAP systems are mostly read-only
operations (because most data warehouses store historical
data rather than up-to-date information), although many could
be complex queries.
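The contrast in access patterns and granularity can be sketched in a few lines. The `orders` data and the `lookup` and `roll_up` helpers are hypothetical: an OLTP-style access touches one record by its key, while an OLAP-style access scans all records and summarizes them at a chosen level of granularity.

```python
from collections import defaultdict

# Hypothetical order records keyed by primary key.
orders = {
    101: {"region": "East", "month": "2023-01", "amount": 40},
    102: {"region": "East", "month": "2023-02", "amount": 60},
    103: {"region": "West", "month": "2023-01", "amount": 25},
}


def lookup(order_id):
    """OLTP-style access: fetch a single record by its key."""
    return orders[order_id]


def roll_up(keys):
    """OLAP-style access: scan every record and total amounts per group."""
    totals = defaultdict(int)
    for order in orders.values():
        totals[tuple(order[k] for k in keys)] += order["amount"]
    return dict(totals)


by_region_month = roll_up(["region", "month"])  # finer granularity
by_region = roll_up(["region"])                 # coarser granularity
print(lookup(102), by_region_month, by_region)
```

Dropping `month` from the grouping keys moves the summary from the region-and-month level up to the region level, which is the kind of aggregation an OLAP system provides.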
Feature | OLTP | OLAP
Characteristic | Operational processing | Informational processing
Orientation | Transaction | Analysis
User | Clerk, DBA, database professional | Knowledge worker (e.g., manager, executive, analyst)
Function | Day-to-day operations | Decision support, long-term informational requirements
DB design | Normalized (3NF), application-oriented | Star/snowflake, subject-oriented
Data | Current, guaranteed up-to-date | Historical; accuracy maintained over time
Summarization | Primitive, highly detailed | Summarized, consolidated
View | Detailed, flat relational | Summarized, multidimensional
Unit of work | Short, simple transaction | Complex query
Access | Read/write | Mostly read
Focus | Data in | Information out
Operations | Index/hash on primary key | Lots of scans
Number of records accessed | Tens | Millions
Number of users | Thousands | Hundreds
DB size | 1 GB to 100 GB | 1 TB to 100 TB
Priority | High performance, high availability | High flexibility, end-user autonomy
Metric | Transaction throughput | Query throughput, response time
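To make the star model mentioned in the comparison more concrete, here is a minimal sketch with one fact table and two dimension tables. All table names, column names, and rows are invented for the example, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the subjects.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

    -- The fact table holds the measures and points at each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        units INTEGER,
        revenue REAL
    );

    INSERT INTO dim_product VALUES (1, 'Laptop', 'Computers'), (2, 'Mouse', 'Accessories');
    INSERT INTO dim_date VALUES (10, 2023, 1), (11, 2023, 2);
    INSERT INTO fact_sales VALUES (1, 10, 2, 2400.0), (2, 10, 5, 100.0), (1, 11, 1, 1200.0);
    """
)

# A typical analytical query: join the fact table to its dimensions and summarize.
rows = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category
    """
).fetchall()
print(rows)
```

The fact table sits at the center and each dimension hangs off it directly, which is what gives the star model its name; a snowflake model would further split a dimension such as `dim_product` into additional tables.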
Why do you need to construct a separate
Data Warehouse?
1. A major reason for such a separation is to help promote the
high performance of both systems.
– An operational database is designed and tuned for known tasks and
workloads.
– Data warehouse queries are often complex, involve computation over
large volumes of data, and need special data organization. Processing
these queries on operational databases would substantially degrade the
performance of operational tasks.
2. OLTP supports concurrent processing of multiple transactions.
Locking and logging are required to ensure consistency and
robustness of transactions.
• OLAP queries often need only read-only access to data records, so
no rollback is required. OLAP requires historical data, whereas OLTP
systems do not typically maintain historical data.
