INTRODUCTION TO DATA WAREHOUSING
BY QUONTRA SOLUTIONS
IT COURSES ONLINE TRAINING WITH PLACEMENT SUPPORT
PHONE : +44 (0)20 3734 1498 / 99
EMAIL: INFO@QUONTRASOLUTIONS.CO.UK
WEB: WWW.QUONTRASOLUTIONS.CO.UK
DATA WAREHOUSE
• Maintain historic data
• Analysis to get a better understanding of the business
• Better decision making
• Definition: A data warehouse is a
- subject-oriented
- integrated
- time-variant
- non-volatile
collection of data that is used primarily in organizational decision making.
-- Bill Inmon, Building the Data Warehouse, 1996
SUBJECT ORIENTED
• A data warehouse is organized around subjects such as sales, product, and customer.
• It focuses on modeling and analysis of data for decision makers.
• It excludes data that is not useful in the decision support process.
INTEGRATED
• A data warehouse is constructed by integrating multiple heterogeneous sources.
• Data preprocessing is applied to ensure consistency.
[Figure: RDBMS, legacy system, and flat file sources pass through data processing and data transformation into the data warehouse]
NON-VOLATILE
• Once recorded, data is generally not updated.
• A data warehouse requires only two data-access operations:
- Incremental loading of data
- Access of data
TIME VARIANT
• Provides information from a historical perspective, e.g. the past 5-10 years
• Every key structure contains, either implicitly or explicitly, an element of time
WHY DATA WAREHOUSE?
Problem Statement:
• ABC Pvt Ltd is a company with branches in the USA, UK, Canada, and India.
• The Sales Manager wants a quarterly sales report across the branches.
• Each branch has a separate operational system where sales transactions are recorded.
WHY DATA WAREHOUSE?
[Figure: the Sales Manager collects figures separately from the USA, UK, Canada, and India systems]
• Get the quarterly sales figure for each branch and manually calculate the sales figure across branches.
• What if he needs a daily sales report across the branches?
WHY DATA WAREHOUSE?
Solution:
• Extract sales information from each database.
• Store the information in a common repository at a single site.
WHY DATA WAREHOUSE?
[Figure: the USA, UK, Canada, and India systems feed the data warehouse, which the Sales Manager reaches through query and analysis tools]
CHARACTERISTICS OF DATA WAREHOUSE
• Relational / multidimensional database
• Query and analysis rather than transactions
• Historical data from transactions
• Consolidates multiple data sources
• Separates query load from transactions
• Mostly non-volatile
• Large amounts of data, on the order of terabytes
WHEN WE SAY LARGE - WE MEAN IT!
• Terabytes (10^12 bytes): Yahoo!, 300 terabytes and growing
• Petabytes (10^15 bytes): Geographic Information Systems
• Exabytes (10^18 bytes): National Medical Records
• Zettabytes (10^21 bytes): Weather images
• Yottabytes (10^24 bytes): Intelligence Agency Videos
OLTP VS DATA WAREHOUSE (OLAP)
                              OLTP         Data Warehouse (OLAP)
Indexes                       Few          Many
Data                          Normalized   Generally de-normalized
Joins                         Many         Some
Derived data and aggregates   Rare         Common
DATA WAREHOUSE ARCHITECTURE
[Figure: operational systems and flat files feed ETL (Extract, Transform and Load), which loads the data warehouse; the warehouse feeds the Inventory, Sales, and Generic data marts, which serve data mining, analysis, and reporting]
ETL
ETL stands for Extract, Transform and Load
• Data is distributed across different sources
– Flat files, Streaming Data, DB Systems, XML, JSON
• Data can be in different formats
– CSV, Key Value Pairs
• Different units and representations
– Country: IN or India
– Date: 20 Nov 2010 or 20101120
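The differing representations above can be reconciled during transformation. The following is a minimal sketch, assuming only the two country spellings and two date layouts mentioned on the slide; the lookup table, the function names, and the choice of ISO dates as the target are illustrative assumptions, not part of the original deck.

```python
from datetime import datetime

# Hypothetical lookup table: maps the country spellings seen in the
# sources to a single code. Extend it for real data.
COUNTRY_CODES = {"in": "IN", "india": "IN"}

# Date layouts the sources are assumed to use, tried in order.
# Note: %b (month abbreviation) depends on the process locale.
DATE_FORMATS = ("%d %b %Y", "%Y%m%d")


def normalize_country(value: str) -> str:
    """Return one canonical code for any known spelling of a country."""
    return COUNTRY_CODES[value.strip().lower()]


def normalize_date(value: str) -> str:
    """Parse a date in any known layout and return it as YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")
```

Unknown spellings raise an error instead of passing through silently, so inconsistent values surface in the staging area and not in reports.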
ETL FUNCTIONS
• Extract
– Collect data from different sources
– Parse data
– Remove unwanted data
• Transform
– Project
– Generate surrogate keys
– Encode data
– Join data from different sources
– Aggregate
• Load
ETL STEPS
• The first step in the ETL process is mapping the data between the source systems and the target database.
• The second step is cleansing the source data in the staging area.
• The third step is transforming the cleansed source data.
• The fourth step is loading the data into the target system.
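The four steps above can be sketched end to end with the Python standard library. This is an illustration under stated assumptions: the two CSV extracts, their column names, the `sales` target table, and all function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical extracts from two branch systems with different column names.
USA_CSV = "txn_id,amount_usd\n1,100.50\n2,\n3,49.50\n"
UK_CSV = "id,total\n7, 20.00 \n8,30.00\n"

# Step 1, mapping: source column -> target column, per source.
MAPPINGS = {
    "USA": {"txn_id": "sale_id", "amount_usd": "amount"},
    "UK": {"id": "sale_id", "total": "amount"},
}


def extract(text, mapping):
    """Parse CSV text and rename columns to the target names."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {target: row[source] for source, target in mapping.items()}


def cleanse(rows):
    """Step 2: trim whitespace, convert types, drop rows with no amount."""
    for row in rows:
        amount = row["amount"].strip()
        if amount:
            yield {"sale_id": int(row["sale_id"]), "amount": float(amount)}


def transform(rows, branch):
    """Step 3: attach the branch so rows from all sources can be combined."""
    return [(branch, row["sale_id"], row["amount"]) for row in rows]


def load(conn, records):
    """Step 4: insert the transformed rows into the target table."""
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (branch TEXT, sale_id INTEGER, amount REAL)")
for branch, text in (("USA", USA_CSV), ("UK", UK_CSV)):
    load(conn, transform(cleanse(extract(text, MAPPINGS[branch])), branch))

# Totals are per branch only; no currency conversion is attempted here.
totals = dict(conn.execute("SELECT branch, SUM(amount) FROM sales GROUP BY branch"))
```

Once every source has been mapped to the same target columns, the cross-branch report becomes a single query against one table.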
[Figure: sample data before and after ETL processing]
ETL GLOSSARY
Mapping:
Defining the relationship between source and target objects.
Cleansing:
The process of resolving inconsistencies in source data.
Transformation:
The process of manipulating data. Any manipulation beyond
copying is a transformation. Examples include aggregating data and
integrating data from multiple sources.
Staging Area:
A place where data is processed before entering the warehouse.
DIMENSION
• Categorizes the data. For example: time, location, etc.
• A dimension can have one or more attributes. For example, day, week and month are attributes of the time dimension.
• Role of dimensions in data warehousing:
- Slice and dice
- Filter by dimensions
TYPES OF DIMENSIONS
• Conformed Dimension - A dimension that is shared across fact tables.
• Junk Dimension - A junk dimension is a convenient grouping of flags
and indicators. For example, payment method, shipping method.
• Degenerate Dimension - A dimension key that has no attributes and
hence does not have its own dimension table. For example,
transaction number, invoice number. Values of these dimensions are
mostly unique within a fact table.
• Role Playing Dimension - A role playing dimension is a
dimension that plays different roles in fact tables depending on the
context. For example, the Date dimension can be used for the order
date, shipment date, and invoice date.
• Slowly Changing Dimensions - Dimensions that have data that
changes slowly, rather than changing on a time-based, regular
schedule.
TYPES OF SLOWLY CHANGING DIMENSION
• Type 1 - The Type 1 method overwrites old data with new data, and
therefore does not track historical data at all.
• Type 2 - The Type 2 method tracks historical data by creating multiple records
for a given value in the dimension table, each with a separate surrogate key.
• Type 3 - The Type 3 method tracks changes using separate columns. Whereas
Type 2 preserves unlimited history, Type 3 preserves limited history,
because it is bounded by the number of columns designated for storing
historical data.
• Type 4 - The Type 4 method is usually described as using "history tables":
one table keeps the current data, and an additional table keeps
a record of all changes.
Types 1, 2 and 3 are commonly used.
Some books also discuss Type 0 and Type 6.
http://en.wikipedia.org/wiki/Slowly_changing_dimension
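To make the Type 2 method concrete, here is a minimal sketch of one change being applied to a customer dimension held as a list of rows. The column names (`valid_from`, `valid_to`, `is_current`) follow a common convention but are assumptions here, as are the sample customer and the function name.

```python
from datetime import date

# Hypothetical customer dimension: each row is one version of a customer.
# surrogate_key identifies the row; customer_id is the business key.
dimension = [
    {"surrogate_key": 1, "customer_id": "C100", "city": "London",
     "valid_from": date(2010, 1, 1), "valid_to": None, "is_current": True},
]


def apply_type2_change(rows, customer_id, new_city, change_date):
    """Close the current version of a customer and append a new one."""
    current = next(
        r for r in rows if r["customer_id"] == customer_id and r["is_current"]
    )
    if current["city"] == new_city:
        return current  # nothing changed, keep the existing version
    current["valid_to"] = change_date
    current["is_current"] = False
    new_row = {
        "surrogate_key": max(r["surrogate_key"] for r in rows) + 1,
        "customer_id": customer_id,
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    }
    rows.append(new_row)
    return new_row


apply_type2_change(dimension, "C100", "Leeds", date(2012, 6, 1))
```

Facts recorded before the change keep pointing at surrogate key 1 and later facts point at key 2, which is how the history survives. A Type 1 update would overwrite `city` in place instead.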
FACTS
• Facts are values that can be examined and analyzed.
• For example: Page Views, Unique Users, Pieces Sold, Profit.
• Fact and measure are synonymous.
• Types of facts:
– Additive - Measures that can be added across all dimensions.
– Non Additive - Measures that cannot be added across any dimension.
– Semi Additive - Measures that can be added across some dimensions but not others.
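The difference between additive and semi-additive measures shows up as soon as the data is aggregated. The sketch below uses hypothetical daily snapshot rows in which units sold is additive and account balance is semi-additive; the data and variable names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical daily snapshot rows: (day, account, units_sold, balance).
rows = [
    ("2010-11-01", "A", 5, 100),
    ("2010-11-01", "B", 3, 200),
    ("2010-11-02", "A", 2, 120),
    ("2010-11-02", "B", 4, 190),
]

# Additive: units_sold can be summed across every dimension.
total_units = sum(r[2] for r in rows)

# Semi-additive: balance can be summed across accounts for one day...
balance_by_day = defaultdict(int)
for day, _account, _units, balance in rows:
    balance_by_day[day] += balance

# ...but not across days. Use the latest snapshot instead of a sum.
latest_day = max(balance_by_day)
closing_balance = balance_by_day[latest_day]
```

Summing the balance over both days would give 610, a figure that describes nothing real. A ratio such as profit margin is a typical non-additive measure and has to be recomputed from its additive parts.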
HOW TO STORE DATA?
Facts and Dimensions:
1. Select the business process to model
2. Declare the grain of the business process
3. Choose the dimensions that apply to each fact table row
4. Identify the numeric facts that will populate each fact table
row
DIMENSION TABLE
• Contains attributes of dimensions, e.g. Month is an attribute of the Time dimension.
• Can also have foreign keys to another dimension table
• Usually identified by a unique integer primary key called a surrogate key
FACT TABLE
• Contains facts
• Foreign keys to dimension tables
• Primary key: usually a composite key of all the foreign keys
TYPES OF SCHEMA USED IN DATA WAREHOUSE
• Star Schema
• Snowflake Schema
• Fact Constellation Schema
STAR SCHEMA
• Multi-dimensional data
• Dimension and fact tables
• A fact table with pointers to dimension tables
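A small star schema can be sketched in SQLite to show a fact table pointing at its dimension tables. Every table name, column name, and data value below is a hypothetical example; the fact table's primary key is the composite of its foreign keys, as described on the fact table slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_region (region_key INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    region_key INTEGER REFERENCES dim_region(region_key),
    amount REAL,
    PRIMARY KEY (time_key, product_key, region_key)
);
INSERT INTO dim_time VALUES (1, 2010, 'Q3'), (2, 2010, 'Q4');
INSERT INTO dim_product VALUES (1, 'Toaster', 'Kitchen Appliances');
INSERT INTO dim_region VALUES (1, 'UK'), (2, 'India');
INSERT INTO fact_sales VALUES (1, 1, 1, 100.0), (2, 1, 1, 150.0), (2, 1, 2, 80.0);
""")

# Quarterly sales per country: join the fact table to two dimensions.
query = """
SELECT t.quarter, r.country, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_time AS t ON t.time_key = f.time_key
JOIN dim_region AS r ON r.region_key = f.region_key
GROUP BY t.quarter, r.country
ORDER BY t.quarter, r.country
"""
report = conn.execute(query).fetchall()
```

In a snowflake variant, `category` would move out of `dim_product` into its own table, which adds one more join to any query that groups by category.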
SNOWFLAKE SCHEMA
• An extension of the star schema in which the dimension tables are partly or fully normalized.
• Dimension table hierarchies are broken down into simpler tables.
FACT CONSTELLATION SCHEMA
• A fact constellation schema allows dimension tables to be
shared between fact tables.
• This schema is used mainly for aggregate fact tables,
or where we want to split a fact table for better
comprehension.
- For example, separate fact tables for daily, weekly and
monthly reporting requirements.
• Example: the dimension tables for time, item, and location are
shared between both the sales and shipping fact tables.
OPERATIONS ON DATA WAREHOUSE
• Drill Down
• Roll Up
• Slice & Dice
• Pivoting
DRILL DOWN
[Figure: cube with Time and Product axes, drilling down the Product hierarchy]
Category (e.g. Home Appliances) → Sub Category (e.g. Kitchen Appliances) → Product (e.g. Toaster)
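ROLL UP
[Figure: calendar and fiscal time hierarchies rolled up from Day]
Calendar: Day → Month → Quarter → Year
Fiscal: Day → Fiscal Week → Fiscal Month → Fiscal Quarter → Fiscal Year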
SLICE & DICE
[Figure: cube with Time and Product axes; slicing on Product = Toaster leaves the Time axis]
PIVOTING
• Also called rotation
• Rotate on an axis
• Interchange rows and columns
[Figure: pivoted views with axes Time, Product, and Region]
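The four operations (roll up, drill down, slice and dice, pivoting) can be illustrated on a tiny in-memory cube. This is only a sketch with invented data and helper names; real OLAP tools run these operations against the warehouse, not against Python lists.

```python
from collections import defaultdict

# Hypothetical cells of a small cube: (month, quarter, product, region, sales).
cells = [
    ("Oct", "Q4", "Toaster", "UK", 10),
    ("Nov", "Q4", "Toaster", "UK", 20),
    ("Nov", "Q4", "Kettle", "UK", 5),
    ("Nov", "Q4", "Toaster", "India", 7),
]


def aggregate(rows, key):
    """Sum sales grouped by whatever key(row) returns."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[4]
    return dict(totals)


# Roll up: summarize months into quarters.
by_quarter = aggregate(cells, lambda r: r[1])
# Drill down: break the quarter back into months.
by_month = aggregate(cells, lambda r: r[0])
# Slice: fix one dimension to a single value (Product = Toaster).
toaster = [r for r in cells if r[2] == "Toaster"]
# Dice: restrict several dimensions at once.
toaster_uk_nov = [r for r in toaster if r[3] == "UK" and r[0] == "Nov"]
# Pivot: the same totals keyed product-by-region, then region-by-product.
product_by_region = aggregate(cells, lambda r: (r[2], r[3]))
region_by_product = {(reg, prod): v for (prod, reg), v in product_by_region.items()}
```

Pivoting changes only how the totals are keyed and laid out. The values themselves are identical in both views.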
ADVANTAGES OF DATA WAREHOUSE
• One consistent data store for reporting, forecasting, and
analysis
• Easier and timely access to data
• Scalability
• Trend analysis and detection
• Drill down analysis
DISADVANTAGES OF DATA WAREHOUSE
• Preparation may be time consuming.
• High associated cost
CASE STUDY: WHY DATA WAREHOUSE
• G2G Courier Pvt. Ltd. is an established brand in the courier
industry. It has its own network in the main cities and
subcontracts to various partners in rural areas across the
country.
• The President of the company wants to look deep into the
financial health of the company and different performance
aspects.
CHALLENGES
• Apart from G2G's own transaction system, each partner has
its own system, which makes the data very heterogeneous.
• The granularity of data also differs between systems, e.g.
minute accuracy in one and day accuracy in another.
• To analyze metrics like Revenue and Timely Delivery
across various geographical locations and partners, we need
a unified system.
DATA WAREHOUSE MODEL
[Figure: Sales Fact table linked to Region, Product, Product Category, and Time dimensions]
THANK YOU

BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
Nguyen Thanh Tu Collection
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
Wahiba Chair Training & Consulting
 
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
GeorgeMilliken2
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
TechSoup
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 

Recently uploaded (20)

คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
Solutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptxSolutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptx
 
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
 
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) CurriculumPhilippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
 
ZK on Polkadot zero knowledge proofs - sub0.pptx
ZK on Polkadot zero knowledge proofs - sub0.pptxZK on Polkadot zero knowledge proofs - sub0.pptx
ZK on Polkadot zero knowledge proofs - sub0.pptx
 
Constructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective CommunicationConstructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective Communication
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
 
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
 
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 

Data Warehouse Introduction by Quontra Solutions

  • 1. INTRODUCTION TO DATA WAREHOUSING BY QUONTRA SOLUTIONS IT COURSES ONLINE TRAINING WITH PLACEMENT SUPPORT PHONE : +44 (0)20 3734 1498 / 99 EMAIL: INFO@QUONTRASOLUTIONS.CO.UK WEB: WWW.QUONTRASOLUTIONS.CO.UK
  • 2. DATA WAREHOUSE • Maintain historic data • Analysis to get a better understanding of the business • Better decision making • Definition: A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data that is used primarily in organizational decision making. -- Bill Inmon, Building the Data Warehouse, 1996
  • 3. SUBJECT ORIENTED • A data warehouse is organized around subjects such as sales, product, and customer. • It focuses on modeling and analysis of data for decision makers. • It excludes data that is not useful in the decision support process.
  • 4. INTEGRATED • A data warehouse is constructed by integrating multiple heterogeneous sources. • Data preprocessing is applied to ensure consistency. [Diagram: RDBMS, legacy system, and flat file sources feed the data warehouse through data processing and data transformation.]
  • 5. NON-VOLATILE • Data, once recorded, is mostly not updated. • A data warehouse requires two operations on data: incremental loading of data and access of data. [Diagram: load, access.]
  • 6. TIME VARIANT • Provides information from a historical perspective, e.g. the past 5-10 years. • Every key structure contains, either implicitly or explicitly, an element of time.
  • 7. WHY DATA WAREHOUSE? Problem Statement: • ABC Pvt Ltd is a company with branches in the USA, UK, Canada, and India. • The Sales Manager wants a quarterly sales report across the branches. • Each branch has a separate operational system where sales transactions are recorded.
  • 8. WHY DATA WAREHOUSE? [Diagram: the Sales Manager gets the quarterly sales figure from each branch (USA, UK, Canada, India) and manually calculates the sales figure across branches.] What if he needs a daily sales report across the branches?
  • 9. WHY DATA WAREHOUSE? Solution: • Extract sales information from each database. • Store the information in a common repository at a single site.
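
The solution on slide 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the branch figures, the tuple layout, and the helper names `consolidate` and `quarterly_totals` are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical extracts from each branch's operational system:
# (branch, quarter, sales amount).
branch_extracts = {
    "USA": [("USA", "Q1", 120.0), ("USA", "Q2", 150.0)],
    "UK": [("UK", "Q1", 80.0), ("UK", "Q2", 95.0)],
    "CANADA": [("CANADA", "Q1", 60.0)],
    "INDIA": [("INDIA", "Q1", 110.0), ("INDIA", "Q2", 130.0)],
}


def consolidate(extracts):
    """Copy every branch's rows into one common repository (a plain list)."""
    repository = []
    for rows in extracts.values():
        repository.extend(rows)
    return repository


def quarterly_totals(repository):
    """Sum sales per quarter across all branches."""
    totals = defaultdict(float)
    for _branch, quarter, amount in repository:
        totals[quarter] += amount
    return dict(totals)


repository = consolidate(branch_extracts)
print(quarterly_totals(repository))
```

Once the rows sit in one place, a daily report would only need a finer-grained key than the quarter; no per-branch manual calculation is involved.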
  • 11. CHARACTERISTICS OF DATA WAREHOUSE • Relational / multidimensional database • Query and analysis rather than transaction processing • Historical data from transactions • Consolidates multiple data sources • Separates query load from transactions • Mostly non-volatile • Large amounts of data, on the order of terabytes
  • 12. WHEN WE SAY LARGE - WE MEAN IT! • Terabytes -- 10^12 bytes: Yahoo! (300 terabytes and growing) • Petabytes -- 10^15 bytes: Geographic Information Systems • Exabytes -- 10^18 bytes: National Medical Records • Zettabytes -- 10^21 bytes: Weather images • Yottabytes -- 10^24 bytes: Intelligence Agency Videos
  • 13. OLTP VS DATA WAREHOUSE (OLAP)

      Property                      OLTP         Data Warehouse (OLAP)
      Indexes                       Few          Many
      Data                          Normalized   Generally de-normalized
      Joins                         Many         Some
      Derived data and aggregates   Rare         Common

  • 14. DATA WAREHOUSE ARCHITECTURE [Diagram: operational systems and flat files are loaded through ETL (Extract, Transform and Load) into the data warehouse, which supplies the inventory, generic, and sales data marts; these support data mining, analysis, and reporting.]
  • 15. ETL ETL stands for Extract, Transform and Load. • Data is distributed across different sources: flat files, streaming data, DB systems, XML, JSON • Data can be in different formats: CSV, key-value pairs • Different units and representations, e.g. Country: IN or India; Date: 20 Nov 2010 or 20101120
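
To make the "different units and representations" point concrete, here is a minimal Python sketch that maps two spellings of a country and two layouts of a date onto a single form. The lookup table, the list of date formats, and the function names are assumptions chosen for illustration; a real ETL tool would drive these from configuration.

```python
from datetime import date, datetime

# Assumed lookup from country spellings seen in the sources to one standard code.
COUNTRY_CODES = {"in": "IN", "india": "IN"}

# Assumed date layouts used by the sources: "20 Nov 2010" and "20101120".
DATE_FORMATS = ("%d %b %Y", "%Y%m%d")


def normalize_country(value: str) -> str:
    """Return the standard code for a country spelling, or fail loudly."""
    key = value.strip().lower()
    if key not in COUNTRY_CODES:
        raise ValueError(f"unknown country: {value!r}")
    return COUNTRY_CODES[key]


def normalize_date(value: str) -> date:
    """Try each known layout in turn and return a date object."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


print(normalize_country("India"), normalize_date("20 Nov 2010"))
```

Raising an error on unknown values, rather than passing them through, is a design choice of this sketch: it keeps inconsistent values out of the warehouse until a mapping is added.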
  • 16. ETL FUNCTIONS • Extract: collect data from different sources; parse data; remove unwanted data • Transform: project; generate surrogate keys; encode data; join data from different sources; aggregate • Load
  • 17. ETL STEPS • The first step in the ETL process is mapping the data between the source systems and the target database. • The second step is cleansing the source data in the staging area. • The third step is transforming the cleansed source data. • The fourth step is loading the data into the target system. [Figures: data before ETL processing; data after ETL processing.]
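
The four steps above can be sketched as a tiny in-memory pipeline. Everything concrete here is hypothetical: the source field names in `FIELD_MAP`, the sample rows, the rule that rows without an amount are dropped, and the list that stands in for the target table.

```python
# Step 1: mapping from (hypothetical) source field names to target columns.
FIELD_MAP = {"cust_nm": "customer", "amt": "amount", "ctry": "country"}

source_rows = [
    {"cust_nm": " Alice ", "amt": "100.50", "ctry": "india"},
    {"cust_nm": "Bob", "amt": "", "ctry": "IN"},  # missing amount
    {"cust_nm": "Alice", "amt": "49.50", "ctry": "IN"},
]


def apply_mapping(row):
    """Rename source fields to the target column names."""
    return {target: row[source] for source, target in FIELD_MAP.items()}


def cleanse(rows):
    """Step 2: resolve inconsistencies in the staging area."""
    staged = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumed rule: drop rows that have no amount
        country = row["country"].strip()
        if country.lower() in ("in", "india"):
            country = "IN"
        staged.append({
            "customer": row["customer"].strip(),
            "amount": float(row["amount"]),
            "country": country,
        })
    return staged


def transform(rows):
    """Step 3: aggregate the amount per customer and country."""
    totals = {}
    for row in rows:
        key = (row["customer"], row["country"])
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return totals


def load(totals, target):
    """Step 4: append the result to the target (a list standing in for a table)."""
    for (customer, country), amount in sorted(totals.items()):
        target.append({"customer": customer, "country": country, "total_amount": amount})


warehouse_table = []
mapped = [apply_mapping(r) for r in source_rows]
load(transform(cleanse(mapped)), warehouse_table)
print(warehouse_table)
```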
  • 18. ETL GLOSSARY • Mapping: defining the relationship between source and target objects. • Cleansing: the process of resolving inconsistencies in source data. • Transformation: the process of manipulating data; any manipulation beyond copying is a transformation. Examples include aggregating data and integrating data from multiple sources. • Staging area: a place where data is processed before entering the warehouse.
  • 19. DIMENSION • Categorizes the data, for example by time or location. • A dimension can have one or more attributes. For example, day, week, and month are attributes of the time dimension. • Role of dimensions in data warehousing: slice and dice; filter by dimensions.
  • 20. TYPES OF DIMENSIONS • Conformed Dimension - A dimension that is shared across fact tables. • Junk Dimension - A convenient grouping of flags and indicators, for example payment method and shipping method. • Degenerate Dimension - A dimension key that has no attributes and hence does not have its own dimension table, for example transaction number or invoice number. Values of these dimensions are mostly unique within a fact table. • Role-Playing Dimension - A dimension that plays different roles in fact tables depending on the context. For example, the Date dimension can be used for the order date, shipment date, and invoice date. • Slowly Changing Dimension - A dimension whose data changes slowly, rather than on a regular, time-based schedule.
  • 21. TYPES OF SLOWLY CHANGING DIMENSION • Type 1 - The Type 1 method overwrites old data with new data, and therefore does not track historical data at all. • Type 2 - The Type 2 method tracks historical data by creating multiple records for a given natural key in the dimension table, each with a separate surrogate key. • Type 3 - The Type 3 method tracks changes using separate columns. Whereas Type 2 preserves unlimited history, Type 3 preserves limited history, because it is limited to the number of columns designated for storing historical data. • Type 4 - The Type 4 method is usually referred to as using "history tables": one table keeps the current data, and an additional table keeps a record of all changes. Types 1, 2, and 3 are commonly used. Some books also discuss Type 0 and Type 6. http://en.wikipedia.org/wiki/Slowly_changing_dimension
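
The difference between Type 1 and Type 2 is easiest to see side by side. The sketch below is a simplified illustration, not a reference implementation: the customer row, the column names (`customer_sk`, `valid_from`, `valid_to`, `is_current`), and the helper functions are assumptions; the validity columns are a common convention rather than something the slides prescribe.

```python
from datetime import date


def new_dimension():
    """A one-row customer dimension used as the starting point for both methods."""
    return [{
        "customer_sk": 1,          # surrogate key
        "customer_id": "C100",     # natural key from the source system
        "city": "London",
        "valid_from": date(2010, 1, 1),
        "valid_to": None,
        "is_current": True,
    }]


def apply_type1(dimension, natural_key, new_city):
    """Type 1: overwrite in place; the old value is lost."""
    for row in dimension:
        if row["customer_id"] == natural_key:
            row["city"] = new_city


def apply_type2(dimension, natural_key, new_city, change_date):
    """Type 2: close the current row and add a new row with a new surrogate key."""
    for row in dimension:
        if row["customer_id"] == natural_key and row["is_current"]:
            row["is_current"] = False
            row["valid_to"] = change_date
    dimension.append({
        "customer_sk": max(r["customer_sk"] for r in dimension) + 1,
        "customer_id": natural_key,
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })


type1_dim = new_dimension()
apply_type1(type1_dim, "C100", "Leeds")

type2_dim = new_dimension()
apply_type2(type2_dim, "C100", "Leeds", date(2012, 6, 1))

print(len(type1_dim), len(type2_dim))
```

After the change, the Type 1 table still has one row and no trace of "London", while the Type 2 table has two rows for the same natural key, so facts recorded before the move can still be joined to the old city.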
  • 22. FACTS • Facts are values that can be examined and analyzed, for example page views, unique users, pieces sold, profit. • Fact and measure are synonymous. • Types of facts: Additive - measures that can be added across all dimensions. Non-additive - measures that cannot be added across any dimension. Semi-additive - measures that can be added across some dimensions but not across others.
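
A small worked example may help separate additive from semi-additive measures. The account snapshot below is hypothetical (it is not taken from the slides): deposits are treated as an additive measure and the closing balance as a semi-additive one, which is a commonly used textbook illustration.

```python
# Hypothetical daily snapshot rows: (day, account, closing balance, deposits that day).
snapshot_rows = [
    ("2010-11-01", "A", 100.0, 100.0),
    ("2010-11-01", "B", 50.0, 50.0),
    ("2010-11-02", "A", 130.0, 30.0),
    ("2010-11-02", "B", 50.0, 0.0),
]

# Additive: deposits can be summed across every dimension (days and accounts).
total_deposits = sum(dep for _day, _acct, _bal, dep in snapshot_rows)

# Semi-additive: balances can be summed across accounts for a single day...
balance_on_day2 = sum(bal for day, _acct, bal, _dep in snapshot_rows if day == "2010-11-02")

# ...but summing one account's balance across days gives a number with no meaning:
# account A never held this amount.
misleading_sum = sum(bal for _day, acct, bal, _dep in snapshot_rows if acct == "A")

print(total_deposits, balance_on_day2, misleading_sum)
```

For a semi-additive measure, the time dimension is usually handled with a different aggregate (for example the last or average balance) instead of a sum.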
  • 23. HOW TO STORE DATA? Facts and Dimensions: 1. Select the business process to model 2. Declare the grain of the business process 3. Choose the dimensions that apply to each fact table row 4. Identify the numeric facts that will populate each fact table row
  • 24. DIMENSION TABLE • Contains attributes of dimensions, e.g. Month is an attribute of the Time dimension. • Can also have foreign keys to another dimension table. • Usually identified by a unique integer primary key called a surrogate key.
  • 25. FACT TABLE • Contains facts • Foreign keys to dimension tables • Primary key: usually a composite key of all FKs
  • 26. TYPES OF SCHEMA USED IN DATA WAREHOUSE • Star Schema • Snowflake Schema • Fact Constellation Schema
  • 27. STAR SCHEMA • Multi-dimensional data • Dimension and fact tables • A fact table with pointers to dimension tables
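
A minimal star schema can be tried out with Python's built-in `sqlite3` module. The table names, columns, and sample rows below are invented for illustration; the sketch only shows the shape described on the slides: dimension tables with integer surrogate keys, and a fact table whose primary key is the composite of its foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (
    time_key INTEGER PRIMARY KEY,      -- surrogate key
    day TEXT, month TEXT, year INTEGER
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,   -- surrogate key
    product_name TEXT, category TEXT
);
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    pieces_sold INTEGER,               -- the fact (measure)
    PRIMARY KEY (time_key, product_key)  -- composite key of the FKs
);
INSERT INTO dim_time VALUES (1, '2010-11-20', 'Nov', 2010);
INSERT INTO dim_time VALUES (2, '2010-12-05', 'Dec', 2010);
INSERT INTO dim_product VALUES (1, 'Toaster', 'Kitchen Appliances');
INSERT INTO dim_product VALUES (2, 'Kettle', 'Kitchen Appliances');
INSERT INTO fact_sales VALUES (1, 1, 3);
INSERT INTO fact_sales VALUES (1, 2, 5);
INSERT INTO fact_sales VALUES (2, 1, 4);
""")

# A typical star-schema query: join the fact table to its dimensions,
# then group by dimension attributes.
star_rows = conn.execute("""
    SELECT t.month, p.product_name, SUM(f.pieces_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_key = f.time_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY t.month, p.product_name
    ORDER BY t.month, p.product_name
""").fetchall()
print(star_rows)
```

A snowflake variant of the same sketch would move `category` out of `dim_product` into its own table and leave only a foreign key behind.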
  • 29. SNOWFLAKE SCHEMA • An extension of the star schema in which the dimension tables are partly or fully normalized. • Dimension table hierarchies are broken down into simpler tables.
  • 31. FACT CONSTELLATION SCHEMA • A fact constellation schema allows dimension tables to be shared between fact tables. • This schema is used mainly for aggregate fact tables, or where we want to split a fact table for better comprehension, for example separate fact tables for daily, weekly, and monthly reporting requirements.
  • 32. FACT CONSTELLATION SCHEMA In this example, the dimension tables for time, item, and location are shared between both the sales and shipping fact tables.
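  • 33. OPERATIONS ON DATA WAREHOUSE • Drill Down • Roll Up • Slice & Dice • Pivoting
  • 34. DRILL DOWN [Diagram: Time and Product axes; the Product hierarchy drills down from Category (e.g. Home Appliances) to Sub Category (e.g. Kitchen Appliances) to Product (e.g. Toaster).]
  • 35. ROLL UP [Diagram: time hierarchy levels rolling up from Day: calendar levels Month, Quarter, Year; fiscal levels Fiscal Week, Fiscal Month, Fiscal Quarter, Fiscal Year.]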
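
Roll up along the calendar hierarchy can be sketched as repeated aggregation at a coarser level. The daily figures and the `roll_up` helper are hypothetical; the sketch relies on ISO date strings so that truncating the key yields the month or the year.

```python
from collections import defaultdict

# Hypothetical daily sales: ISO date string -> amount.
daily_sales = {
    "2010-10-30": 10, "2010-10-31": 20,
    "2010-11-01": 5, "2010-11-20": 15,
    "2011-01-02": 7,
}


def roll_up(sales, level):
    """Aggregate daily figures to 'month' or 'year' by truncating the date key."""
    width = {"month": 7, "year": 4}[level]  # "YYYY-MM" or "YYYY"
    totals = defaultdict(int)
    for day, amount in sales.items():
        totals[day[:width]] += amount
    return dict(totals)


by_month = roll_up(daily_sales, "month")
by_year = roll_up(daily_sales, "year")
print(by_month, by_year)
```

Drill down is the reverse direction: starting from `by_year` and going back to `by_month` or `daily_sales` requires the finer-grained data to have been kept, which is why the grain of the fact table matters.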
  • 37. PIVOTING • Also called rotation • Rotate on an axis • Interchange rows and columns [Diagram axes: Time, Product, Region.]
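
Interchanging rows and columns amounts to a transpose of the two-dimensional view. In this sketch the product and region names and the figures are made up; `zip(*rows)` is the standard Python idiom for transposing a list of rows.

```python
# Hypothetical slice of a sales cube: rows are products, columns are regions.
products = ["Toaster", "Kettle"]
regions = ["North", "South", "East"]
by_product = [
    [3, 5, 2],  # Toaster
    [4, 1, 6],  # Kettle
]

# Pivot: interchange rows and columns so that each row is now a region.
by_region = [list(column) for column in zip(*by_product)]

for region, values in zip(regions, by_region):
    print(region, dict(zip(products, values)))
```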
  • 38. ADVANTAGES OF DATA WAREHOUSE • One consistent data store for reporting, forecasting, and analysis • Easier and timely access to data • Scalability • Trend analysis and detection • Drill down analysis
  • 39. DISADVANTAGES OF DATA WAREHOUSE • Preparation may be time consuming. • High associated cost
  • 40. CASE STUDY: WHY DATA WAREHOUSE • G2G Courier Pvt. Ltd. is an established brand in the courier industry. It has its own network in the main cities and subcontracts to various partners in rural areas across the country. • The President of the company wants to look deep into the financial health of the company and different aspects of its performance.
  • 41. CHALLENGES • Apart from G2G's own transaction system, each partner has its own system, which makes the data very heterogeneous. • The granularity of data also differs between systems, for example minute accuracy versus day accuracy. • To analyze metrics such as revenue and timely delivery across geographical locations and partners, we need a unified system.
  • 42. DATA WAREHOUSE MODEL [Diagram: a Sales Fact table linked to the Region, Product, Product Category, and Time dimensions.]