Data Warehouse Architecture
Dr. G.Jasmine Beulah
Dept. Computer Science,
Kristu Jayanti College, Bengaluru.
Functions of Data Warehouse Tools and Utilities
The following are the functions of data warehouse tools and utilities −
• Data Extraction − Involves gathering data from multiple
heterogeneous sources.
• Data Cleaning − Involves finding and correcting the errors in data.
• Data Transformation − Involves converting the data from legacy
format to warehouse format.
• Data Loading − Involves sorting, summarizing, consolidating,
checking integrity, and building indices and partitions.
• Refreshing − Involves propagating updates from the data sources to the warehouse.
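The functions listed above can be sketched as a small pipeline. This is only an illustrative sketch, not a real ETL tool: the two sources (crm_rows, erp_rows), their contents, and all function names are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical records from two heterogeneous sources (illustrative data only).
crm_rows = [{"item": "Pen ", "amount": "10.5"}, {"item": "book", "amount": "n/a"}]
erp_rows = [("PEN", 4.5), ("Book", 20.0)]

def extract():
    """Data extraction: gather rows from both sources into one common shape."""
    rows = [(r["item"], r["amount"]) for r in crm_rows]
    rows += [(item, amount) for item, amount in erp_rows]
    return rows

def clean(rows):
    """Data cleaning: drop rows whose amount cannot be parsed as a number."""
    cleaned = []
    for item, amount in rows:
        try:
            cleaned.append((item, float(amount)))
        except (TypeError, ValueError):
            continue
    return cleaned

def transform(rows):
    """Data transformation: normalise item names to one warehouse format."""
    return [(item.strip().lower(), amount) for item, amount in rows]

def load(rows, warehouse):
    """Data loading: consolidate and summarise amounts per item."""
    for item, amount in rows:
        warehouse[item] += amount
    return warehouse

warehouse = load(transform(clean(extract())), defaultdict(float))
print(dict(warehouse))
```

In this sketch, refreshing would amount to re-running the same pipeline whenever the sources change; real tools typically load only the changed rows.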
Terminologies
Metadata
Metadata is simply defined as data about data.
Data that are used to describe other data are known as metadata.
For example, the index of a book serves as metadata for the contents
of the book.
In terms of a data warehouse, we can define metadata as follows −
• Metadata is a road map to the data warehouse.
• Metadata in a data warehouse defines the warehouse objects.
• Metadata acts as a directory. This directory helps the decision support
system to locate the contents of a data warehouse.
Metadata Repository
The metadata repository is an integral part of a data warehouse system. It contains the following
metadata −
Business metadata − It contains data ownership information, business definitions, and
changing policies.
Operational metadata − It includes the currency of data and data lineage. Currency of data
refers to whether the data are active, archived, or purged. Lineage of data means the history of the data
migrated and the transformations applied to it.
Data for mapping from the operational environment to the data warehouse − This metadata
includes source databases and their contents, data extraction, data partitioning, cleaning,
transformation rules, and data refresh and purging rules.
The algorithms for summarization − It includes dimension algorithms, data on granularity,
aggregation, summarizing, etc.
Data Cube
A data cube helps us represent data in multiple dimensions. It is
defined by dimensions and facts.
The dimensions are the entities with respect to which an enterprise
keeps records.
Example
Suppose a company wants to keep track of sales records with the help of
a sales data warehouse with respect to time, item, branch, and location.
These dimensions allow the company to keep track of monthly sales and of
the branch at which the items were sold. There is a table associated with each
dimension, known as a dimension table. For example, the "item"
dimension table may have attributes such as item_name, item_type, and
item_brand.
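As a rough illustration of the example above, the cube can be modelled as fact rows keyed by the four dimensions and then summed over the dimensions that are not of interest. The sample rows, the units_sold measure, and the aggregate helper are assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical fact rows: (time, item, branch, location, units_sold).
sales = [
    ("2024-01", "pen", "B1", "Delhi", 10),
    ("2024-01", "book", "B1", "Delhi", 3),
    ("2024-01", "pen", "B2", "Mumbai", 7),
    ("2024-02", "pen", "B1", "Delhi", 5),
]

DIMENSIONS = ("time", "item", "branch", "location")

def aggregate(rows, keep):
    """Sum units_sold over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]  # the last field is the measure
    return dict(totals)

# Monthly sales per branch, as described in the example.
monthly_by_branch = aggregate(sales, keep=("time", "branch"))
print(monthly_by_branch)
```

Keeping a different subset of dimensions, for example ("item", "location"), gives another view of the same cube without changing the underlying rows.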
Data Cube
[Figure: 2-D view of sales data for a company with respect to time and item]
[Figure: 3-D view of the sales data with respect to time, item, and location]
Data Mart
• Data marts contain a subset of organization-wide data that is valuable
to specific groups of people in an organization.
• In other words, a data mart contains only the data that are specific to a
particular group.
• For example, the marketing data mart may contain only data related to
items, customers, and sales.
• Data marts are confined to subjects.
Graphical Representation of Data Mart
Multi-dimensional Data Model(MDM)
• A multidimensional model views data in the form of a data-cube.
• A data cube enables data to be modeled and viewed in multiple
dimensions.
• It is defined by dimensions and facts.
The dimensions are the perspectives or entities with respect to which an
organization keeps records. For example, a shop may create a sales data
warehouse to keep records of the store's sales for the dimensions time,
item, and location. These dimensions allow the shop to keep track of
things such as monthly sales of items and the locations at which
the items were sold. Each dimension has a table related to it, called a
dimension table, which describes the dimension further. For example,
a dimension table for item may contain the attributes item_name,
brand, and type.
Multi-dimensional Data Model(MDM)
• A multidimensional data model is organized around a central theme,
for example, sales. This theme is represented by a fact table. Facts are
numerical measures. The fact table contains the names of the facts, or
measures, as well as keys to each of the related dimension tables.
Multi-dimensional Data Model(MDM)
• In this 2D representation, the sales for Delhi are shown for the time
dimension (organized in quarters) and the item dimension (classified
according to the types of an item sold).
• The fact, or measure, displayed is rupee_sold (in thousands).
View the sales data with a third dimension
Multi-dimensional Data Model(MDM)
Same data in the form of a 3D data cube
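The relationship between the 3D cube and the 2D view can be sketched as a slice: fixing one dimension (here the location) leaves a two-dimensional table over the remaining ones. The numbers, the item types, and the slice_location helper below are made up for illustration; only the measure name rupee_sold comes from the slides.

```python
# Hypothetical 3-D cube: (location, quarter, item_type) -> rupee_sold in thousands.
cube = {
    ("Delhi", "Q1", "phone"): 100,
    ("Delhi", "Q1", "laptop"): 250,
    ("Delhi", "Q2", "phone"): 120,
    ("Mumbai", "Q1", "phone"): 90,
}

def slice_location(cube, location):
    """Fix the location dimension and return the 2-D (quarter x item_type) view."""
    view = {}
    for (loc, quarter, item_type), value in cube.items():
        if loc == location:
            view.setdefault(quarter, {})[item_type] = value
    return view

delhi_view = slice_location(cube, "Delhi")
print(delhi_view)
```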
Schemas for the Multidimensional Data Model:
Star Schema
Snowflake Schema
Fact Constellation Schema
Star Schema
• A star schema is the elementary form of a dimensional model, in
which data are organized into facts and dimensions.
• A fact is an event that is counted or measured, such as a sale or a login.
A dimension includes reference data about the fact, such as date, item,
or customer.
• A star schema is a relational schema whose
design represents a multidimensional data model.
• The star schema is the explicit data warehouse schema.
• It is known as a star schema because the entity-relationship diagram of
this schema resembles a star, with points diverging from a central table.
The center of the schema consists of a large fact table, and the points
of the star are the dimension tables.
Star Schema
Fact Tables
A fact table has two types of columns: foreign keys (pointing to the dimension tables) and
numeric values.
Dimension Tables
A dimension table is generally small compared to a fact table. The primary key of a dimension
table is a foreign key in the fact table.
Examples of dimension tables:
Time dimension table
Product dimension table
Employee dimension table
Geography dimension table
The main characteristics of the star schema are that it is easy to understand and that queries need to
join only a small number of tables.
Fact Tables
• A fact table is a table in a star schema that contains facts and is connected to
dimensions.
• A fact table has two types of columns: those that contain facts and those
that are foreign keys to the dimension tables.
• The primary key of a fact table is generally a composite key that is
made up of all of its foreign keys.
• A fact table might contain either detail-level facts or facts that have been
aggregated (fact tables that contain aggregated facts are often instead
called summary tables).
• A fact table generally contains facts with the same level of
aggregation.
Dimension Tables
• A dimension is a structure usually composed of one or more
hierarchies that categorize data.
• If a dimension has no hierarchies or levels, it is called a flat
dimension or list.
• The primary key of each dimension table is part of the
composite primary key of the fact table.
• Dimensional attributes help to describe the dimensional value. They are
generally descriptive, textual values.
• Dimension tables are usually smaller than fact tables.
• Fact tables store data about sales, while dimension tables store data about the
geographic region (markets, cities), clients, products, times, and channels.
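The fact and dimension tables described above can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions chosen for this example; the fact table's composite primary key is made up of its foreign keys, as the slides describe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: small, descriptive, one primary key each.
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT);

-- Fact table: foreign keys to every dimension plus a numeric measure.
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    item_key INTEGER REFERENCES dim_item(item_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    rupee_sold REAL,
    PRIMARY KEY (time_key, item_key, location_key)
);

INSERT INTO dim_time VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO dim_item VALUES (1, 'Pen', 'Acme', 'stationery'), (2, 'Notebook', 'Acme', 'stationery');
INSERT INTO dim_location VALUES (1, 'Delhi'), (2, 'Mumbai');
INSERT INTO fact_sales VALUES (1, 1, 1, 100.0), (1, 2, 1, 50.0), (2, 1, 1, 80.0), (1, 1, 2, 40.0);
""")

# A typical star-schema query: join the fact table to the dimensions it needs.
rows = conn.execute("""
    SELECT t.quarter, l.city, SUM(f.rupee_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_key = f.time_key
    JOIN dim_location AS l ON l.location_key = f.location_key
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter, l.city
""").fetchall()
print(rows)
```

Each dimension is reached with a single join from the fact table, which is why star-schema queries tend to stay simple.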
Snowflake Schemas for Multidimensional Model
The snowflake schema is more complex than the star schema because the dimension tables of the
snowflake are normalized.
The snowflake schema is represented by a centralized fact table that is connected to
multiple dimension tables, and these dimension tables can be normalized into additional
dimension tables.
The major difference between the snowflake and star schema models is that the
dimension tables of the snowflake model are normalized to reduce redundancies.
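A minimal sketch of the normalization step, using Python's sqlite3 module: the brand attribute is moved out of the item dimension into its own table, so each brand name is stored once. All table names, column names, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The brand is normalized out of dim_item into an additional dimension table.
CREATE TABLE dim_brand (brand_key INTEGER PRIMARY KEY, brand_name TEXT);
CREATE TABLE dim_item (
    item_key INTEGER PRIMARY KEY,
    item_name TEXT,
    brand_key INTEGER REFERENCES dim_brand(brand_key)
);
CREATE TABLE fact_sales (
    item_key INTEGER REFERENCES dim_item(item_key),
    units_sold INTEGER
);

INSERT INTO dim_brand VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO dim_item VALUES (1, 'Pen', 1), (2, 'Notebook', 1), (3, 'Stapler', 2);
INSERT INTO fact_sales VALUES (1, 10), (2, 5), (3, 2), (1, 4);
""")

# Reaching the brand now takes two joins instead of one.
sales_by_brand = conn.execute("""
    SELECT b.brand_name, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_item AS i ON i.item_key = f.item_key
    JOIN dim_brand AS b ON b.brand_key = i.brand_key
    GROUP BY b.brand_name
    ORDER BY b.brand_name
""").fetchall()
print(sales_by_brand)
```

The extra join is the cost of the reduced redundancy: the brand name is no longer repeated on every item row.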
Fact Constellation Schemas for Multidimensional Model
A fact constellation can have multiple fact tables that share many dimension tables. This type
of schema can be viewed as a collection of stars and hence is called a galaxy
schema or a fact constellation.
The main disadvantage of fact constellation schemas is their more complicated design.
This schema defines two fact tables, sales and shipping. Sales is treated along four
dimensions, namely time, item, branch, and location.
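A reduced sketch of a fact constellation, again with sqlite3: two fact tables, sales and shipping, share one dimension table. Only a single shared dimension is shown to keep the example short, and all names and rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One dimension table shared by both fact tables.
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE fact_sales (item_key INTEGER REFERENCES dim_item(item_key), units_sold INTEGER);
CREATE TABLE fact_shipping (item_key INTEGER REFERENCES dim_item(item_key), units_shipped INTEGER);

INSERT INTO dim_item VALUES (1, 'Pen'), (2, 'Notebook');
INSERT INTO fact_sales VALUES (1, 10), (1, 5), (2, 7);
INSERT INTO fact_shipping VALUES (1, 12), (2, 7);
""")

# Aggregate each fact table separately, then line the totals up per shared item.
comparison = conn.execute("""
    SELECT i.item_name,
           (SELECT SUM(units_sold) FROM fact_sales AS s WHERE s.item_key = i.item_key),
           (SELECT SUM(units_shipped) FROM fact_shipping AS h WHERE h.item_key = i.item_key)
    FROM dim_item AS i
    ORDER BY i.item_name
""").fetchall()
print(comparison)
```

Aggregating each fact table on its own before combining avoids the double counting that a direct join between the two fact tables would cause.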
Data Warehouse Architecture
• A data warehouse architecture is a method of defining the overall
architecture of data communication, processing, and presentation that
exists for end-client computing within the enterprise.
• Production applications such as payroll, accounts payable, product
purchasing, and inventory control are designed for online transaction
processing (OLTP).
• Such applications gather detailed data from day-to-day operations.

Data Warehouse_Architecture.pptx

  • 1. Data Warehouse Architecture Dr. G.Jasmine Beulah Dept. Computer Science, Kristu Jayanti College, Bengaluru.
  • 2. Functions of Data Warehouse Tools and Utilities The following are the functions of data warehouse tools and utilities − • Data Extraction − Involves gathering data from multiple heterogeneous sources. • Data Cleaning − Involves finding and correcting the errors in data. • Data Transformation − Involves converting the data from legacy format to warehouse format. • Data Loading − Involves sorting, summarizing, consolidating, checking integrity, and building indices and partitions. • Refreshing − Involves updating from data sources to warehouse.
  • 3. Terminologies Metadata Metadata is simply defined as data about data. The data that are used to represent other data is known as metadata. For example, the index of a book serves as a metadata for the contents in the book. In terms of data warehouse, we can define metadata as following − • Metadata is a road-map to data warehouse. • Metadata in data warehouse defines the warehouse objects. • Metadata acts as a directory. This directory helps the decision support system to locate the contents of a data warehouse.
  • 4. Metadata Repository Metadata repository is an integral part of a data warehouse system. It contains the following metadata − Business metadata − It contains the data ownership information, business definition, and changing policies. Operational metadata − It includes currency of data and data lineage. Currency of data refers to the data being active, archived, or purged. Lineage of data means history of data migrated and transformation applied on it. Data for mapping from operational environment to data warehouse − It metadata includes source databases and their contents, data extraction, data partition, cleaning, transformation rules, data refresh and purging rules. The algorithms for summarization − It includes dimension algorithms, data on granularity, aggregation, summarizing, etc.
  • 5. Data Cube A data cube helps us represent data in multiple dimensions. It is defined by dimensions and facts. The dimensions are the entities with respect to which an enterprise preserves the records. Example Suppose a company wants to keep track of sales records with the help of sales data warehouse with respect to time, item, branch, and location. These dimensions allow to keep track of monthly sales and at which branch the items were sold. There is a table associated with each dimension. This table is known as dimension table. For example, "item" dimension table may have attributes such as item_name, item_type, and item_brand.
  • 6. Data Cube 2-D view of Sales Data for a compa ny with respect to time, item, 3-D view of the sales data with respect to time, item, and location
  • 7. Data Mart • Data marts contain a subset of organization-wide data that is valuable to specific groups of people in an organization. • In other words, a data mart contains only those data that is specific to a particular group. • For example, the marketing data mart may contain only data related to items, customers, and sales. • Data marts are confined to subjects.
  • 9. Multi-dimensional Data Model(MDM) • A multidimensional model views data in the form of a data-cube. • A data cube enables data to be modeled and viewed in multiple dimensions. • It is defined by dimensions and facts. The dimensions are the perspectives or entities concerning which an organization keeps records. For example, a shop may create a sales data warehouse to keep records of the store's sales for the dimension time, item, and location. These dimensions allow the shop to keep track of things, for example, monthly sales of items and the locations at which the items were sold. Each dimension has a table related to it, called a dimensional table, which describes the dimension further. For example, a dimensional table for an item may contain the attributes item_name, brand, and type.
  • 10. Multi-dimensional Data Model(MDM) • A multidimensional data model is organized around a central theme, for example, sales. This theme is represented by a fact table. Facts are numerical measures. The fact table contains the names of the facts or measures of the related dimensional tables.
  • 11. Multi-dimensional Data Model(MDM) • In this 2D representation, the sales for Delhi are shown for the time dimension (organized in quarters) and the item dimension (classified according to the types of item sold). • The fact or measure displayed is rupee_sold (in thousands). View the sales data with a third dimension
  • 12. Multi-dimensional Data Model(MDM) Same data in the form of a 3D data cube
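The relationship between the 2D view and the 3D cube can be sketched as follows. This is a minimal illustration with invented rupee_sold figures; the item types, the second city, and the helper names slice_by_city and roll_up_location are assumptions made for the example.

```python
# Hypothetical cube cells: (quarter, item_type, city) -> rupee_sold in thousands.
cube = {
    ("Q1", "Keyboard", "Delhi"): 120,
    ("Q1", "Mouse", "Delhi"): 80,
    ("Q2", "Keyboard", "Delhi"): 150,
    ("Q1", "Keyboard", "Mumbai"): 90,
    ("Q2", "Mouse", "Mumbai"): 60,
}


def slice_by_city(cells, city):
    """Fix the location dimension to obtain a 2-D (quarter, item_type) view."""
    return {(q, item): v for (q, item, loc), v in cells.items() if loc == city}


def roll_up_location(cells):
    """Sum over the location dimension, leaving (quarter, item_type) totals."""
    out = {}
    for (q, item, _loc), v in cells.items():
        out[(q, item)] = out.get((q, item), 0) + v
    return out


delhi_view = slice_by_city(cube, "Delhi")
all_cities = roll_up_location(cube)
print(delhi_view[("Q1", "Mouse")])     # 80
print(all_cities[("Q1", "Keyboard")])  # 210
```

Fixing one dimension to a single value (here, Delhi) yields the 2D table from the earlier slide, while summing over that dimension gives totals across all locations.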
  • 13. Schemas for the Multidimensional Data Model are: Star Schema, Snowflake Schema, and Fact Constellation Schema.
  • 14. Star Schema • A star schema is the elementary form of a dimensional model, in which data are organized into facts and dimensions. • A fact is an event that is counted or measured, such as a sale or a login. A dimension includes reference data about the fact, such as date, item, or customer. • A star schema is a relational schema whose design represents a multidimensional data model. • The star schema is the explicit data warehouse schema. • It is known as a star schema because the entity-relationship diagram of this schema resembles a star, with points diverging from a central table. The center of the schema consists of a large fact table, and the points of the star are the dimension tables.
  • 16. Fact Tables A fact table has two types of columns: one containing foreign keys (pointing to the dimension tables) and the other containing numeric values. Dimension Tables A dimension table is generally small compared to a fact table. The primary key of a dimension table is a foreign key in the fact table. Examples of dimension tables are: Time dimension table, Product dimension table, Employee dimension table, Geography dimension table. The main characteristics of the star schema are that it is easy to understand and that only a small number of tables need to be joined.
  • 17. Fact Tables • A fact table is a table in a star schema that contains facts and is connected to dimensions. • A fact table has two types of columns: those that contain facts and those that are foreign keys to the dimension tables. • The primary key of a fact table is generally a composite key made up of all of its foreign keys. • A fact table might contain either detail-level facts or facts that have been aggregated (fact tables that contain aggregated facts are often called summary tables instead). • A fact table generally contains facts with the same level of aggregation.
  • 18. Dimension Tables • A dimension is a structure usually composed of one or more hierarchies that categorize data. • If a dimension has no hierarchies or levels, it is called a flat dimension or list. • The primary key of each dimension table is part of the composite primary key of the fact table. • Dimensional attributes help to define the dimensional value. They are generally descriptive, textual values. • Dimension tables are usually smaller than the fact table. • Fact tables store data about sales, while dimension tables store data about the geographic region (markets, cities), clients, products, times, and channels.
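The fact and dimension tables described above can be sketched with Python's built-in sqlite3 module. This is an illustrative sketch only: the table names, column names, and sample rows are assumptions, chosen to match the time, item, and location dimensions used earlier in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: small, descriptive, one primary key each.
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE item_dim (item_id INTEGER PRIMARY KEY, item_name TEXT,
                       brand TEXT, item_type TEXT);
CREATE TABLE location_dim (location_id INTEGER PRIMARY KEY, city TEXT);

-- Fact table: foreign keys to every dimension plus a numeric measure.
-- Its primary key is the composite of all its foreign keys.
CREATE TABLE sales_fact (
    time_id INTEGER REFERENCES time_dim(time_id),
    item_id INTEGER REFERENCES item_dim(item_id),
    location_id INTEGER REFERENCES location_dim(location_id),
    rupee_sold REAL,
    PRIMARY KEY (time_id, item_id, location_id)
);

INSERT INTO time_dim VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO item_dim VALUES (1, 'K100', 'Acme', 'Keyboard'),
                            (2, 'M200', 'Acme', 'Mouse');
INSERT INTO location_dim VALUES (1, 'Delhi'), (2, 'Mumbai');
INSERT INTO sales_fact VALUES (1, 1, 1, 120), (1, 2, 1, 80),
                              (2, 1, 1, 150), (1, 1, 2, 90);
""")

# A typical star join: the fact table joined directly to two dimensions.
rows = conn.execute("""
    SELECT t.quarter, l.city, SUM(f.rupee_sold)
    FROM sales_fact f
    JOIN time_dim t ON t.time_id = f.time_id
    JOIN location_dim l ON l.location_id = f.location_id
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter, l.city
""").fetchall()
print(rows)
```

Every dimension is one join away from the fact table, which is why queries against a star schema need only a small number of joins.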
  • 19. Snowflake Schemas for Multidimensional Model The snowflake schema is more complex than the star schema because the dimension tables of the snowflake are normalized. The snowflake schema is represented by a centralized fact table connected to multiple dimension tables, and these dimension tables can be normalized into additional dimension tables. The major difference between the snowflake and star schema models is that the dimension tables of the snowflake model are normalized to reduce redundancies.
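A small sketch of what normalizing a dimension looks like, again using sqlite3. Here the brand attribute is moved out of the item dimension into its own table; the table names, column names, and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized dimension: brand names are stored once in brand_dim
-- instead of being repeated on every row of item_dim.
CREATE TABLE brand_dim (brand_id INTEGER PRIMARY KEY, brand_name TEXT);
CREATE TABLE item_dim (
    item_id INTEGER PRIMARY KEY,
    item_name TEXT,
    brand_id INTEGER REFERENCES brand_dim(brand_id)
);
CREATE TABLE sales_fact (
    item_id INTEGER REFERENCES item_dim(item_id),
    units_sold INTEGER
);

INSERT INTO brand_dim VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO item_dim VALUES (1, 'K100', 1), (2, 'M200', 1), (3, 'P300', 2);
INSERT INTO sales_fact VALUES (1, 5), (2, 7), (3, 4);
""")

# Reaching the brand now takes two joins: fact -> item -> brand.
units_by_brand = dict(conn.execute("""
    SELECT b.brand_name, SUM(f.units_sold)
    FROM sales_fact f
    JOIN item_dim i ON i.item_id = f.item_id
    JOIN brand_dim b ON b.brand_id = i.brand_id
    GROUP BY b.brand_name
""").fetchall())
print(units_by_brand["Acme"])  # 12
```

The trade-off is visible in the query: redundancy in the dimension is reduced, but an extra join is needed compared with the equivalent star schema.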
  • 20. Fact Constellation Schemas for Multidimensional Model A fact constellation can have multiple fact tables that share many dimension tables. This type of schema can be viewed as a collection of stars, and hence is called a galaxy schema or a fact constellation. The main disadvantage of fact constellation schemas is their more complicated design. In the example, the schema defines two fact tables, sales and shipping. Sales are treated along four dimensions, namely time, item, branch, and location.
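The sales and shipping example can be sketched as two fact tables that reuse the same dimension tables. This is a simplified illustration: only the time and item dimensions are modeled (branch and location are left out for brevity), and the assumption that shipping shares these two dimensions, along with all column names and sample rows, is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared dimension tables.
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE item_dim (item_id INTEGER PRIMARY KEY, item_name TEXT);

-- Two fact tables that both reference the same dimensions.
CREATE TABLE sales_fact (
    time_id INTEGER REFERENCES time_dim(time_id),
    item_id INTEGER REFERENCES item_dim(item_id),
    units_sold INTEGER
);
CREATE TABLE shipping_fact (
    time_id INTEGER REFERENCES time_dim(time_id),
    item_id INTEGER REFERENCES item_dim(item_id),
    units_shipped INTEGER
);

INSERT INTO time_dim VALUES (1, 'Q1');
INSERT INTO item_dim VALUES (1, 'K100'), (2, 'M200');
INSERT INTO sales_fact VALUES (1, 1, 10), (1, 2, 6);
INSERT INTO shipping_fact VALUES (1, 1, 8), (1, 2, 6);
""")

# Compare the two facts per item through the shared item dimension.
rows = conn.execute("""
    SELECT i.item_name,
           (SELECT SUM(units_sold) FROM sales_fact s
             WHERE s.item_id = i.item_id),
           (SELECT SUM(units_shipped) FROM shipping_fact h
             WHERE h.item_id = i.item_id)
    FROM item_dim i
    ORDER BY i.item_name
""").fetchall()
print(rows)  # [('K100', 10, 8), ('M200', 6, 6)]
```

Because both fact tables point at the same item_dim rows, measures from different business processes can be compared item by item.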
  • 21. Data Warehouse Architecture • A data warehouse architecture is a method of defining the overall architecture of data communication, processing, and presentation that exists for end-client computing within the enterprise. • Production applications such as payroll, accounts payable, product purchasing, and inventory control are designed for online transaction processing (OLTP). • Such applications gather detailed data from day-to-day operations.