DATA WAREHOUSING
BUILDING THE BLOCKS:
DESIGN AND ARCHITECTURE
WITH AARONE ATUHE
2020
 80% of this lecture is based on Ponniah’s “Data Warehousing Fundamentals: A
Comprehensive Guide for IT Professionals”
 and
 “The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling” by
Ralph Kimball and Margy Ross
Information Systems: PROFILE AND ROLE
 Information systems are rooted in the relationship between information,
decision and control.
 An IS should collect and classify information, by means of integrated and
suitable procedures, so as to produce, in time and at the right levels, the
summaries needed to support the decision process, as well as to
administer and control the enterprise activity as a whole.
Information as a resource
 Information is a resource of increasing value, required by
managers to schedule and monitor enterprise activities
effectively.
 Information is the raw material that information systems
transform, just as manufacturing systems transform
unfinished products.
Value of information
 Information is an enterprise resource like capital,
raw materials, plants and people; thus, it has a
cost.
 Hence, understanding the value of information is
important.
DW and DSS
 How do data warehouses (DW) act as decision support systems (DSS)?
DW DESIGN COMPONENTS: Granularity
 Granularity is the extent to which a system is broken down into small parts,
whether the system itself or its description or observation. It is the
extent to which a larger entity is subdivided.
 For example, a yard broken into inches has finer granularity than a
yard broken into feet.
Data granularity
 The granularity of data refers to how finely data fields are subdivided. For example, a
postal address can be recorded, with coarse granularity, as a single field:
 address = 200 2nd Ave. South #358, St. Petersburg, FL 33701-4313 USA
 or, with fine granularity, as multiple fields:
 street address = 200 2nd Ave. South #358
 city = St. Petersburg
 postal code = FL 33701-4313
 country = USA
or even finer granularity:
 street = 2nd Ave. South
 address number = 200
 suite/apartment number = #358
 city = St. Petersburg
 state = FL
 postal-code = 33701
 postal-code-add-on = 4313
 country = USA
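The address example can be sketched in code. This is only an illustration: the class and field names below are assumptions chosen to mirror the slide, not part of any standard. It shows that fine-grained fields can always be combined into the coarse form, while the reverse would require parsing.

```python
from dataclasses import dataclass


@dataclass
class CoarseAddress:
    """Coarse granularity: the whole address in a single field."""
    address: str


@dataclass
class FineAddress:
    """Finer granularity: one field per address component."""
    address_number: str
    street: str
    suite: str
    city: str
    state: str
    postal_code: str
    postal_code_add_on: str
    country: str

    def to_coarse(self) -> CoarseAddress:
        # Fine -> coarse is a plain concatenation; no information is needed
        # beyond what the fields already hold.
        line = (f"{self.address_number} {self.street} {self.suite}, "
                f"{self.city}, {self.state} "
                f"{self.postal_code}-{self.postal_code_add_on} {self.country}")
        return CoarseAddress(line)


fine = FineAddress("200", "2nd Ave. South", "#358", "St. Petersburg",
                   "FL", "33701", "4313", "USA")
coarse = fine.to_coarse()
print(coarse.address)
```

With fine granularity you can filter or group by state or city directly; with the single coarse field you would first have to split the string.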
Data Granularity in DW
 In an operational system, data is usually kept at the lowest level of detail.
 In a point-of-sale system for a grocery store, the units of sale are captured and stored at the
level of units of a product per transaction at the check-out counter.
 In an order entry system, the quantity ordered is captured and stored at the level of units of
a product per order received from the customer. Whenever you need summary data, you add
up the individual transactions.
 If you are looking for units of a product ordered this month, you read all the orders entered
for the entire month for that product and add them up.
 Data granularity in a data warehouse refers to the level of detail. The
lower the level of detail, the finer the data granularity. Of course, if you
want to keep data at the lowest level of detail, you have to store a lot of
data in the data warehouse.
 In a data warehouse, therefore, you find it efficient to keep data
summarized at different levels. Depending on the query, you can then go to
the particular level of detail and satisfy the query.
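A minimal sketch of keeping the same data at several levels of summary. The transaction rows and the `summarize` helper are made up for illustration; a real warehouse would hold these levels as separate tables.

```python
from collections import defaultdict
from datetime import date

# Hypothetical point-of-sale rows at the finest grain:
# (transaction id, sale date, product, units sold)
transactions = [
    (1, date(2020, 3, 1), "milk", 2),
    (2, date(2020, 3, 1), "milk", 1),
    (3, date(2020, 3, 2), "milk", 4),
    (4, date(2020, 3, 2), "bread", 3),
    (5, date(2020, 4, 1), "milk", 5),
]


def summarize(rows, key_fn):
    """Add up units for every distinct key produced by key_fn."""
    totals = defaultdict(int)
    for _txn_id, day, product, units in rows:
        totals[key_fn(day, product)] += units
    return dict(totals)


# Two coarser levels of granularity built from the same detail rows.
daily = summarize(transactions, lambda d, p: (d, p))
monthly = summarize(transactions, lambda d, p: ((d.year, d.month), p))

# A monthly query is answered from the monthly summary
# without rereading every transaction.
print(monthly[((2020, 3), "milk")])
```

The trade-off is storage against query work: each extra summary level takes space, but queries at that level read far fewer rows.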
The figure below shows examples of data granularity
in a typical data warehouse.
WHAT ARE OUR CONCERNS IN DW DESIGN?
 Before deciding to build a data warehouse for your organization, you need to
ask the following basic and fundamental questions and address the relevant
issues:
 Top-down or bottom-up approach?
 Enterprise-wide or departmental?
 Which first—data warehouse or data mart?
 Build pilot or go with a full-fledged implementation?
 Dependent or independent data marts?
 Should you build a large data warehouse and then let that
repository feed data into local, departmental data marts?
 On the other hand, should you build individual local data marts,
and combine them to form your overall data warehouse?
 Should these local data marts be independent of one another?
Or should they be dependent on the overall data warehouse for
data feed?
THIS MEANS WE NEED TO KNOW MORE ABOUT DW AND DATA MARTS
DW and DM: How Are They Different?
 Inmon stated, “The single most important issue facing the IT manager this
year is whether to build the data warehouse first or the data mart first.”
 Here are the two different basic approaches:
 (1) overall data warehouse feeding dependent data marts, and
 (2) several departmental or local data marts combining into a data
warehouse.
 So, which approach is best in your case, the top-down or the bottom-up
approach? Let us examine these two approaches carefully.
 In the first (top-down) approach, you extract data from the operational systems; you
then transform, clean, integrate, and keep the data in the data warehouse.
Top-Down Approach
 In this approach the data in the data warehouse is stored at the
lowest level of granularity based on a normalized data model.
 This is the big-picture approach in which you build the overall, big, enterprise-wide data
warehouse. Here you do not have a collection of fragmented islands of information. The data
warehouse is large and integrated.
 This approach, however, takes longer to build and carries a high risk of failure.
 If you do not have experienced professionals on your team, this approach could be
hazardous.
Bottom-Up Approach
 Ralph Kimball, another leading author and expert practitioner in data warehousing, is a
proponent of the approach that has come to be known as the bottom-up approach.
 Kimball (1996) envisions the corporate data warehouse as a collection of conformed data
marts. The key consideration is the conforming of the dimensions among the separate data
marts.
 In this approach data marts are created first to provide analytical and reporting capabilities
for specific business subjects based on the dimensional data model.
 Data marts contain data at the lowest level of granularity and also as summaries, depending on
the needs for analysis. These data marts are joined or “unioned” together by conforming the
dimensions.
 In this bottom-up approach, you build your departmental data marts one by one. You would set
a priority scheme to determine which data marts you must build first. The most severe
drawback of this approach is data fragmentation. Each independent data mart will be blind to
the overall requirements of the entire organization.
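To make the idea of conformed dimensions concrete, here is a small sketch with two hypothetical data marts (sales and inventory) that share one product dimension. All names and numbers are invented for illustration.

```python
# Conformed product dimension: both marts use the same keys and attributes.
product_dim = {
    1: {"name": "milk", "brand": "DairyCo"},
    2: {"name": "bread", "brand": "BakeCo"},
}

# Sales mart fact rows: (product_key, units_sold)
sales_facts = [(1, 10), (2, 4), (1, 6)]
# Inventory mart fact rows: (product_key, units_on_hand)
inventory_facts = [(1, 30), (2, 12)]


def total_by_brand(facts):
    """Group a mart's facts by the brand attribute of the shared dimension."""
    totals = {}
    for product_key, value in facts:
        brand = product_dim[product_key]["brand"]
        totals[brand] = totals.get(brand, 0) + value
    return totals


sold = total_by_brand(sales_facts)
on_hand = total_by_brand(inventory_facts)

# Because "brand" means the same thing in both marts, the two result sets
# line up row by row and can be combined into one report.
report = {brand: (sold.get(brand, 0), on_hand.get(brand, 0))
          for brand in sorted(set(sold) | set(on_hand))}
print(report)
```

If each mart kept its own, differently coded product table, the final combining step would not be possible without extra mapping work, which is the fragmentation risk described above.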
METADATA IN THE DATA WAREHOUSE
 Think of metadata as the Yellow Pages of your town. Do you need information about the
stores in your town, where they are, what their names are, and what products they
specialize in? Go to the Yellow Pages.
 The Yellow Pages is a directory with data about the institutions in your town. Almost in
the same manner, the metadata component serves as a directory of the contents of your
data warehouse.
Types of Metadata
 Metadata in a data warehouse fall into three major
categories:
 Operational metadata
 Extraction and transformation metadata
 End-user metadata
Operational Metadata
 Operational metadata explains how the data was created or transformed, for example:
•Whether the job run failed or had warnings
•Which database tables or files were read from, written to, or referenced
•How many rows were read, written to, or referenced
•When the job started and finished
•Which stages and links were used
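The items above can be pictured as one record per job run. The following is a sketch only; the `JobRunMetadata` class, its field names, and the sample values are assumptions for illustration and do not come from any particular ETL tool.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class JobRunMetadata:
    """One hypothetical operational-metadata record for a single job run."""
    job_name: str
    started_at: datetime
    finished_at: datetime
    status: str  # "ok", "warnings", or "failed"
    tables_read: list = field(default_factory=list)
    tables_written: list = field(default_factory=list)
    rows_read: int = 0
    rows_written: int = 0
    stages: list = field(default_factory=list)

    @property
    def duration_seconds(self) -> float:
        # Derived from the recorded start and finish times.
        return (self.finished_at - self.started_at).total_seconds()


run = JobRunMetadata(
    job_name="load_sales_fact",
    started_at=datetime(2020, 3, 1, 2, 0, 0),
    finished_at=datetime(2020, 3, 1, 2, 5, 30),
    status="warnings",
    tables_read=["staging.sales"],
    tables_written=["dw.sales_fact"],
    rows_read=1000,
    rows_written=990,
    stages=["extract", "clean", "load"],
)
print(run.status, run.duration_seconds, run.rows_read - run.rows_written)
```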
Extraction and Transformation Metadata
 Extraction and transformation metadata contain data about the
extraction of data from the source systems, namely, the extraction
frequencies, extraction methods, and business rules for the data
extraction.
 Also, this category of metadata contains information about all the data
transformations that take place in the data staging area.
End-User Metadata
 The end-user metadata is the navigational map of the data warehouse.
 It enables the end-users to find information from the data warehouse.
 The end-user metadata allows the end-users to use their own business terminology and
look for information in those ways in which they normally think of the business.
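One simple way to picture end-user metadata is a glossary that maps business terms to where the data physically lives. The table and column names below are hypothetical.

```python
# Hypothetical end-user metadata: business term -> (table, column)
business_glossary = {
    "revenue": ("sales_fact", "dollar_sales_amount"),
    "units sold": ("sales_fact", "quantity_sold"),
    "brand": ("product_dim", "brand_name"),
}


def locate(term: str):
    """Translate a business term into the table and column that hold it."""
    key = term.strip().lower()
    if key not in business_glossary:
        raise KeyError(f"No metadata entry for business term: {term!r}")
    return business_glossary[key]


print(locate("Revenue"))
print(locate("units sold"))
```

The user asks in business vocabulary ("revenue"), and the metadata supplies the physical name, so the user never needs to know the warehouse's internal naming.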
Why is metadata especially important in
a data warehouse?
 First, it acts as the glue that connects all parts of the data warehouse.
 Next, it provides information about the contents and structures to the
developers.
 Finally, it opens the door to the end-users and makes the contents
recognizable in their own terms.
FACT TABLES. WHAT ARE THEY?
 In data warehousing, a Fact table consists of the measurements, metrics or facts of a
business process.
 It is located at the center of a star schema or a snowflake schema surrounded by
dimension tables.
 The primary key of a fact table is usually a composite key made up of all, or a subset, of its
foreign keys.
 Fact tables contain the content of the data warehouse and store different types of
measures: additive, non-additive, and semi-additive.
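The difference between additive and semi-additive measures is easiest to see with numbers. In this sketch (made-up rows), dollar sales can be summed across every dimension, while inventory on hand can be summed across stores but not across days.

```python
# Hypothetical fact rows: (day, store, dollar_sales, inventory_on_hand)
rows = [
    ("Mon", "A", 100.0, 50),
    ("Mon", "B", 200.0, 80),
    ("Tue", "A", 150.0, 40),
    ("Tue", "B", 250.0, 70),
]

# Additive: summing dollar sales over all days and stores is meaningful.
total_sales = sum(r[2] for r in rows)

# Semi-additive: inventory can be summed across stores for one day...
tue_inventory = sum(r[3] for r in rows if r[0] == "Tue")

# ...but across days a sum would double count stock, so a different
# aggregate (here, the average of the daily totals) is used instead.
daily_inventory = {}
for day, _store, _sales, on_hand in rows:
    daily_inventory[day] = daily_inventory.get(day, 0) + on_hand
avg_inventory = sum(daily_inventory.values()) / len(daily_inventory)

print(total_sales, tue_inventory, avg_inventory)
```

A non-additive measure, such as a unit price or a ratio, cannot be summed across any dimension; it is usually recomputed from its additive components.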
Fact tables CONTINUED…
 A fact table is the primary table in a dimensional model where the numerical
performance measurements of the business are stored. We use the term fact to
represent a business measure.
 We can imagine standing in the marketplace watching products being sold and writing
down the quantity sold and dollar sales amount each day for each product in each store.
 A measurement is taken at the intersection of all the dimensions (day,
product, and store). This list of dimensions defines the grain of the fact table
and tells us what the scope of the measurement is.
 The most useful facts are numeric and additive, such as dollar sales amount.
Dimension Tables
 Dimension tables are integral companions to a fact table. The dimension tables contain the
textual descriptors of the business.
 In a well-designed dimensional model, dimension tables have many columns or attributes.
These attributes describe the rows in the dimension table.
 Each dimension is defined by its single primary key, designated by the PK notation, which
serves as the basis for referential integrity with any given fact table to which it is joined.
 Dimension attributes serve as the primary source of query constraints, groupings, and report
labels. In a query or report request, attributes are identified as the “by” words.
 For example, when a user states that he or she wants to see dollar sales by week by brand,
week and brand must be available as dimension attributes.
 Dimension table attributes play a vital role in the data warehouse. Since they are the source
of virtually all interesting constraints and report labels, they are key to making the data
warehouse usable and understandable.
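The "dollar sales by week by brand" request can be sketched as a tiny star schema using Python's built-in sqlite3 module. Table names, column names, and data are assumptions for illustration, and the store dimension is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (
    date_key INTEGER PRIMARY KEY, day TEXT, week TEXT);
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY, name TEXT, brand TEXT);
-- Grain: one row per product per day. The primary key is the
-- combination of the foreign keys to the dimension tables.
CREATE TABLE sales_fact (
    date_key INTEGER REFERENCES date_dim(date_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    quantity_sold INTEGER,
    dollar_sales REAL,
    PRIMARY KEY (date_key, product_key)
);
INSERT INTO date_dim VALUES
    (1, '2020-03-02', '2020-W10'),
    (2, '2020-03-03', '2020-W10'),
    (3, '2020-03-09', '2020-W11');
INSERT INTO product_dim VALUES
    (1, 'milk 1L', 'DairyCo'),
    (2, 'butter', 'DairyCo'),
    (3, 'loaf', 'BakeCo');
INSERT INTO sales_fact VALUES
    (1, 1, 10, 20.0), (1, 3, 5, 15.0), (2, 2, 2, 8.0), (3, 1, 4, 8.0);
""")

# The "by" words (week, brand) become the GROUP BY columns, and both
# come from dimension tables; the summed measure comes from the fact table.
rows = conn.execute("""
    SELECT d.week, p.brand, SUM(f.dollar_sales)
    FROM sales_fact f
    JOIN date_dim d ON d.date_key = f.date_key
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY d.week, p.brand
    ORDER BY d.week, p.brand
""").fetchall()

for row in rows:
    print(row)
```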
Sample dimension table.
Bringing Together Facts and
Dimensions: Fact and dimension tables in a
dimensional model
The Process of Data Warehouse Design
 A data warehouse can be built using a top-down approach (starts with overall design
and planning), a bottom-up approach (starts with rapid experiments and prototypes),
or a combination of both.
 From the software engineering point of view, the design and construction of a data
warehouse may consist of the following steps:
 planning,
 requirements study,
 problem analysis,
 warehouse design,
 data integration and testing, and
 finally, deployment of the data warehouse.
 Large software systems can be developed using two methodologies: the waterfall method or
the spiral method.
Data warehouse architectures:
Three-Tier Data Warehouse Architecture
 Data warehouses generally adopt a three-tier architecture. The three tiers are as
follows.
 Bottom Tier - The bottom tier of the architecture is the data warehouse database
server, which is a relational database system. Back-end tools and utilities
feed data into the bottom tier.
 Middle Tier - The middle tier holds the OLAP server, which can be implemented in either of the
following ways.
 By relational OLAP (ROLAP), an extended relational database management system that
maps operations on multidimensional data to standard relational operations.
 By multidimensional OLAP (MOLAP), which directly implements multidimensional data and
operations.
 Top Tier - This tier is the front-end client layer. It holds the query tools, reporting
tools, analysis tools, and data mining tools.
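A rough way to contrast the two middle-tier options: ROLAP computes an aggregate on demand from relational rows (much like a GROUP BY), while MOLAP stores multidimensional cells ahead of time and looks them up. The sketch below only imitates that difference with plain Python structures; the data and function names are invented.

```python
from collections import defaultdict

# Hypothetical relational fact rows: (month, product, store, units)
fact_rows = [
    ("2020-03", "milk", "A", 10),
    ("2020-03", "milk", "B", 5),
    ("2020-03", "bread", "A", 7),
    ("2020-04", "milk", "A", 3),
]


def rolap_rollup(rows, keep):
    """ROLAP style: aggregate on demand over the dimensions in `keep`."""
    out = defaultdict(int)
    for month, product, store, units in rows:
        dims = {"month": month, "product": product, "store": store}
        out[tuple(dims[name] for name in keep)] += units
    return dict(out)


# MOLAP style: materialize chosen aggregations once, then answer
# queries with a direct lookup into the stored cells.
cube = {
    keep: rolap_rollup(fact_rows, keep)
    for keep in [("month",), ("product",), ("month", "product")]
}

print(rolap_rollup(fact_rows, ("product",))[("milk",)])  # computed now
print(cube[("month",)][("2020-03",)])                    # looked up
```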
7/20/2022
MIS7206-1
Simple Data Warehouse Architecture
[Figure: simple data warehouse architecture. Labels: Source systems (Registration System, Exam results System, Fees payment System); Unstructured information; Staging area; Extract, Transform, Load (ETL); Data Warehouse (Facts); Data Marts; Information; Knowledge; Knowledge management; Statistical analysis; Data mining; Performance management.]
Data Warehouse Architecture Explained
 Source systems: the different operational systems from which data
is extracted.
 ETL: the software tool used to extract, transform, and
load data. Data can go from the source directly to the data warehouse, or first to a
staging area database where data cleaning can be done.
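The extract, transform, and load steps can be sketched as three small functions. The source rows imitate a fees payment system; the field names, cleaning rules, and the list standing in for the warehouse are all assumptions made for illustration.

```python
# Hypothetical rows as they might arrive from a fees payment source system.
source_rows = [
    {"student_id": " s001 ", "amount": "150000", "paid_on": "2020-03-01"},
    {"student_id": "S002", "amount": "n/a", "paid_on": "2020-03-02"},
    {"student_id": "s003", "amount": "200000", "paid_on": "2020-03-02"},
]


def extract():
    """Extract: copy the rows out of the source system."""
    return list(source_rows)


def transform(rows):
    """Transform: clean rows in the staging area; set aside bad ones."""
    clean, rejected = [], []
    for row in rows:
        try:
            amount = int(row["amount"])
        except ValueError:
            rejected.append(row)  # amount is not a number
            continue
        clean.append({
            "student_id": row["student_id"].strip().upper(),
            "amount": amount,
            "paid_on": row["paid_on"],
        })
    return clean, rejected


def load(rows, warehouse):
    """Load: append the cleaned rows to the warehouse table."""
    warehouse.extend(rows)
    return len(rows)


warehouse_fees_fact = []
staged, rejected = transform(extract())
loaded = load(staged, warehouse_fees_fact)
print(loaded, len(rejected))
```

Keeping the rejected rows, rather than silently dropping them, gives the data cleaning step something to report on.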
 In the data warehouse the data is arranged
dimensionally, as facts and dimensions. Depending on
the model used, the data warehouse can then be split into
data marts to handle specific user needs.
 A data mart is a subset of a data warehouse
customized for one business area.
Multi-tiered architecture
Three Data Warehouse models
 Enterprise warehouse
 Data Mart
 Virtual warehouse (make very brief notes)
Up next:
 DW SDLC
 DW OLAP
 Dimensional modelling and schemas
 NEXT-GENERATION DW

BI_LECTURE_4-2021.pptx

  • 1. DATA WARE HOUSING- BUILDING THE BLOCKS: design AND architecture WITH AARONE ATUHE 2020
  • 2.  80% of this lecture is based on Ponniah’s “Datawarehousing fundamentals for IT professionals”  And  A data ware house tool kit by Ralph Kimbal and Maggy Rose, A complete guide to dimensional modelling
  • 3. Information Systems: PROFILE AND ROLE  Information systems are rooted in the relationship between information, decision and control  An IS should collect and classify the information, by means of integrated and suitable procedures, in order to produce in time and at the right levels the synthesis to be used to support the decisional process, as well as to administrate and globally control the enterprise activity
  • 4. Information as a resource  Information is an increasing value resource, required from managers to schedule and monitor effectively the enterprise activities.  Information is the first matter which is transformed by information systems like unfinished products are transformed by manufacturing systems
  • 5. Value of information  Information is an enterprise resource like capital, first matters, plants and people; thus, it has a cost.  Hence, understanding the value of information is important
  • 6. DW and DSS  How are DWs DSS?
  • 7. DW DESIGN COMPONENTS: Granularity  is the extent to which a system is broken down into small parts, either the system itself or its description or observation. It is the extent to which a larger entity is subdivided.  For example, a yard broken into inches has finer granularity than a yard broken into feet.
  • 8. Data granularity  The granularity of data refers to the size in which data fields are sub-divided. For example, a postal address can be recorded, with coarse granularity, as a single field:  address = 200 2nd Ave. South #358, St. Petersburg, FL 33701-4313 USA or  with fine granularity, as multiple fields:  street address = 200 2nd Ave. South #358  city = St. Petersburg  postal code = FL 33701-4313  country = USA
  • 9. or even finer granularity:  street = 2nd Ave. South  address number = 200  suite/apartment number = #358  city = St. Petersburg  state = FL  postal-code = 33701  postal-code-add-on = 4313  country = USA
  • 10. Data Granularity in DW  In an operational system, data is usually kept at the lowest level of detail.  In a point-of-sale system for a grocery store, the units of sale are captured and stored at the level of units of a product per transaction at the check-out counter.  In an order entry system, the quantity ordered is captured and stored at the level of units of a product per order received from the customer. Whenever you need summary data, you add up the individual transactions.  If you are looking for units of a product ordered this month, you read all the orders entered for the entire month for that product and add up.
  • 11.  Data granularity in a data warehouse refers to the level of detail. The lower the level of detail, the finer is the data granularity. Of course, if you want to keep data in the lowest level of detail, you have to store a lot of data in the data warehouse  In a data warehouse, therefore, you find it efficient to keep data summarized at different levels. Depending on the query, you can then go to the particular level of detail and satisfy the query.
  • 12. Figure below shows examples of data granularity in a typical data warehouse.
  • 13. WHAT ARE OUR CONCERNS IN DW DESISGN?  Before deciding to build a data warehouse for your organization, you need to ask the following basic and fundamental questions and address the relevant issues:  Top-down or bottom-up approach?  Enterprise-wide or departmental?  Which first—data warehouse or data mart?  Build pilot or go with a full-fledged implementation?  Dependent or independent data marts?
  • 14.  Should you build a large data warehouse and then let that repository feed data into local, departmental data marts?  On the other hand, should you build individual local data marts, and combine them to form your overall data warehouse?  Should these local data marts be independent of one another? Or should they be dependent on the overall data warehouse for data feed? THIS MEANS WE NEED TO KNOW MORE ABOUT DW AND DATA MARTS
  • 15. Dw and dm: How Are They Different?  Inmon stated, “The single most important issue facing the IT manager this year is whether to build the data warehouse first or the data mart first.”  Here are the two different basic approaches:  (1) overall data warehouse feeding dependent data marts, and  (2) several departmental or local data marts combining into a data warehouse.  So, which approach is best in your case, the top-down or the bottom-up approach? Let us examine these two approaches carefully.  In the first approach, you extract data from the operational systems; you then transform, clean, integrate, and keep the data in the datawarehouse.
  • 16. Top-Down Approach  In this approach the data in the data warehouse is stored at the lowest level of granularity based on a normalized data model.
  • 17.  This is the big-picture approach in which you build the overall, big, enterprise-wide data warehouse. Here you do not have a collection of fragmented islands of information. The data warehouse is large and integrated.  This approach, however, would take longer to build and has a high risk of failure.  If you do not have experienced professionals on your team, this approach could be hazardous.
  • 18. Bottom-Up Approach  Ralph Kimball, another leading author and expert practitioner in data warehousing, is a proponent of the approach that has come to be known as the bottom-up approach.  Kimball (1996) envisions the corporate data warehouse as a collection of conformed data marts. The key consideration is the conforming of the dimensions among the separate data marts.  In this approach data marts are created first to provide analytical and reporting capabilities for specific business subjects based on the dimensional data model.
  • 19.  Data marts contain data at the lowest level of granularity and also as summaries, depending on the needs for analysis. These data marts are joined or “unioned” together by conforming the dimensions.  In this bottom-up approach, you build your departmental data marts one by one. You would set a priority scheme to determine which data marts you must build first. The most severe drawback of this approach is data fragmentation. Each independent data mart will be blind to the overall requirements of the entire organization.
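The idea of joining data marts through conformed dimensions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product dimension, the two marts, and the names `total_by_brand` and `drill_across` are all hypothetical and do not come from the lecture sources. The point is that both marts use the same product keys and the same brand attribute, so their results can be lined up side by side.

```python
# Hypothetical conformed dimension shared by both data marts.
product_dim = {
    1: {"name": "Notebook", "brand": "Acme"},
    2: {"name": "Pen", "brand": "Acme"},
    3: {"name": "Stapler", "brand": "Bolt"},
}

# Each mart keeps its own fact rows but uses the same product keys.
sales_mart = [
    {"product_key": 1, "units_sold": 10},
    {"product_key": 2, "units_sold": 25},
    {"product_key": 3, "units_sold": 4},
]
inventory_mart = [
    {"product_key": 1, "units_on_hand": 40},
    {"product_key": 2, "units_on_hand": 5},
    {"product_key": 3, "units_on_hand": 12},
]


def total_by_brand(fact_rows, measure):
    """Sum one measure per brand, looking the brand up in the shared dimension."""
    totals = {}
    for row in fact_rows:
        brand = product_dim[row["product_key"]]["brand"]
        totals[brand] = totals.get(brand, 0) + row[measure]
    return totals


def drill_across():
    """Combine results from both marts on the conformed brand attribute."""
    sold = total_by_brand(sales_mart, "units_sold")
    on_hand = total_by_brand(inventory_mart, "units_on_hand")
    return {
        brand: {
            "units_sold": sold.get(brand, 0),
            "units_on_hand": on_hand.get(brand, 0),
        }
        for brand in sorted(set(sold) | set(on_hand))
    }


combined = drill_across()
print(combined)
```

If each mart had defined its own product keys or its own spelling of brand names, the final merge would not line up, which is the fragmentation risk described above.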
  • 20. METADATA IN THE DATA WAREHOUSE  Think of metadata as the Yellow Pages of your town. Do you need information about the stores in your town, where they are, what their names are, and what products they specialize in? Go to the Yellow Pages.  The Yellow Pages is a directory with data about the institutions in your town. Almost in the same manner, the metadata component serves as a directory of the contents of your data warehouse.
  • 21. Types of Metadata  Metadata in a data warehouse fall into three major categories:  Operational metadata  Extraction and transformation metadata  End-user metadata
  • 22. Operational Metadata  Operational metadata is used to explain how the data was created or transformed:  • Whether the job run failed or had warnings  • Which database tables or files were read from, written to, or referenced  • How many rows were read, written to, or referenced  • When the job started and finished  • Which stages and links were used
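As a rough sketch, the items listed on this slide could be captured in a single record per job run. The class name `JobRunMetadata`, its field names, and the sample values are assumptions made for illustration; real ETL tools store this information in their own formats.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class JobRunMetadata:
    """Hypothetical record of one ETL job run."""
    job_name: str
    started_at: datetime
    finished_at: datetime
    status: str  # assumed values: "ok", "warning", or "failed"
    tables_read: list = field(default_factory=list)
    tables_written: list = field(default_factory=list)
    rows_read: int = 0
    rows_written: int = 0
    stages_used: list = field(default_factory=list)

    def duration_seconds(self) -> float:
        """How long the job ran, from start to finish."""
        return (self.finished_at - self.started_at).total_seconds()


# Sample values are invented for the example.
run = JobRunMetadata(
    job_name="load_sales_fact",
    started_at=datetime(2020, 1, 1, 2, 0, 0),
    finished_at=datetime(2020, 1, 1, 2, 5, 30),
    status="warning",
    tables_read=["orders", "order_lines"],
    tables_written=["sales_fact"],
    rows_read=1200,
    rows_written=1180,
    stages_used=["extract", "clean", "load"],
)

print(run.job_name, run.status, run.duration_seconds())
```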
  • 23. Extraction and Transformation Metadata  Extraction and transformation metadata contain data about the extraction of data from the source systems, namely, the extraction frequencies, extraction methods, and business rules for the data extraction.  Also, this category of metadata contains information about all the data transformations that take place in the data staging area.
  • 24. End-User Metadata  The end-user metadata is the navigational map of the data warehouse.  It enables the end-users to find information from the data warehouse.  The end-user metadata allows the end-users to use their own business terminology and look for information in those ways in which they normally think of the business.
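One simple way to picture end-user metadata is as a glossary that maps business terms to the physical tables and columns that hold the data. The sketch below assumes a hypothetical glossary and a hypothetical helper `find_column`; none of the table or column names come from the lecture sources.

```python
# Hypothetical end-user metadata: business terms mapped to physical columns.
BUSINESS_GLOSSARY = {
    "revenue": {
        "table": "sales_fact",
        "column": "dollar_sales_amt",
        "description": "Dollar sales amount per transaction line",
    },
    "customer region": {
        "table": "customer_dim",
        "column": "region_name",
        "description": "Sales region the customer belongs to",
    },
}


def find_column(term):
    """Return 'table.column' for a business term, or None if the term is unknown."""
    entry = BUSINESS_GLOSSARY.get(term.strip().lower())
    if entry is None:
        return None
    return f'{entry["table"]}.{entry["column"]}'


print(find_column("Revenue"))
print(find_column("Customer Region"))
```

The end-user asks for "revenue" in their own words, and the metadata translates that into the place where the warehouse stores it.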
  • 25. Why is metadata especially important in a data warehouse?  First, it acts as the glue that connects all parts of the data warehouse.  Next, it provides information about the contents and structures to the developers.  Finally, it opens the door to the end-users and makes the contents recognizable in their own terms.
  • 26. FACT TABLES. WHAT ARE THEY?  In data warehousing, a fact table consists of the measurements, metrics or facts of a business process.  It is located at the center of a star schema or a snowflake schema, surrounded by dimension tables.  The primary key of a fact table is usually a composite key that is made up of all of its foreign keys.  Fact tables contain the content of the data warehouse and store different types of measures: additive, non-additive, and semi-additive.
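The three kinds of measures behave differently when they are aggregated. The sketch below uses the usual dimensional-modelling definitions, which the slide does not spell out: an additive measure can be summed across every dimension, a semi-additive measure can be summed across some dimensions but not others (typically not across time), and a non-additive measure such as a ratio cannot be summed at all. The rows and column names are invented for the example.

```python
# Hypothetical fact rows at the grain (day, store).
rows = [
    {"day": 1, "store": "A", "sales": 100.0, "cost": 60.0, "stock_on_hand": 50},
    {"day": 1, "store": "B", "sales": 200.0, "cost": 150.0, "stock_on_hand": 30},
    {"day": 2, "store": "A", "sales": 50.0, "cost": 40.0, "stock_on_hand": 45},
    {"day": 2, "store": "B", "sales": 150.0, "cost": 100.0, "stock_on_hand": 20},
]

# Additive: sales can be summed across days and stores.
total_sales = sum(r["sales"] for r in rows)
total_cost = sum(r["cost"] for r in rows)

# Semi-additive: stock can be summed across stores for one day,
# but adding day 1 stock to day 2 stock would count the same goods twice.
last_day = max(r["day"] for r in rows)
closing_stock = sum(r["stock_on_hand"] for r in rows if r["day"] == last_day)

# Non-additive: a margin ratio cannot be summed or naively averaged.
# Recompute it from the additive components instead.
overall_margin = (total_sales - total_cost) / total_sales
naive_average = sum((r["sales"] - r["cost"]) / r["sales"] for r in rows) / len(rows)

print(total_sales, closing_stock, overall_margin, naive_average)
```

The naive average of the per-row margins differs from the margin recomputed from totals, which is why ratios are usually stored as their additive components.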
  • 27. Fact tables CONTINUED…  A fact table is the primary table in a dimensional model where the numerical performance measurements of the business are stored. We use the term fact to represent a business measure.  We can imagine standing in the marketplace watching products being sold and writing down the quantity sold and dollar sales amount each day for each product in each store.
  • 28.  A measurement is taken at the intersection of all the dimensions (day, product, and store). This list of dimensions defines the grain of the fact table and tells us what the scope of the measurement is.  The most useful facts are numeric and additive, such as dollar sales amount.
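The marketplace picture can be written down directly: one fact row per (day, product, store), with the measures recorded at that intersection. The sketch below is illustrative only. The sample values and the helper `roll_up` are assumptions, and it shows why additive facts are convenient: they can be summed to any coarser level.

```python
from collections import defaultdict

# Grain: one row per (day, product, store). All values are made up.
facts = {
    ("2020-01-01", "pen", "store_1"): {"quantity": 3, "dollar_sales": 6.0},
    ("2020-01-01", "pen", "store_2"): {"quantity": 1, "dollar_sales": 2.0},
    ("2020-01-01", "ink", "store_1"): {"quantity": 2, "dollar_sales": 10.0},
    ("2020-01-02", "pen", "store_1"): {"quantity": 4, "dollar_sales": 8.0},
}


def roll_up(fact_rows, keep):
    """Sum dollar sales, keeping only the grain positions listed in `keep`.

    Positions: 0 = day, 1 = product, 2 = store.
    """
    totals = defaultdict(float)
    for key, measures in fact_rows.items():
        group = tuple(key[i] for i in keep)
        totals[group] += measures["dollar_sales"]
    return dict(totals)


sales_by_day = roll_up(facts, keep=(0,))
sales_by_product = roll_up(facts, keep=(1,))
print(sales_by_day)
print(sales_by_product)
```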
  • 30. Dimension Tables  Dimension tables are integral companions to a fact table. The dimension tables contain the textual descriptors of the business.  In a well-designed dimensional model, dimension tables have many columns or attributes. These attributes describe the rows in the dimension table.  Each dimension is defined by its single primary key, designated by the PK notation, which serves as the basis for referential integrity with any given fact table to which it is joined.
  • 31.  Dimension attributes serve as the primary source of query constraints, groupings, and report labels. In a query or report request, attributes are identified as the “by” words.  For example, when a user states that he or she wants to see dollar sales by week by brand, week and brand must be available as dimension attributes.  Dimension table attributes play a vital role in the data warehouse. Since they are the source of virtually all interesting constraints and report labels, they are key to making the data warehouse usable and understandable.
  • 33. Bringing Together Facts and Dimensions: Fact and dimension tables in a dimensional model
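A small star schema makes the relationship between facts and dimensions concrete. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and sample rows are assumptions for illustration, and the store dimension is left out to keep the example short. The query answers the request "dollar sales by week by brand": week and brand come from the dimension tables, and the measure comes from the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    week      TEXT
);
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    brand        TEXT
);
-- The fact table's primary key is made up of its foreign keys.
CREATE TABLE sales_fact (
    date_key     INTEGER REFERENCES date_dim(date_key),
    product_key  INTEGER REFERENCES product_dim(product_key),
    dollar_sales REAL,
    PRIMARY KEY (date_key, product_key)
);
INSERT INTO date_dim VALUES
    (1, '2020-01-06', '2020-W02'),
    (2, '2020-01-07', '2020-W02'),
    (3, '2020-01-13', '2020-W03');
INSERT INTO product_dim VALUES
    (1, 'Pen', 'Acme'),
    (2, 'Ink', 'Acme'),
    (3, 'Stapler', 'Bolt');
INSERT INTO sales_fact VALUES
    (1, 1, 10.0), (1, 3, 20.0), (2, 2, 5.0), (3, 1, 7.0), (3, 3, 3.0);
""")

# "Dollar sales by week by brand": the "by" words become GROUP BY columns.
sales_by_week_brand = conn.execute("""
    SELECT d.week, p.brand, SUM(f.dollar_sales)
    FROM sales_fact AS f
    JOIN date_dim AS d ON d.date_key = f.date_key
    JOIN product_dim AS p ON p.product_key = f.product_key
    GROUP BY d.week, p.brand
    ORDER BY d.week, p.brand
""").fetchall()

for row in sales_by_week_brand:
    print(row)
```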
  • 34. The Process of Data Warehouse Design  A data warehouse can be built using a top-down approach (starting with overall design and planning), a bottom-up approach (starting with rapid experiments and prototypes), or a combination of both.  From the software engineering point of view, the design and construction of a data warehouse may consist of the following steps:
  • 35.  planning,  requirements study,  problem analysis,  warehouse design,  data integration and testing, and  finally deployment of the data warehouse.  Large software systems can be developed using two methodologies: the waterfall method or the spiral method.
  • 36. Data warehouse architectures: Three-Tier Data Warehouse Architecture  Data warehouses generally adopt a three-tier architecture. The three tiers are as follows.  Bottom Tier - The bottom tier of the architecture is the data warehouse database server. This is typically a relational database system. We use back-end tools and utilities to feed data into the bottom tier.
  • 37.  Middle Tier - In the middle tier we have the OLAP server. The OLAP server can be implemented in either of the following ways.  By relational OLAP (ROLAP), which is an extended relational database management system. ROLAP maps the operations on multidimensional data to standard relational operations.  By multidimensional OLAP (MOLAP), which directly implements multidimensional data and operations.  Top Tier - This tier is the front-end client layer. This layer holds the query tools, reporting tools, analysis tools and data mining tools.
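To make the ROLAP idea concrete, the sketch below translates a multidimensional request (roll up by some dimensions, optionally slice on others) into an ordinary SQL `GROUP BY` query and runs it with Python's built-in `sqlite3` module. This is a toy illustration, not how any particular OLAP server works: the `sales` table, the `ALLOWED_DIMENSIONS` set, and the functions `rollup_sql` and `run` are all assumptions made for the example.

```python
import sqlite3

# Dimension names are checked against this set before being placed in SQL text.
ALLOWED_DIMENSIONS = {"year", "region", "product"}


def rollup_sql(group_by, slice_filter=None):
    """Translate a roll-up (and optional slice) into SQL text plus parameters."""
    slice_filter = slice_filter or {}
    if not group_by:
        raise ValueError("group_by must name at least one dimension")
    for name in list(group_by) + list(slice_filter):
        if name not in ALLOWED_DIMENSIONS:
            raise ValueError(f"unknown dimension: {name}")
    columns = ", ".join(group_by)
    sql = f"SELECT {columns}, SUM(amount) FROM sales"
    params = []
    if slice_filter:
        # Slice values are passed as bound parameters, not pasted into the SQL.
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in slice_filter)
        params = list(slice_filter.values())
    sql += f" GROUP BY {columns} ORDER BY {columns}"
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (2019, "East", "pen", 10),
        (2019, "West", "pen", 20),
        (2020, "East", "pen", 30),
        (2020, "East", "ink", 5),
        (2020, "West", "ink", 15),
    ],
)


def run(group_by, slice_filter=None):
    sql, params = rollup_sql(group_by, slice_filter)
    return conn.execute(sql, params).fetchall()


by_year = run(["year"])                                   # roll-up to year
east_by_product = run(["product"], {"region": "East"})    # slice, then roll-up
print(by_year)
print(east_by_product)
```

A MOLAP server, by contrast, would keep the data in a multidimensional array structure and answer the same request without generating SQL.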
  • 38. 7/20/2022 MIS7206-1 Simple Data warehouse Architecture [Figure: source systems (Registration System, Exam results System, Fees payment System) pass through Extract, Transform, Load (ETL), with a staging area, into the Data Warehouse (facts) and Data Marts. Remaining diagram labels: Information, Knowledge, Knowledge management, Statistical analysis, Data mining, Unstructured information, Performance management.]
  • 39. 7/20/2022 MIS7202-1 Data warehouse Architecture explained  Source systems: these are the different operational systems from which data is extracted.  ETL: this refers to the software tool used for extracting, transforming and loading data. Data can move from the source directly to the data warehouse, or first to a staging area database where data cleaning can be done.
  • 40.  In the data warehouse the data is arranged in a dimensional way, with facts and dimensions. Depending on the model used, the data warehouse could then be split into data marts to handle specific user needs.  A data mart is a subsection of a data warehouse customized for one business area.
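The flow described on these slides (source systems, staging area, data warehouse, data mart) can be sketched as four small Python functions. Everything here is an assumption made for illustration: the source records, the cleaning rules, and the function names are invented, and a real ETL tool would do far more.

```python
# Hypothetical raw records from two operational source systems.
REGISTRATION_SYSTEM = [
    {"student_id": " s001 ", "programme": "mis"},
    {"student_id": "S002", "programme": "MIS"},
    {"student_id": "S003", "programme": "bit"},
    {"student_id": "", "programme": "MIS"},  # bad record: missing id
]
FEES_SYSTEM = [
    {"student_id": "S001", "amount_paid": "500"},
    {"student_id": "s002", "amount_paid": "250.50"},
    {"student_id": "S003", "amount_paid": "300"},
    {"student_id": "S999", "amount_paid": "100"},  # no matching student
]


def extract():
    """Copy raw rows from the source systems into a staging area."""
    return {"registration": list(REGISTRATION_SYSTEM), "fees": list(FEES_SYSTEM)}


def clean_id(raw):
    """Standardise student ids so that both sources agree."""
    return raw.strip().upper()


def transform(staging):
    """Clean the staged rows and shape them into a dimension and a fact list."""
    students = {}
    for row in staging["registration"]:
        sid = clean_id(row["student_id"])
        if sid:  # drop rows with no id
            students[sid] = {"student_id": sid, "programme": row["programme"].strip().upper()}
    payments = []
    for row in staging["fees"]:
        sid = clean_id(row["student_id"])
        if sid in students:  # keep only payments that match a known student
            payments.append({"student_id": sid, "amount_paid": float(row["amount_paid"])})
    return students, payments


def load(students, payments):
    """Store the cleaned data in the (in-memory) warehouse."""
    return {"student_dim": students, "fees_fact": payments}


def build_data_mart(warehouse, programme):
    """A data mart: the slice of the warehouse for one business area."""
    keep = {sid for sid, s in warehouse["student_dim"].items() if s["programme"] == programme}
    return [f for f in warehouse["fees_fact"] if f["student_id"] in keep]


warehouse = load(*transform(extract()))
mis_mart = build_data_mart(warehouse, "MIS")
print(len(warehouse["student_dim"]), len(warehouse["fees_fact"]), mis_mart)
```

Cleaning happens on the staged copy, so the operational systems are never modified by the warehouse load.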
  • 42. Three Data Warehouse models  Enterprise warehouse  Data Mart  Virtual warehouse (make very brief notes)
  • 43. Up next:  DW SDLC  DW OLAP  Dimensional modelling and schemas  Next-generation DW