simpo313@gmail.com 1
12/07/2023
What is Business Intelligence?
 The term Business Intelligence (BI) refers collectively to the tools and technologies used for
the collection, integration, analysis, and visualization of data. The raw data that we
collect from different data sources is transformed into comprehensible, meaningful
information using BI technologies.
 To simplify the concept, we collect raw data from various sources and, with the
help of Business Intelligence tools, transform it into meaningful information. We
can store such data in data warehouses or data lakes in specific data structures.
 From the data warehouses, we can retrieve stored data in the form of reports,
queries, or dashboards to conduct data analysis. The data reaches the warehouse through
the process known as ETL (Extract, Transform, Load).
So What is Data Warehousing?
 Data warehousing is the process of storing data in data warehouses, which are
typically databases following the relational database model. Data is selected from
different data sources, aggregated, organized and managed to provide
meaningful insights for analysis and queries.
 A data warehouse is also referred to by several related terms, such as Decision Support
System (DSS), Executive Information System, Management Information
System, Business Intelligence Solution, and Analytic Application.
 We call it a Decision Support System because the insights and patterns
revealed by analyzing its data make important
business decisions easier and less risky.
How Does Data Warehousing Work?
 In data warehousing, data is often de-normalized, e.g. converted from 3NF to 2NF.
De-normalization increases data redundancy, so the stored data grows in size. This
trade-off is accepted because the main purpose of a data warehouse is to retrieve
processed data quickly.
 A warehouse also provides aggregate data such as totals, averages, and general trends for
enterprises to analyze and make decisions that are good for their business and functioning
in the industry.
Components of Data Warehouse
 Operational Systems: These are the different operational domains in an
enterprise, each of which serves a unique purpose and contributes in its own way to the
proper functioning of the enterprise.
 The different operational systems can be marketing, sales, Enterprise Resource
Planning (ERP), etc. Each of these systems has its own normalized database.
Integration Layer:
 The normalized data present in the operational systems must not be manipulated.
Instead, we take a copy of that data into an integration layer (staging area), where
we manipulate and transform it in specific ways.
 One basic operation is bringing the copied data into a single standardized
format because, in the operational systems, data is not present in the same format.
For instance, a monetary field can be in pounds in one table and in dollars in
another.
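The standardization step described above can be sketched as follows. This is a minimal illustration, not the deck's own method: the exchange rate, the table contents, and the names `to_usd`, `sales_uk`, `sales_us` and `staged` are all assumptions made up for the example, and US dollars is assumed to be the warehouse's standard currency.

```python
from decimal import Decimal

# Hypothetical fixed rate for illustration only; a real pipeline would use dated rates.
GBP_TO_USD = Decimal("1.25")


def to_usd(amount: Decimal, currency: str) -> Decimal:
    """Convert an amount to US dollars, the assumed warehouse standard."""
    if currency == "USD":
        return amount
    if currency == "GBP":
        return (amount * GBP_TO_USD).quantize(Decimal("0.01"))
    raise ValueError(f"unsupported currency: {currency}")


# Copies of rows from two operational tables (never the live tables themselves).
sales_uk = [{"order_id": 1, "amount": Decimal("100.00"), "currency": "GBP"}]
sales_us = [{"order_id": 2, "amount": Decimal("80.00"), "currency": "USD"}]

# Staging area: every row now carries its amount in one standardized currency.
staged = [
    {"order_id": row["order_id"], "amount_usd": to_usd(row["amount"], row["currency"])}
    for row in sales_uk + sales_us
]
```

Working on copies in a staging list keeps the operational rows untouched, which mirrors the rule that operational data must not be manipulated in place.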
Data Warehouse:
 The transformed and standardized data flows into the next element, known
as the data warehouse, which is a very large database. Data from all over
the enterprise is stored in this repository in second normal form,
with a uniform format and structure.
Data Marts:
 These are the purpose-specific sub-databases of the data warehouse, each containing only
part of the warehouse's data. Each data mart holds only the data that is useful for a particular
purpose; for example, there will be different data marts for analysis related to marketing,
finance, administration, etc.
 These databases do not overlap or share their data with each other, and operations
performed in one of them do not influence the others. Because each data mart is much smaller,
fetching data from a data mart is much faster than fetching it from the much larger data warehouse.
Business Intelligence and Data Warehousing
 Data warehousing and Business Intelligence often go hand in hand, because the
data made available in the data warehouses are central to the Business
Intelligence tools’ use.
 BI tools like Tableau, Sisense, Chartio, and Looker use data from the data
warehouses for purposes like querying, reporting, analytics, and data mining.
In any enterprise, Business Intelligence plays a central
role in smooth and cost-effective functioning.
 BI supports operational efficiency, which includes ERP
reporting, KPI tracking, risk management, product profitability, costing,
logistics, etc.
 BI helps in customer interaction, which includes sales analysis, sales forecasting,
segmentation, campaign planning, customer profitability, etc.
 From our prior discussions, we know that data warehouses store processed
and aggregated data. Business Intelligence tools require such data from the
data warehouses.
 BI tools access this data through Online Analytical Processing (OLAP). Data
warehousing and OLAP have proved to be a much-needed jump from the old
decision-making apps, which used OLTP (Online Transaction Processing).
Architecture and Process of Data
Warehousing and BI
 In this section, we will see how to extract, transform and load raw data into
data warehouses. We also discuss how BI tools use it for analytical purposes.
Refer to the image given below to understand the process better.
 Step 1: Extract raw data from data sources such as traditional databases, workbooks, Excel
files, etc.
 Step 2: The raw data collected from the different data sources is consolidated
and integrated to be stored in a special database called a data warehouse. The process
by which we fetch the data into the data warehouse from the sources is ETL (Extract,
Transform, Load). It extracts raw data from the original sources, transforms or
manipulates it in different ways, and loads it into the data warehouse.
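Steps 1 and 2 can be sketched end to end with Python's standard library. This is only an illustration under stated assumptions: the CSV text stands in for a source file, an in-memory SQLite database stands in for the warehouse, and the table name `fact_sales`, the column names, and the three function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (stands in for a workbook or CSV file).
RAW_CSV = """order_id,region,qty,unit_price
1,north,2,10.50
2,south,1,99.00
3,north,4,5.25
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cast types, normalize region names, derive sale_amount."""
    return [
        (int(r["order_id"]), r["region"].strip().upper(),
         int(r["qty"]) * float(r["unit_price"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(order_id INTEGER PRIMARY KEY, region TEXT, sale_amount REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # assumed stand-in for the data warehouse
load(warehouse, transform(extract(RAW_CSV)))

# A front-end style query against the loaded data: total sales per region.
totals = dict(warehouse.execute(
    "SELECT region, SUM(sale_amount) FROM fact_sales GROUP BY region"
))
```

Keeping extract, transform and load as separate functions makes it easy to swap the source or the target without touching the transformation rules.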
 Step 3: If you wish to use data from the data warehouse for specific purposes
like marketing analysis, financial analysis, etc., subsets of the data warehouse
known as data marts and data cubes are created. Data moving from the data
warehouse to the data marts also goes through ETL.
 Step 4: From both the data warehouse and the data marts, data is directed to data
cubes (OLAP cubes), which are multi-dimensional data sets whose data is ready to
be used by front-end BI tools or clients.
 At the front end are BI tools for querying, reporting, analysis,
and data mining. These BI tools query data from the OLAP cubes and use it for
analysis.
Summary
 Thus, Business Intelligence and Data Warehousing are two important pillars in the
survival of an enterprise. They help to keep a check on critical elements like CRM,
ERP, supply chain, products, and customers.
 Business Intelligence and Data Warehousing technologies give accurate,
comprehensive, integrated and up-to-date information on the current situation of
an enterprise, which supports taking the required steps and making important
decisions for the company’s growth.
Understanding the Data Warehouse: Its Features
 Definition
 A data warehouse is a Subject-Oriented, Integrated, Time-Variant and Non-Volatile
collection of data that supports management's decision-making process.
Food for thought:
 “What is the difference between data warehouses and operational databases?”
Data Warehouse—Subject-Oriented
 Organized around major subjects, such as customer, product, sales,
employees.
 This subject-specific design helps in reducing the query response
time by searching through very few records to get an answer to the
user's question.
Data Warehouse—Integrated
 Constructed by integrating multiple, heterogeneous data sources
 relational databases, flat files, on-line transaction records
 Data cleaning and data integration techniques are applied.
 Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among
different data sources
 E.g., when shortlisting your top 20 customers, you must know that “HAL” and “Hindustan Aeronautics
Limited” are one and the same. Much of the transformation and loading work that goes into the data
warehouse is centered on integrating data and standardizing it.
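The "HAL" example can be sketched as a small name-standardization step. The alias table, the `canonical_name` helper and the order amounts below are hypothetical; in practice such a mapping would be curated during data cleaning rather than hard-coded.

```python
# Hypothetical alias table mapping known spellings to one canonical customer name.
ALIASES = {
    "hal": "Hindustan Aeronautics Limited",
    "hindustan aeronautics ltd": "Hindustan Aeronautics Limited",
    "hindustan aeronautics limited": "Hindustan Aeronautics Limited",
}


def canonical_name(raw: str) -> str:
    """Return the standardized customer name, or the cleaned input if unknown."""
    cleaned = " ".join(raw.replace(".", "").split())  # drop dots, collapse spaces
    return ALIASES.get(cleaned.lower(), cleaned)


# Made-up orders from different source systems that spell the customer differently.
orders = [("HAL", 500), ("Hindustan Aeronautics Limited", 300), ("Acme", 200)]

# Revenue per customer after integration: both spellings land on one key.
revenue = {}
for name, amount in orders:
    key = canonical_name(name)
    revenue[key] = revenue.get(key, 0) + amount
```

Without the mapping, the two spellings would be ranked as separate customers and a top-20 list could be wrong.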
Data Warehouse—Time Variant/time
referenced data
 The time horizon for the data warehouse is significantly longer than
that of operational systems.
 Operational database: current value data.
 Data warehouse data: provide information from a historical perspective (e.g.,
past 5-10 years)
 For example, the user may ask “What were the total sales of product
A for the past three years on New Year’s Day across region Y?”
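A question like the one above could be answered with a query over historical rows. The sketch below uses an in-memory SQLite table; the table layout, the sample figures and the `new_year_sales` helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, region TEXT, amount REAL)"
)
# Made-up historical rows; an operational system would typically hold only current data.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2021-01-01", "A", "Y", 120.0),
        ("2022-01-01", "A", "Y", 150.0),
        ("2023-01-01", "A", "Y", 90.0),
        ("2023-01-02", "A", "Y", 500.0),  # not New Year's Day
        ("2023-01-01", "A", "Z", 70.0),   # different region
        ("2020-01-01", "A", "Y", 60.0),   # outside the three-year window
    ],
)


def new_year_sales(conn, product, region, years):
    """Total sales on 1 January of each listed year for one product and region."""
    dates = [f"{y}-01-01" for y in years]
    placeholders = ",".join("?" for _ in dates)
    row = conn.execute(
        f"SELECT COALESCE(SUM(amount), 0) FROM sales "
        f"WHERE product = ? AND region = ? AND sale_date IN ({placeholders})",
        [product, region, *dates],
    ).fetchone()
    return row[0]


total = new_year_sales(conn, "A", "Y", [2021, 2022, 2023])
```

The query only works because the warehouse keeps dated history; a current-value operational table could not answer it.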
And………
 Time-referenced data when analyzed can also help in spotting the hidden
trends between different associative data elements, which may not be
obvious to the naked eye. This exploration activity is termed “data mining”.
Data Warehouse—Non-Volatile
 Once data is in the warehouse, it does not change: historical data in the DW
should never be altered. This enables management to get a
consistent picture of the business.
Metadata
 Metadata is simply defined as data about data. For example, the index of a book serves
as metadata for the contents of the book. In other words, metadata
is the summarized data that leads us to the detailed data.
 In terms of the data warehouse, we can understand metadata as follows:
 Metadata is a road map to the data warehouse.
 Metadata acts as a directory. This directory helps the decision support system to
locate the contents of the data warehouse.
Metadata Repository
 The Metadata Repository is an integral part of a data warehouse system. The
Metadata Repository contains the following metadata:
 Business Metadata - This metadata has the data ownership information, business
definitions and changing policies.
 Operational Metadata - This metadata includes currency of data and data lineage.
Currency of data means whether the data is active or archived. Lineage of data means
the history of data migrated and the transformations applied to it.
 Data for mapping from operational environment to data warehouse -This
metadata includes source databases and their contents, data extraction, data
partition, cleaning, transformation rules, data refresh and purging rules.
 The algorithms for summarization - This includes dimension algorithms, data
on granularity, aggregation, summarizing etc.
Data cube
 A data cube helps us to represent data in multiple dimensions. The data cube is defined by
dimensions and facts.
Illustration of Data cube
 Suppose a company wants to keep track of sales records with the help of a sales data warehouse with
respect to time, item, branch and location. These dimensions allow the company to keep track of monthly
sales and of the branch at which the items were sold. There is a dimension table associated
with each dimension. This dimension table further describes the dimension. For example, the
"item" dimension table may have attributes such as item_name, item_type and item_brand.
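The sales illustration above can be sketched as a tiny in-memory cube. The attribute names `item_name`, `item_type` and `item_brand` come from the text; everything else (the figures, the branch codes, and the helpers `roll_up` and `slice_cube`) is invented for the example and is not a real OLAP engine.

```python
from collections import defaultdict

# Dimension table for "item" (attribute names follow the text; values are made up).
item_dim = {
    1: {"item_name": "Phone X", "item_type": "phone", "item_brand": "Acme"},
    2: {"item_name": "Modem Y", "item_type": "modem", "item_brand": "Acme"},
}

# Fact rows: (time, item_key, branch, location) -> units sold. All values are invented.
facts = {
    ("Q1", 1, "B1", "New Delhi"): 605,
    ("Q1", 2, "B1", "New Delhi"): 825,
    ("Q2", 1, "B1", "New Delhi"): 680,
    ("Q1", 1, "B2", "Mumbai"): 400,
}


def roll_up(facts, keep):
    """Aggregate the cube onto the dimensions named in `keep`."""
    names = ("time", "item", "branch", "location")
    idx = [names.index(n) for n in keep]
    out = defaultdict(int)
    for key, units in facts.items():
        out[tuple(key[i] for i in idx)] += units
    return dict(out)


def slice_cube(facts, location):
    """Fix one location, leaving a 2-D (time, item) view of units sold."""
    out = defaultdict(int)
    for (t, i, _branch, loc), units in facts.items():
        if loc == location:
            out[(t, i)] += units
    return dict(out)


by_time = roll_up(facts, ["time"])          # totals per quarter
delhi_2d = slice_cube(facts, "New Delhi")   # 2-D view for one location
```

Fixing one location gives a 2-D time-by-item view, while keeping location as a key gives the 3-D view of the same facts.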
 The following table represents a 2-D view of sales data for a company with
respect to the time, item and location dimensions.
But here in this 2-D table we have records with respect to time and item only.
The sales for New Delhi are shown with respect to the time and item dimensions,
according to the type of item sold.
Suppose we want to view the sales data with one more dimension, say the location
dimension. The 3-D view of the sales data with respect to time, item, and
location is shown in the table below:
The above 3-D table can be represented as a 3-D data cube, as
shown in the following figure:
Data mart
 A data mart contains a subset of organisation-wide data. This subset of data is valuable to
a specific group in the organisation. In other words, a data mart contains only the
data that is specific to a particular group.
 For example, the marketing data mart may contain only data related to items, customers
and sales. Data marts are confined to subjects.
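One simple way to picture a marketing data mart is as a table derived from the warehouse that keeps only the columns that group needs. The sketch below assumes an in-memory SQLite database; the table names `warehouse_sales` and `mart_marketing` and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse_sales (item TEXT, customer TEXT, amount REAL, cost_center TEXT);
INSERT INTO warehouse_sales VALUES
    ('phone', 'Asha', 300.0, 'FIN-01'),
    ('modem', 'Ravi', 120.0, 'FIN-02');
-- The marketing mart keeps only item, customer and sales data.
CREATE TABLE mart_marketing AS
    SELECT item, customer, amount FROM warehouse_sales;
""")

# Inspect the mart: which columns it carries and how many rows were copied.
mart_columns = [row[1] for row in conn.execute("PRAGMA table_info(mart_marketing)")]
mart_rows = conn.execute("SELECT COUNT(*) FROM mart_marketing").fetchone()[0]
```

The finance-only column `cost_center` stays in the warehouse and never reaches the marketing mart.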
Points to remember about data marts:
 Data marts are small in size.
 Data marts are customized by department.
 The source of a data mart is a departmentally structured data warehouse.
 Data marts are flexible.
Graphical Representation of data mart.
Process Flow in Data Warehouse:
The ETL Process
 Everyone understands the three letters:
 You get the data out of its original source location (E), you
do something to it (T), and then you load it (L) into a final
set of tables for the business users to query.
The ETL Process
 Extract, transform, and load (ETL) is a process in data warehousing that
involves:
 extracting data from source systems (these are typically the OLTP, or Online Transaction
Processing, systems);
 transforming the extracted data to match business needs;
 loading the transformed data into the data warehouse.
Extract
 The first part of an ETL process is to extract data from the source systems.
Data warehousing projects consolidate data from different source systems.
Each separate system may also use a different data format.
Transform
 This phase applies a series of rules or functions to the extracted data to
derive the required data format to be loaded in the data warehouse.
 Some data sources will require very little manipulation of data. In other
cases, one or more of the following transformation types may be required:
POSSIBLE DATA TRANSFORMATIONS
1. Selecting only certain columns to load (or selecting null columns not to
load)
2. Translating coded values (e.g., if the source system stores M for male and F
for female, but the warehouse stores 1 for male and 2 for female)
3. Deriving a new calculated value (e.g., sale_amount = qty * unit_price)
4. Summarizing multiple rows of data (e.g., total sales for each region)
5. Joining together data from multiple sources (e.g., lookup, merge,
etc.)
6. Splitting a column into multiple columns (e.g., putting a comma-separated
list specified as a string in one column as individual values
in different columns)
7. Generating surrogate key values
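Several of the transformations listed above can be sketched in a few lines of plain Python. The source rows, the column names and the `transform` helper are hypothetical; the M/F to 1/2 coding and the `sale_amount = qty * unit_price` derivation follow the examples given in the list.

```python
from itertools import count

# Hypothetical extracted rows; column names are assumptions for illustration.
source_rows = [
    {"cust": "C7", "gender": "M", "region": "east", "qty": 2, "unit_price": 5.0,
     "tags": "new,online", "notes": None},
    {"cust": "C9", "gender": "F", "region": "east", "qty": 1, "unit_price": 20.0,
     "tags": "repeat,store", "notes": None},
    {"cust": "C7", "gender": "M", "region": "west", "qty": 3, "unit_price": 4.0,
     "tags": "repeat,online", "notes": None},
]

GENDER_CODES = {"M": 1, "F": 2}   # translating coded values
surrogate_keys = {}               # natural key -> warehouse-maintained key
next_key = count(1)


def transform(row):
    """Apply the row-level transformations to one extracted row."""
    # Generate a surrogate key the first time a natural key is seen.
    if row["cust"] not in surrogate_keys:
        surrogate_keys[row["cust"]] = next(next_key)
    status, channel = row["tags"].split(",")  # splitting one column into two
    return {                                  # the null "notes" column is not loaded
        "customer_key": surrogate_keys[row["cust"]],
        "gender": GENDER_CODES[row["gender"]],
        "region": row["region"],
        "sale_amount": row["qty"] * row["unit_price"],  # derived value
        "status": status,
        "channel": channel,
    }


rows = [transform(r) for r in source_rows]

# Summarizing multiple rows: total sales for each region.
sales_by_region = {}
for r in rows:
    sales_by_region[r["region"]] = sales_by_region.get(r["region"], 0.0) + r["sale_amount"]
```

The surrogate key is assigned by the warehouse side, so the same customer keeps one key even if the source systems later change their own identifiers.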
Transformation types
 Data must be merged from different systems, e.g. one source may store the
same information with a different structure.
 Data must be scrubbed for inconsistencies, e.g. spelling errors or
variations. It is a good idea to use surrogate keys: keys maintained at the
data warehouse that are independent of keys from the data sources.
 Data must be pre-aggregated for faster analysis.
Load
 The load phase loads the data into the data warehouse. The loaded data can
be used to support BI, e.g. for reporting purposes.

Understanding the Architecture and Process of Data Warehousing and Business Intelligence

  • 2. What is Business Intelligence? The term Business Intelligence (BI) refers collectively to the tools and technologies used for the collection, integration, analysis, and visualization of data. BI technologies transform the raw data collected from different data sources into comprehensible, meaningful information.
  • 3. To simplify the concept, we collect raw data from various sources and, with the help of Business Intelligence tools, transform it into meaningful information. We can store such data in data warehouses or data lakes in specific data structures. From the data warehouses, we can retrieve the stored data as a report or a query result, or build a dashboard for data analysis. The data reaches the warehouse through the process known as ETL (Extract, Transform, Load).
  • 4. So what is Data Warehousing? Data warehousing is the process of storing data in data warehouses, which are databases that follow the relational database model. Data is selected from different data sources, then aggregated, organized, and managed to provide meaningful insights for analysis and queries.
  • 5. A data warehouse is known by several other terms, such as Decision Support System (DSS), Executive Information System, Management Information System, Business Intelligence Solution, and Analytic Application. We call it a Decision Support System because the insights and patterns revealed by analyzing its data make important business decisions easier and safer.
  • 6. How does Data Warehousing work? In data warehousing, data is de-normalized, for example converted from 3NF to 2NF. De-normalization increases data redundancy, so the data size grows, but queries need fewer joins. The main purpose of creating data warehouses is to retrieve processed data quickly, and to provide aggregate data such as totals, averages, and general trends that enterprises can analyze to make decisions that are good for their business and their functioning in the industry.
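As a minimal sketch of the idea on this slide, the snippet below de-normalizes two small tables into one wide table and then reads totals and averages from it without a join. The table names, column names, and rows are hypothetical, and the in-memory SQLite database only stands in for a real warehouse.

```python
import sqlite3

# Hypothetical normalized source tables (names and rows are illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, product_id INTEGER, amount REAL);
    INSERT INTO product VALUES (1, 'Pen', 'Stationery'), (2, 'Desk', 'Furniture');
    INSERT INTO sale VALUES (1, 1, 2.0), (2, 1, 4.0), (3, 2, 100.0);

    -- De-normalized copy: product attributes are repeated on every sale row,
    -- so later queries need no join (at the cost of redundancy).
    CREATE TABLE sale_wide AS
        SELECT s.sale_id, p.name, p.category, s.amount
        FROM sale AS s JOIN product AS p ON p.product_id = s.product_id;
""")

# Aggregates (totals and averages) read straight from the wide table.
summary = conn.execute(
    "SELECT category, SUM(amount), AVG(amount) FROM sale_wide "
    "GROUP BY category ORDER BY category"
).fetchall()
print(summary)
```

The category value is stored once per sale row in `sale_wide`, which is the redundancy the slide refers to.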
  • 7. Components of a Data Warehouse. Operational Systems: These are the different operational domains in an enterprise; each serves its own purpose and contributes in its own way to the proper functioning of the enterprise. Examples of operational systems are marketing, sales, and Enterprise Resource Planning (ERP). Each of these systems has its own normalized database.
  • 8. Integration Layer: The normalized data in the operational systems must not be manipulated. Instead, we take a copy of that data into a staging area in the integration layer, where we manipulate and transform it in specific ways. One basic operation is bringing the copied data into a single standardized format, because data in the operational systems is not all in the same format. For instance, the same data field can hold amounts in pounds in one table and in dollars in another.
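The pounds-versus-dollars case can be sketched as follows. This is an illustration only: the exchange rate, the row layout, and the function name `to_usd` are assumptions, and the source rows are copied rather than edited in place, as the slide requires.

```python
# Assumed exchange rate, for illustration only; a real pipeline would look it up.
GBP_TO_USD = 1.25

def to_usd(amount: float, currency: str) -> float:
    """Convert an amount to the single standard currency (USD here)."""
    if currency == "USD":
        return amount
    if currency == "GBP":
        return round(amount * GBP_TO_USD, 2)
    raise ValueError(f"unsupported currency: {currency}")

# Hypothetical rows as they sit in two operational systems.
source_rows = [
    {"system": "sales", "amount": 100.0, "currency": "GBP"},
    {"system": "erp", "amount": 80.0, "currency": "USD"},
]

# Staging area: build new rows; the operational data stays untouched.
staged = [
    {**row, "amount": to_usd(row["amount"], row["currency"]), "currency": "USD"}
    for row in source_rows
]
print(staged)
```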
  • 9. Data Warehouse: The transformed and standardized data flows into the next element, the data warehouse, which is a very large database. Data from all over the enterprise is stored in this data vault in second normal form, with a uniform format and structure.
  • 10. Data Marts: These are purpose-specific sub-databases of the data warehouse, each containing only part of the warehouse data. Each data mart holds only the data that is useful for a particular purpose; for example, there will be different data marts for analysis related to marketing, finance, and administration. These databases do not overlap or share data with each other, and operations performed in one do not influence the others. This makes fetching data from a data mart much faster than fetching it from the much larger data warehouse.
  • 11. Business Intelligence and Data Warehousing: Data warehousing and Business Intelligence often go hand in hand, because the data made available in data warehouses is central to the use of Business Intelligence tools. BI tools such as Tableau, Sisense, Chartio, and Looker use data from the data warehouses for purposes like querying, reporting, analytics, and data mining.
  • 12. In any enterprise, Business Intelligence plays a central role in smooth and cost-effective functioning. BI supports operational efficiency, which includes ERP reporting, KPI tracking, risk management, product profitability, costing, and logistics. It also helps in customer interaction, which includes sales analysis, sales forecasting, segmentation, campaign planning, and customer profitability.
  • 13. From our prior discussion, we know that data warehouses store processed and aggregated data. Business Intelligence tools require such data from the data warehouses. The data is delivered to them through Online Analytical Processing (OLAP). Data warehousing and OLAP have proved to be a much-needed jump from the old decision-making applications, which used Online Transaction Processing (OLTP).
  • 15. Architecture and Process of Data Warehousing and BI: In this section, we will see how to extract, transform, and load raw data into data warehouses. We also discuss how BI tools use it for analytical purposes. Refer to the image given below to understand the process better.
  • 17. Step 1: Extract raw data from data sources such as traditional data, workbooks, and Excel files. Step 2: The raw data collected from the different data sources is consolidated and integrated, then stored in a special database called a data warehouse. The process by which we bring the data into the data warehouse from the sources is ETL (Extract, Transform, Load): it extracts raw data from the original sources, transforms or manipulates it in different ways, and loads it into the data warehouse.
  • 18. Step 3: To use data from the data warehouse for specific purposes such as marketing analysis or financial analysis, subsets of the data warehouse are created, known as data marts and data cubes. Data moving from the data warehouse to the data marts also goes through ETL.
  • 19. Step 4: From both the data warehouse and the data marts, data is directed to data cubes (OLAP cubes), which are multi-dimensional data sets whose data is ready to be used by front-end BI tools or clients. At the front end are BI tools for querying, reporting, analysis, and data mining. These BI tools query data from the OLAP cubes and use it for analysis.
  • 20. Summary: Business Intelligence and Data Warehousing are thus two important pillars in the survival of an enterprise. They help keep a check on critical elements like CRM, ERP, supply chain, products, and customers. Business Intelligence and Data Warehousing technologies give accurate, comprehensive, integrated, and up-to-date information on the current situation of an enterprise, which supports taking the required steps and making important decisions for the company's growth.
  • 21. Understanding the Data Warehouse and its features. Definition: A data warehouse is a Subject-Oriented, Integrated, Time-Variant, and Non-Volatile collection of data that supports management's decision-making process. Food for thought: “What is the difference between data warehouses and operational databases?”
  • 22. Data Warehouse: Subject-Oriented. The warehouse is organized around major subjects, such as customer, product, sales, and employees. This subject-specific design helps reduce query response time, because only a few records need to be searched to answer the user's question.
  • 23. Data Warehouse: Integrated. The warehouse is constructed by integrating multiple, heterogeneous data sources: relational databases, flat files, and on-line transaction records. Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, etc. among the different data sources. E.g., when shortlisting your top 20 customers, you must know that “HAL” and “Hindustan Aeronautics Limited” are one and the same. Much of the transformation and loading work that goes into the data warehouse is centered on integrating data and standardizing it.
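The “HAL” example can be illustrated with a small name-standardization step. This is a sketch under assumptions: the alias table, the order amounts, and the extra customer “Acme” are hypothetical, and real integration work usually needs fuzzier matching than an exact lookup.

```python
from collections import defaultdict

# Hypothetical alias table mapping known variants to one canonical name.
ALIASES = {
    "hal": "Hindustan Aeronautics Limited",
    "hindustan aeronautics limited": "Hindustan Aeronautics Limited",
}

def canonical_name(raw: str) -> str:
    """Normalize whitespace and case, then map known aliases to one name."""
    cleaned = " ".join(raw.split())
    return ALIASES.get(cleaned.lower(), cleaned)

# Hypothetical order rows from two source systems: (customer name, order value).
orders = [
    ("HAL", 500),
    ("Hindustan  Aeronautics Limited", 300),
    ("Acme", 200),
]

totals = defaultdict(int)
for name, value in orders:
    totals[canonical_name(name)] += value

print(dict(totals))
```

Without the standardization step, the two spellings would appear as two separate customers in a top-customers ranking.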
  • 24. Data Warehouse: Time-Variant (time-referenced data). The time horizon for the data warehouse is significantly longer than that of operational systems. An operational database holds current-value data, whereas data warehouse data provides information from a historical perspective (e.g., the past 5-10 years). For example, the user may ask, “What were the total sales of product A for the past three years on New Year's Day across region Y?”
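The New Year's Day question above could be answered with a query along these lines. It is a sketch only: the `sales` table, its columns, the sample rows, and the fixed three-year window (2021 to 2023) are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, region TEXT, amount REAL)"
)
# Hypothetical historical rows; dates are ISO strings so SQLite can parse them.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2021-01-01", "A", "Y", 10.0),
        ("2022-01-01", "A", "Y", 20.0),
        ("2023-01-01", "A", "Y", 30.0),
        ("2023-01-02", "A", "Y", 99.0),  # not New Year's Day
        ("2023-01-01", "B", "Y", 50.0),  # different product
        ("2020-01-01", "A", "Y", 5.0),   # outside the three-year window
    ],
)

rows = conn.execute(
    """
    SELECT strftime('%Y', sale_date) AS year, SUM(amount)
    FROM sales
    WHERE product = 'A'
      AND region = 'Y'
      AND strftime('%m-%d', sale_date) = '01-01'
      AND sale_date BETWEEN '2021-01-01' AND '2023-12-31'
    GROUP BY year
    ORDER BY year
    """
).fetchall()
print(rows)
```

A query like this only works because the warehouse keeps dated history; an operational database holding current values could not answer it.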
  • 25. Time-referenced data, when analyzed, can also help in spotting hidden trends between different associated data elements, which may not be obvious to the naked eye. This exploration activity is termed “data mining”.
  • 26. Data Warehouse: Non-Volatile. Once data is in the warehouse, it does not change; historical data in the DW should never be changed. This enables management to get a consistent picture of the business.
  • 27. Metadata: Metadata is simply defined as data about data. For example, the index of a book serves as metadata for the contents of the book. In other words, metadata is the summarized data that leads us to the detailed data. In terms of a data warehouse, we can understand metadata as follows: metadata is a road map to the data warehouse, and it acts as a directory that helps the decision support system locate the contents of the data warehouse.
  • 28. Metadata Repository: The metadata repository is an integral part of a data warehouse system. It contains the following metadata: Business Metadata - This metadata has the data ownership information, business definitions, and changing policies. Operational Metadata - This metadata includes currency of data and data lineage. Currency of data means whether the data is active or archived. Lineage of data means the history of the data migrated and the transformations applied to it.
  • 29. Data for mapping from the operational environment to the data warehouse - This metadata includes source databases and their contents, data extraction, data partitioning, cleaning, transformation rules, and data refresh and purging rules. The algorithms for summarization - This includes dimension algorithms and data on granularity, aggregation, summarizing, etc.
  • 30. Data cube: A data cube helps us represent data in multiple dimensions. It is defined by dimensions and facts. Illustration of a data cube: Suppose a company wants to keep track of sales records, with the help of a sales data warehouse, with respect to time, item, branch, and location. These dimensions allow it to keep track of monthly sales and of the branch at which each item was sold. There is a dimension table associated with each dimension, which further describes that dimension. For example, the "item" dimension table may have attributes such as item_name, item_type, and item_brand.
  • 31. The following table represents a 2-D view of the sales data for a company with respect to the time, item, and location dimensions.
  • 32. But in this 2-D table we have records with respect to time and item only. The sales for New Delhi are shown with respect to the time and item dimensions, according to the type of item sold. Suppose we want to view the sales data with one more dimension, say the location dimension. The 3-D view of the sales data with respect to time, item, and location is shown in the table below:
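The move between the 2-D and 3-D views can be sketched in code by keying each fact on (time, item, location). The figures, the item names, and the second location "Mumbai" are hypothetical stand-ins, since the original tables are not reproduced here; only "New Delhi" comes from the slide.

```python
from collections import defaultdict

# Hypothetical fact rows: (time, item, location, units_sold).
facts = [
    ("Q1", "Mobile", "New Delhi", 600),
    ("Q1", "Modem", "New Delhi", 800),
    ("Q2", "Mobile", "New Delhi", 700),
    ("Q1", "Mobile", "Mumbai", 300),
    ("Q2", "Modem", "Mumbai", 400),
]

# The 3-D cube: one cell per (time, item, location) combination.
cube = defaultdict(int)
for time, item, location, units in facts:
    cube[(time, item, location)] += units

def slice_by_location(cube, location):
    """Fix the location dimension to get a 2-D (time, item) view."""
    return {(t, i): units for (t, i, loc), units in cube.items() if loc == location}

def roll_up_to_item(cube):
    """Aggregate away time and location, leaving totals per item."""
    totals = defaultdict(int)
    for (_, item, _), units in cube.items():
        totals[item] += units
    return dict(totals)

new_delhi_view = slice_by_location(cube, "New Delhi")
item_totals = roll_up_to_item(cube)
print(new_delhi_view)
print(item_totals)
```

Fixing one dimension gives the 2-D table for a single city, while the full dictionary corresponds to the 3-D view.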
  • 34. The above 3-D table can be represented as a 3-D data cube, as shown in the following figure:
  • 35. Data mart: A data mart contains a subset of organisation-wide data. This subset is valuable to a specific group within the organisation; in other words, a data mart contains only the data that is specific to a particular group. For example, the marketing data mart may contain only data related to items, customers, and sales. Data marts are confined to subjects.
  • 36. Points to remember about data marts: Data marts are small in size. Data marts are customized by department. The source of a data mart is a departmentally structured data warehouse. Data marts are flexible.
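A data mart as a department-specific subset can be sketched like this, following the marketing example of items, customers, and sales. The table contents, the `MART_SUBJECTS` mapping, and the `build_mart` helper are hypothetical; a real mart would be populated by an ETL job rather than an in-memory copy.

```python
import copy

# Hypothetical warehouse: subject name -> list of rows.
warehouse = {
    "item": [{"item_id": 1, "item_name": "Pen"}],
    "customers": [{"customer_id": 7, "name": "Asha"}],
    "sales": [{"item_id": 1, "customer_id": 7, "amount": 2.0}],
    "payroll": [{"employee_id": 3, "salary": 1000.0}],
}

# Which subjects each department's mart receives (an assumption for this sketch).
MART_SUBJECTS = {
    "marketing": ("item", "customers", "sales"),
    "hr": ("payroll",),
}

def build_mart(warehouse, department):
    """Copy only the department's subjects, so marts stay independent."""
    return {
        subject: copy.deepcopy(warehouse[subject])
        for subject in MART_SUBJECTS[department]
    }

marketing_mart = build_mart(warehouse, "marketing")
print(sorted(marketing_mart))
```

Because each mart is a copy of its own subjects, a change made inside one mart does not reach the warehouse or any other mart.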
  • 37. Graphical representation of a data mart.
  • 38. Process Flow in a Data Warehouse: The ETL Process. Everyone understands the three letters: you get the data out of its original source location (E), you do something to it (T), and then you load it (L) into a final set of tables for the business users to query.
  • 39. The ETL Process: Extract, transform, and load (ETL) is a process in data warehousing that involves: extracting data from source systems (these are the Online Transaction Processing (OLTP) systems); transforming the extracted data to match business needs; and loading the transformed data into the data warehouse.
  • 40. Extract: The first part of an ETL process is to extract data from the source systems. Data warehousing projects consolidate data from different source systems, and each separate system may use a different data format.
  • 41. Transform: This phase applies a series of rules or functions to the extracted data to derive the data in the format required for loading into the data warehouse. Some data sources will require very little manipulation of data. In other cases, one or more of the following transformation types may be required.
  • 42. Possible data transformations: 1. Selecting only certain columns to load (or selecting null columns not to load). 2. Translating coded values (e.g., if the source system stores M for male and F for female, but the warehouse stores 1 for male and 2 for female). 3. Deriving a new calculated value (e.g., sale_amount = qty * unit_price).
  • 43. 4. Summarizing multiple rows of data (e.g., total sales for each region). 5. Joining together data from multiple sources (e.g., lookup, merge, etc.). 6. Splitting a column into multiple columns (e.g., putting a comma-separated list specified as a string in one column as individual values in different columns). 7. Generating surrogate key values.
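Several of the transformations listed on these two slides can be sketched on a few in-memory rows: column selection, coded-value translation, a derived value, column splitting, and summarizing by region. The sample rows and the column names other than qty, unit_price, and sale_amount are assumptions made for illustration.

```python
from collections import defaultdict

GENDER_CODES = {"M": 1, "F": 2}  # coded-value translation from the slide

# Hypothetical extracted rows.
source_rows = [
    {"region": "North", "gender": "M", "qty": 2, "unit_price": 5.0, "phones": "111,222", "fax": None},
    {"region": "North", "gender": "F", "qty": 1, "unit_price": 20.0, "phones": "333", "fax": None},
    {"region": "South", "gender": "F", "qty": 3, "unit_price": 4.0, "phones": "444,555", "fax": None},
]

def transform(row):
    # Split the comma-separated column; pad so phone_2 always exists.
    phones = row["phones"].split(",") + [None]
    return {
        "region": row["region"],                         # selected column (the null "fax" column is dropped)
        "gender_code": GENDER_CODES[row["gender"]],      # translated coded value
        "sale_amount": row["qty"] * row["unit_price"],   # derived calculated value
        "phone_1": phones[0],
        "phone_2": phones[1],
    }

transformed = [transform(row) for row in source_rows]

# Summarize multiple rows: total sales for each region.
region_totals = defaultdict(float)
for row in transformed:
    region_totals[row["region"]] += row["sale_amount"]

print(transformed[0])
print(dict(region_totals))
```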
  • 44. Transformation types: Data must be merged from different systems, e.g. one source may store the same information with a different structure. Data must be scrubbed for inconsistencies, e.g. spelling errors or variations. It is a good idea to use surrogate keys: keys maintained at the data warehouse that are independent of keys from the data sources. Data must be pre-aggregated for faster analysis.
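One way to picture surrogate keys is a warehouse-side counter that hands out its own integer for each (source system, source key) pair. The class name `SurrogateKeyMap`, the system names, and the sample source keys below are hypothetical; production warehouses usually keep this mapping in a key table or a database sequence.

```python
import itertools

class SurrogateKeyMap:
    """Assigns warehouse-owned integer keys, independent of source keys."""

    def __init__(self):
        self._next_key = itertools.count(1)
        self._keys = {}

    def key_for(self, source_system: str, source_key: str) -> int:
        natural = (source_system, source_key)
        if natural not in self._keys:
            # First time this source record is seen: issue a new key.
            self._keys[natural] = next(self._next_key)
        return self._keys[natural]

keys = SurrogateKeyMap()
first = keys.key_for("crm", "C-17")
second = keys.key_for("erp", "17")
again = keys.key_for("crm", "C-17")
print(first, second, again)
```

The same source record always maps to the same warehouse key, even if the source system later changes its own key format.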
  • 45. Load: The load phase loads the data into the data warehouse. The loaded data can then be used to support BI, e.g. for reporting purposes.
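Putting the three phases together, a minimal end-to-end sketch might look like the following. It assumes a CSV export as the source and an in-memory SQLite table as the warehouse; the column names, the `fact_sales` table, and the sample values are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source export; a real extract would read from an OLTP system.
SOURCE_CSV = "order_id,qty,unit_price\n1,2,5.00\n2,1,20.00\n"

def extract(text):
    """E: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """T: cast types and derive sale_amount = qty * unit_price."""
    return [
        (int(row["order_id"]), int(row["qty"]) * float(row["unit_price"]))
        for row in rows
    ]

def load(conn, records):
    """L: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(order_id INTEGER PRIMARY KEY, sale_amount REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))

# A reporting query of the kind a BI tool might issue.
total = conn.execute("SELECT SUM(sale_amount) FROM fact_sales").fetchone()[0]
print(total)
```

Keeping extract, transform, and load as separate functions mirrors the three phases and lets each be swapped out independently.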