Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Agenda
 What Is Artificial Intelligence ?
 What Is Machine Learning ?
 Limitations Of Machine Learning
 Deep Learning To The Rescue
 What Is Deep Learning ?
 Deep Learning Applications
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Agenda
 What Is The Need For BI?
 What Is Data Warehousing?
 Key Terminologies Related To DWH Architecture
 OLTP Vs OLAP
 ETL
 Data Mart
 Metadata
 DWH Architecture
 Demo: Creating A DWH
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Why BI & Data Warehousing?
Let’s understand why Business Intelligence & Data
Warehousing are the foundation for any company’s success.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Why Business Intelligence?
Business Intelligence is the activity which contributes to the growth of any company.
Planning
Data
Gathering
Data
Analysis
Business
Action
Business Growth
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
What Is Business Intelligence?
BI is the act of transforming raw/ operational data into useful information for business analysis.
1. BI based on Data Warehouse technology extracts information from a company's operational systems.
2. The data is transformed (cleaned and integrated), and loaded into Data Warehouses.
3. Since this data is credible, it is used for business insights.
How Does It Work?
DB
1
DB
2
Data Warehouse
End Users
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
But Why Data Warehousing?
Let’s understand the challenges in achieving
Business Intelligence
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Why Data Warehouse?
➢ Data collected from various sources & stored in various databases cannot be directly visualized.
➢ The data first needs to be integrated and then processed before visualization takes place.
DataSources
Data Visualization
Data
Warehouse
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Let’s Understand What Is A
Data Warehouse
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
What Is A Data Warehouse?
➢ A central location where consolidated data from multiple locations (databases) are stored.
➢ DWH is maintained separately from an organization’s operational database.
➢ End users access it whenever any information is needed.
➢ Note:- Data Warehouse is not loaded every time new data is added to database.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
What Are The Advantages Of A Data Warehouse?
➢ Strategic questions can be answered by studying trends.
➢ Data Warehousing is faster and more accurate.
➢ Note:- Data Warehouse is not a product that a company can go and purchase, it needs to be designed & depends
entirely on the company’s requirement.
Data Warehouse
Take the data from
operational systems
Integrate the data
from multiple sources
Standardize the data &
remove inconsistencies
Store the data in format
suitable for easy access
Result
Query
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Properties Of A Data Warehouse
Integrated
Subject-oriented
“A Data Warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data in support of
management’s decision-making process.” -Bill Inmon, Father of Data Warehousing
Data is categorized and stored by business subject rather than by application.
Data on a given subject is collected from disparate sources and stored in a single place.
Time-variant
Data is stored as a series of snapshots, each representing a period of time.
Non-volatile
Typically data in the data warehouse is not updated or deleted.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Key Terminologies
Let’s understand some of the key terminologies related to
Data Warehousing
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Information Systems:- OLTP (DB) vs. OLAP (DWH)
Relational Database (OLTP) Analytical Data Warehouse (OLAP)
Contains current data Contains historical data
Useful in running the business Useful in analyzing the business
Based on Entity Relationship Model Based on Star, Snowflake and Fact Constellation Schema
Provides primitive and highly detailed data Provides summarized and consolidated data
Used for writing data into the database Used for reading data from the data warehouse
Database size ranges from 100 MB to 1 GB Data Warehouse size ranges from 100 GB to 1 TB
Fast; provides high performance Highly flexible; but not fast
Number of records accessed is in tens Number of records accessed is in millions
Ex: All bank transactions made by a customer Ex: Bank transactions made by a customer at a particular time.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Information Systems:- OLTP (DB) vs. OLAP (DWH)
OLTP Examples:
1. A supermarket server which records every single product purchased at that market.
2. A bank server which records every time a transaction is made for a particular account.
3. A railway reservation server which records the transactions of a passenger.
OLAP Examples:
1. Bank Manager wants to know how many customers are utilizing the ATM of his branch. Based on this he may take
a call whether to continue with the ATM or relocate it.
2. An insurance company wants to know the number of policies each agent has sold. This will help in better
performance management of agents.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
ETL  Extract, Transform & Load
ETL is the process of extracting the data from various sources, transforming this data to meet your requirement and
then loading it into a target data warehouse.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Data Mart
➢ Data mart is a smaller version of the Data Warehouse which deals with a single subject
➢ Data marts are focused on one area. Hence, they draw data from a limited number of sources
➢ Time taken to build Data Marts is very less compared to the time taken to build a Data Warehouse
Data Warehouse Data Marts
Enterprise wide data Department wide data
Multiple subject areas Single subject area
Multiple data sources Limited data sources
Occupies large memory Occupies limited memory
Longer time to implement Shorter time to implement
Data
Warehouse
Sales
Data
Marketing
Data
Operations
Data
Data Mart 1
Data Mart 2
Data Mart 3
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Types Of Data Mart
1. Dependent Data Mart
➢ The data is first extracted from the OLTP systems and then
populated in the central DWH
➢ From the DWH, the data travels to the Data Mart
2. Independent Data Mart
➢ The data is directly received from the source system
➢ This is suitable for small organizations or smaller groups within an
organization
3. Hybrid Data Mart
➢ The data is fed both from OLTP systems as well as the Data
Warehouse
OLTP
Source
Data
Warehouse
Data Mart
OLTP
Source
Data Mart
OLTP
Source
Data
Warehouse
Data Mart
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Metadata
➢ Metadata is defined as data about data.
➢ Metadata in a DWH defines the source data i.e. Flat File, Relational Database and other objects.
➢ Metadata is used to define which table is source and target, and which concept is used to build business logic called
transformation to the actual output.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Data Warehouse Architecture
Now let’s understand the architecture Data
Warehousing is based on.
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Data Warehouse Architecture
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Demo:-
Populating A Data Warehouse
By using Talend, let’s see how we can import data from
various data sources and create a DWH
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Hands-On:-
Problem Statement:
As a retail organization, you have details of 10,000 customer and 50,000 transactions. With this data you wish to find out
Customers who have low number of purchases.
Data
Warehouse
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Session In A Minute
Why Business Intelligence ? What is Business Intelligence? What Is Data Warehousing?
Key terminologies in DWH Data Warehousing architecture Hands-On
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Warehousing | Edureka

Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Warehousing | Edureka

  • 1.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Agenda  What Is Artificial Intelligence ?  What Is Machine Learning ?  Limitations Of Machine Learning  Deep Learning To The Rescue  What Is Deep Learning ?  Deep Learning Applications
  • 2.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Agenda  What Is The Need For BI?  What Is Data Warehousing?  Key Terminologies Related To DWH Architecture  OLTP Vs OLAP  ETL  Data Mart  Metadata  DWH Architecture  Demo: Creating A DWH
  • 3.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Why BI & Data Warehousing? Let’s understand why Business Intelligence & Data Warehousing are the foundation for any company’s success.
  • 4.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Why Business Intelligence? Business Intelligence is the activity which contributes to the growth of any company. Planning Data Gathering Data Analysis Business Action Business Growth
  • 5.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. What Is Business Intelligence? BI is the act of transforming raw/ operational data into useful information for business analysis. 1. BI based on Data Warehouse technology extracts information from a company's operational systems. 2. The data is transformed (cleaned and integrated), and loaded into Data Warehouses. 3. Since this data is credible, it is used for business insights. How Does It Work? DB 1 DB 2 Data Warehouse End Users
  • 6.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. But Why Data Warehousing? Let’s understand the challenges in achieving Business Intelligence
  • 7.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Why Data Warehouse? ➢ Data collected from various sources & stored in various databases cannot be directly visualized. ➢ The data first needs to be integrated and then processed before visualization takes place. DataSources Data Visualization Data Warehouse
  • 8.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Let’s Understand What Is A Data Warehouse
  • 9.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. What Is A Data Warehouse? ➢ A central location where consolidated data from multiple locations (databases) are stored. ➢ DWH is maintained separately from an organization’s operational database. ➢ End users access it whenever any information is needed. ➢ Note:- Data Warehouse is not loaded every time new data is added to database.
  • 10.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. What Are The Advantages Of A Data Warehouse? ➢ Strategic questions can be answered by studying trends. ➢ Data Warehousing is faster and more accurate. ➢ Note:- Data Warehouse is not a product that a company can go and purchase, it needs to be designed & depends entirely on the company’s requirement. Data Warehouse Take the data from operational systems Integrate the data from multiple sources Standardize the data & remove inconsistencies Store the data in format suitable for easy access Result Query
  • 11.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Properties Of A Data Warehouse Integrated Subject-oriented “A Data Warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data in support of management’s decision-making process.” -Bill Inmon, Father of Data Warehousing Data is categorized and stored by business subject rather than by application. Data on a given subject is collected from disparate sources and stored in a single place. Time-variant Data is stored as a series of snapshots, each representing a period of time. Non-volatile Typically data in the data warehouse is not updated or deleted.
  • 12.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Key Terminologies Let’s understand some of the key terminologies related to Data Warehousing
  • 13.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Information Systems:- OLTP (DB) vs. OLAP (DWH) Relational Database (OLTP) Analytical Data Warehouse (OLAP) Contains current data Contains historical data Useful in running the business Useful in analyzing the business Based on Entity Relationship Model Based on Star, Snowflake and Fact Constellation Schema Provides primitive and highly detailed data Provides summarized and consolidated data Used for writing data into the database Used for reading data from the data warehouse Database size ranges from 100 MB to 1 GB Data Warehouse size ranges from 100 GB to 1 TB Fast; provides high performance Highly flexible; but not fast Number of records accessed is in tens Number of records accessed is in millions Ex: All bank transactions made by a customer Ex: Bank transactions made by a customer at a particular time.
  • 14.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Information Systems:- OLTP (DB) vs. OLAP (DWH) OLTP Examples: 1. A supermarket server which records every single product purchased at that market. 2. A bank server which records every time a transaction is made for a particular account. 3. A railway reservation server which records the transactions of a passenger. OLAP Examples: 1. Bank Manager wants to know how many customers are utilizing the ATM of his branch. Based on this he may take a call whether to continue with the ATM or relocate it. 2. An insurance company wants to know the number of policies each agent has sold. This will help in better performance management of agents.
  • 15.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. ETL  Extract, Transform & Load ETL is the process of extracting the data from various sources, transforming this data to meet your requirement and then loading it into a target data warehouse.
  • 16.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Data Mart ➢ Data mart is a smaller version of the Data Warehouse which deals with a single subject ➢ Data marts are focused on one area. Hence, they draw data from a limited number of sources ➢ Time taken to build Data Marts is very less compared to the time taken to build a Data Warehouse Data Warehouse Data Marts Enterprise wide data Department wide data Multiple subject areas Single subject area Multiple data sources Limited data sources Occupies large memory Occupies limited memory Longer time to implement Shorter time to implement Data Warehouse Sales Data Marketing Data Operations Data Data Mart 1 Data Mart 2 Data Mart 3
  • 17.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Types Of Data Mart 1. Dependent Data Mart ➢ The data is first extracted from the OLTP systems and then populated in the central DWH ➢ From the DWH, the data travels to the Data Mart 2. Independent Data Mart ➢ The data is directly received from the source system ➢ This is suitable for small organizations or smaller groups within an organization 3. Hybrid Data Mart ➢ The data is fed both from OLTP systems as well as the Data Warehouse OLTP Source Data Warehouse Data Mart OLTP Source Data Mart OLTP Source Data Warehouse Data Mart
  • 18.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Metadata ➢ Metadata is defined as data about data. ➢ Metadata in a DWH defines the source data i.e. Flat File, Relational Database and other objects. ➢ Metadata is used to define which table is source and target, and which concept is used to build business logic called transformation to the actual output.
  • 19.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Data Warehouse Architecture Now let’s understand the architecture Data Warehousing is based on.
  • 20.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Data Warehouse Architecture
  • 21.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Demo:- Populating A Data Warehouse By using Talend, let’s see how we can import data from various data sources and create a DWH
  • 22.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Hands-On:- Problem Statement: As a retail organization, you have details of 10,000 customer and 50,000 transactions. With this data you wish to find out Customers who have low number of purchases. Data Warehouse
  • 23.
    Copyright © 2017,edureka and/or its affiliates. All rights reserved. Session In A Minute Why Business Intelligence ? What is Business Intelligence? What Is Data Warehousing? Key terminologies in DWH Data Warehousing architecture Hands-On