By -   Shaik Yasir Ahmed
Database (DB) –
A database is a place where a collection of records is maintained in a structured format so that
it can be easily retrieved whenever required.

One of the most widely used database
models is the relational model. It was developed
by Edgar Codd in 1969.

Example:
How do you think organizations store
their employee and customer information?
They store it in a database.
Where do you think a website maintains the
login information about its users?
It stores it in a database.
ERP –
ERP, which is an abbreviation for Enterprise
Resource Planning, is principally an integration
of business management practices and modern
technology.
ERP is a business tool that management uses to
operate the business day-in and day-out.

OLTP –
OLTP, which is an abbreviation for Online Transaction
Processing, handles real-time transactions, which inherently
have some special requirements. If you are running a bank, for
instance, you need to ensure that as people withdraw
money from ATMs, the database is updated properly and
efficiently and those transactions are correctly reflected in
their accounts.
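
The ATM requirement above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a banking implementation: the `accounts` and `withdrawals` tables and the `withdraw` function are hypothetical names chosen for the example. The point it shows is that the balance update and the withdrawal record succeed or fail together.

```python
import sqlite3

def withdraw(conn, account_id, amount):
    """Debit an account and log the withdrawal as one atomic transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT balance FROM accounts WHERE account_id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown account")
        if row[0] < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, account_id),
        )
        conn.execute(
            "INSERT INTO withdrawals (account_id, amount) VALUES (?, ?)",
            (account_id, amount),
        )

# Hypothetical in-memory bank with a single account holding 500.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE withdrawals (account_id INTEGER, amount INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 500)")
conn.commit()

withdraw(conn, 1, 200)
balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_id = 1"
).fetchone()[0]
print(balance)
```

A withdrawal larger than the balance raises an error and leaves both tables unchanged.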
Data, Data everywhere yet ...

     • I can’t find the data I need
        – data is scattered over the network
     • I can’t get the data I need
        – need an expert to get the data
     • I can’t understand the data I found
        – available data poorly documented
     • I can’t use the data I found
        – results are unexpected
        – data needs to be transformed from one
          form to another

What are the users saying...
     • Data should be integrated across
       the enterprise
     • Summary data has a real value to
       the organization
     • Historical data holds the key to
       understanding data over time
     • What-if capabilities are required



How can I answer the above questions with my
OLTP system?

Is data warehousing the solution? YES

Can I improve my business using data warehousing? YES. How?



A data warehouse helps any business in many ways
Let’s say a producer wants to know...

     • Which are our lowest/highest margin customers?
     • Who are my customers and what products are they buying?
     • What is the most effective distribution channel?
     • What product promotions have the biggest impact on revenue?
     • Which customers are most likely to go to the competition?
     • What impact will new products/services have on revenue and margins?
DWH – (Data Warehousing)
A data warehouse usually contains historical data derived from transaction data, but it can include data
from other sources. It separates analysis workload from transaction workload and enables
an organization to consolidate data from several sources.

Ralph Kimball –
     In simplest terms, a data warehouse can be
defined as a collection of data marts.
     - Data mart: a subject-specific collection of data.
Bill Inmon –
     A data warehouse is a “subject-oriented,
integrated, time-variant and nonvolatile collection
of data in support of management’s decision-making
process.”
OLAP – (Online Analytical Processing)
The ability to analyze metrics in different dimensions such as time, geography, gender,
product, etc. For example, sales for the company are up. What region is most responsible for
this increase? Which store in this region is most responsible for the increase? What
particular product category or categories contributed the most to the increase? Answering
these types of questions in order means that you are performing an OLAP analysis.
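
The sequence of questions above (region, then store, then product category) can be sketched in a few lines of Python. The sales figures, store codes and function names below are made up for illustration. Each step sums the increase along one dimension, picks the largest contributor, and narrows the rows before asking the next question.

```python
from collections import defaultdict

# Hypothetical sales increases (this year minus last year), one row per
# region / store / product category.
sales_increase = [
    {"region": "North", "store": "N1", "category": "Toys", "increase": 40},
    {"region": "North", "store": "N1", "category": "Books", "increase": 10},
    {"region": "North", "store": "N2", "category": "Toys", "increase": 5},
    {"region": "South", "store": "S1", "category": "Toys", "increase": 15},
    {"region": "South", "store": "S1", "category": "Books", "increase": 20},
]

def top_member(rows, dimension):
    """Sum the increase by one dimension and return the largest contributor."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row["increase"]
    return max(totals, key=totals.get)

def drill_down(rows, dimensions):
    """Answer the questions in order, narrowing the rows at each level."""
    path = []
    for dimension in dimensions:
        winner = top_member(rows, dimension)
        path.append((dimension, winner))
        rows = [row for row in rows if row[dimension] == winner]
    return path

path = drill_down(sales_increase, ["region", "store", "category"])
print(path)
```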

OLAP servers provide better performance
for accessing multidimensional data. The
most important mechanism in OLAP that
allows it to achieve such performance is the
use of aggregations.

Aggregations are built from the fact table by
changing the granularity on specific
dimensions and aggregating up data along
these dimensions. 
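
As a rough sketch of that idea, the Python below builds two aggregations from a small, made-up fact table. The row layout and the `aggregate` helper are assumptions for the example, not part of any OLAP product. The grain is coarsened from day to month on the time dimension and from product to category on the product dimension.

```python
from collections import defaultdict

# Hypothetical fact rows at the finest grain: one row per day and product.
fact_rows = [
    {"date": "2024-01-05", "product": "Pen", "category": "Stationery", "amount": 10},
    {"date": "2024-01-20", "product": "Pen", "category": "Stationery", "amount": 30},
    {"date": "2024-01-20", "product": "Ball", "category": "Toys", "amount": 25},
    {"date": "2024-02-02", "product": "Ink", "category": "Stationery", "amount": 15},
]

def aggregate(rows, key_func):
    """Sum the amount measure for every distinct key produced by key_func."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_func(row)] += row["amount"]
    return dict(totals)

# Day -> month on the time dimension, product -> category on the product dimension.
by_month_category = aggregate(fact_rows, lambda r: (r["date"][:7], r["category"]))

# Coarser still: drop the product dimension and keep only the month.
by_month = aggregate(fact_rows, lambda r: r["date"][:7])

print(by_month_category)
print(by_month)
```

Storing such totals ahead of time means a query at month or category level reads a few aggregate rows instead of scanning every fact row.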

OLAP systems give analytical capabilities
that are not available in SQL or are more difficult to
obtain.
1. OLTP (on-line transaction processing)      1. OLAP (on-line analytical processing)

2. Day-to-day operations: purchasing,         2. Data analysis and decision making.
inventory, banking, manufacturing, payroll,
registration, accounting, etc.

3. The tables are in normalized form.         3. The tables are in de-normalized form.

4. The storage objects are called             4. The storage objects are called
tables, i.e., all the masters and the         dimensions and facts, i.e., all the masters
transactions are stored in tables.            are dimensions and the transactions are
                                              facts.

5. OLTP is designed using data                5. OLAP is designed using
modeling.                                     dimension modeling.
                                              OLAP is classified into two types,
                                              MOLAP and ROLAP.
Normalized Tables
     Product:  Prod_Id, Prod_Name, Base_Rate, Cat_Id
     Category: Cat_Id, Cat_Name, Cat_Desc, Group_Id
     Group:    Group_Id, Group_Name, Group_Desc

De-Normalized Tables
     Product_Dim: Prod_Id, Prod_Name, Base_Rate, Cat_Name, Cat_Desc, Group_Name, Group_Desc

Topics We Will Cover Later
     1. Types of Dimensions
     2. Slowly Changing Dimensions
     3. Hierarchies
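
To make the normalized versus de-normalized comparison above concrete, here is a small Python sketch that flattens Product, Category and Group rows into Product_Dim rows. The column names come from the slide. The sample values and the `build_product_dim` function are invented for the example.

```python
# Hypothetical rows for the three normalized tables.
products = [
    {"Prod_Id": 1, "Prod_Name": "Pen", "Base_Rate": 10, "Cat_Id": 100},
    {"Prod_Id": 2, "Prod_Name": "Ball", "Base_Rate": 25, "Cat_Id": 200},
]
categories = {
    100: {"Cat_Name": "Stationery", "Cat_Desc": "Office supplies", "Group_Id": 7},
    200: {"Cat_Name": "Toys", "Cat_Desc": "Children's toys", "Group_Id": 8},
}
groups = {
    7: {"Group_Name": "Office", "Group_Desc": "Office products"},
    8: {"Group_Name": "Leisure", "Group_Desc": "Leisure products"},
}

def build_product_dim(products, categories, groups):
    """Follow Cat_Id and Group_Id once, at load time, and copy the names in."""
    dim = []
    for product in products:
        category = categories[product["Cat_Id"]]
        group = groups[category["Group_Id"]]
        dim.append({
            "Prod_Id": product["Prod_Id"],
            "Prod_Name": product["Prod_Name"],
            "Base_Rate": product["Base_Rate"],
            "Cat_Name": category["Cat_Name"],
            "Cat_Desc": category["Cat_Desc"],
            "Group_Name": group["Group_Name"],
            "Group_Desc": group["Group_Desc"],
        })
    return dim

product_dim = build_product_dim(products, categories, groups)
print(product_dim[0])
```

The lookups are paid once during loading, so reports can read category and group names from Product_Dim without joining three tables.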
SalesOrderDetails
     Cust_Id, SalesPerson, Prod_Id, Order_Date, Booked_Date, Delivery_Date, Unit_Price, Qty, Tax, Created_By

SalesOrder_Fact
     Reference keys of dimensions: Cust_Id, Prod_Id, Order_Date, Delivery_Date
     Numeric fields, called facts or measures: Unit_Price, Qty, Total_Amount, Tax

Qty * Unit_Price + Tax = Total_Amount
All calculations are usually performed before the data is stored in OLAP.
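
The calculation above can be sketched as a load step in Python. The `to_fact_row` function and the sample order are hypothetical. The sketch maps one SalesOrderDetails row to a SalesOrder_Fact row and computes Total_Amount once, before the row is stored.

```python
def to_fact_row(order):
    """Map one SalesOrderDetails row to a SalesOrder_Fact row."""
    return {
        # Reference keys of dimensions
        "Cust_Id": order["Cust_Id"],
        "Prod_Id": order["Prod_Id"],
        "Order_Date": order["Order_Date"],
        "Delivery_Date": order["Delivery_Date"],
        # Measures
        "Unit_Price": order["Unit_Price"],
        "Qty": order["Qty"],
        # Computed at load time: Qty * Unit_Price + Tax = Total_Amount
        "Total_Amount": order["Qty"] * order["Unit_Price"] + order["Tax"],
        "Tax": order["Tax"],
    }

# Hypothetical source row with made-up values.
order = {
    "Cust_Id": 11, "SalesPerson": "Asha", "Prod_Id": 1,
    "Order_Date": "2024-01-05", "Booked_Date": "2024-01-04",
    "Delivery_Date": "2024-01-09", "Unit_Price": 20, "Qty": 3,
    "Tax": 6, "Created_By": "system",
}

fact_row = to_fact_row(order)
print(fact_row["Total_Amount"])
```

Columns such as SalesPerson, Booked_Date and Created_By are left out because they do not appear in SalesOrder_Fact on the slide.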
STAR Schema
     SalesOrder_Fact: Cust_Id, Prod_Id, Order_Date, Delivery_Date, Org_Id, Unit_Price, Qty, Total_Amount, Tax
     Prod_Dim: Prod_Id, ………
     Org_Dim:  Org_Id, ………
     Cust_Dim: Cust_Id, ………
     Time_Dim: Date, Year, Month, ………
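
A query against a star schema typically joins the fact table to the dimensions it needs and sums a measure. The sketch below uses Python's sqlite3 module with a cut-down version of the schema above. It keeps only Prod_Dim and Time_Dim and a few fact columns, takes Prod_Name from the earlier Product_Dim slide, and joins on Order_Date, which is an assumption for the example. All data values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Prod_Dim (Prod_Id INTEGER PRIMARY KEY, Prod_Name TEXT);
CREATE TABLE Time_Dim ("Date" TEXT PRIMARY KEY, Year INTEGER, Month INTEGER);
CREATE TABLE SalesOrder_Fact (
    Prod_Id INTEGER REFERENCES Prod_Dim (Prod_Id),
    Order_Date TEXT REFERENCES Time_Dim ("Date"),
    Qty INTEGER,
    Total_Amount INTEGER
);
INSERT INTO Prod_Dim VALUES (1, 'Pen'), (2, 'Ball');
INSERT INTO Time_Dim VALUES ('2024-01-05', 2024, 1), ('2024-02-10', 2024, 2);
INSERT INTO SalesOrder_Fact VALUES
    (1, '2024-01-05', 3, 66),
    (1, '2024-02-10', 1, 22),
    (2, '2024-02-10', 2, 50);
""")

# Total sales per month and product: the fact joins directly to each dimension.
rows = conn.execute("""
    SELECT t.Year, t.Month, p.Prod_Name, SUM(f.Total_Amount) AS Total_Amount
    FROM SalesOrder_Fact AS f
    JOIN Prod_Dim AS p ON p.Prod_Id = f.Prod_Id
    JOIN Time_Dim AS t ON t."Date" = f.Order_Date
    GROUP BY t.Year, t.Month, p.Prod_Name
    ORDER BY t.Year, t.Month, p.Prod_Name
""").fetchall()

for row in rows:
    print(row)
```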
Product_Dim   SalesOrder_Fact
Prod_Id       Cust_Id
Prod_Name     Prod_Id
Base_Rate     Order_Date
Cat_Name      Delivery_Date
Cat_Desc      Unit_Price
Group_Name    Qty
Group_Desc    Total_Amount
              Tax
Star schema                    Snowflake schema
1. Dimensions are related      1. Dimensions are related to
only to the fact.              tables other than the fact.
(De-normalized model)          (Normalized model)
2. One-to-many or one-to-      2. Used for many-to-many
one relations occur.           relations.
3. Performance is fast but     3. Performance is lower but
requires huge storage space.   requires less storage space.
A single, complete and
consistent store of data
obtained from a variety of
different sources made
available to end users in a way
they can understand and use in
a business context.

            [Barry Devlin]

Data Warehousing --
    It is a process
    • Technique for assembling and
      managing data from various sources
      for the purpose of answering
      business questions, thus enabling
      decisions that were not previously
      possible
    • A decision support database
      maintained separately from the
      organization’s operational database

Data mining also works with warehouse data
    • Data warehousing provides the
      enterprise with a memory
    • Data mining provides the
      enterprise with intelligence
Product price build-up (add-ons per layer, running total in brackets)

Layer                  (unlabeled)        Oracle 10g                      IBM DB2
Base Product           $25K               $40K                            $25K
Manageability          included [$25K]    Tuning $3K, Diagnostics $3K,    Performance Expert $10K
                                          Partitioning $10K [$56K]        [$35K]
Business Intelligence  included [$25K]    OLAP $20K, Mining $20K,         DB2 OLAP $35K, DB2 Warehouse $75K,
                                          BI Bundle $20K [$116K]          Cube Views $9.5K [$154.5K]
High Availability      [$25K]             Data Guard $116K [$232K]        Recovery Expert $10K [$164.5K]
Multi-core             [$25K]             $116K - $232K [$348K - $464K]   $164.5K [$329K]
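
The running totals in the price figures above can be checked with a few lines of Python. This reads the chart as per-layer add-ons stacked on a base price, which is an interpretation of the slide layout rather than something the slides state outright. Amounts are in thousands of dollars, and the multi-core layer is left out because it is given as a range for Oracle.

```python
# Add-on prices per layer in $K, as read from the chart (interpretation).
addons_k = {
    "Oracle 10g": [
        ("Base Product", 40),
        ("Manageability", 3 + 3 + 10),           # Tuning, Diagnostics, Partitioning
        ("Business Intelligence", 20 + 20 + 20),  # OLAP, Mining, BI Bundle
        ("High Availability", 116),               # Data Guard
    ],
    "IBM DB2": [
        ("Base Product", 25),
        ("Manageability", 10),                    # Performance Expert
        ("Business Intelligence", 35 + 75 + 9.5), # DB2 OLAP, DB2 Warehouse, Cube Views
        ("High Availability", 10),                # Recovery Expert
    ],
}

def running_totals(layers):
    """Return the cumulative price after each layer is added."""
    totals = []
    total = 0
    for _layer, price in layers:
        total += price
        totals.append(total)
    return totals

oracle_totals = running_totals(addons_k["Oracle 10g"])
db2_totals = running_totals(addons_k["IBM DB2"])
print(oracle_totals)
print(db2_totals)
```

Under this reading the totals match the figures printed on the slides: 40, 56, 116 and 232 for Oracle 10g, and 25, 35, 154.5 and 164.5 for IBM DB2.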
Data warehouse layers (top to bottom)
    • Data Analysis: Reporting, OLAP, Data Mining
    • Data Storage: Repository
    • Data Migration: Middleware (Population Tools)
    • Operational Data Sources
[Figure: stages "What happened?", "Why did it happen?", "What happened, why and how?" and "What will happen?"; axis labels: Number of Users, Additional Benefit]
[Diagram: MSBI architecture]
OLTP -> SSIS -> Stage DB (optional) -> SSIS -> Data Marts -> SSAS -> Cube (OLAP: ROLAP / MOLAP) -> SSRS
    • SSIS: Integration Services
    • SSAS: Analysis Services
    • SSRS: Reporting Services
OLTP – Online Transaction Processing
OLAP – Online Analytical Processing
MOLAP – Multidimensional OLAP
ROLAP – Relational OLAP
HOLAP – Hybrid OLAP
Dimensions – De-normalized master tables
Attributes – Columns of dimensions
Hierarchies – Sequential order of attributes
Facts (Measure group) – Transaction tables in the DWH
Fact (Measures) – Numeric fields of a fact table
Cubes – Multidimensional storage of data
KPIs – Key Performance Indicators
Dashboards – Combination of reports, KPIs and charts
Data Marts – Subject-specific collection of data
SCDs – Slowly Changing Dimensions
Perspectives – Child cube
Introduction To Msbi By Yasir