Business Intelligence: A Review
Prof. Swanand Deodhar [email_address]

Business Intelligence
  • An information system that can be used to analyze large datasets
  • Such analyses help in decision making because:
      • Data relates directly to the business
      • Data provides an objective basis for decision making
      • Numbers do not lie!
  • Common sectors where business intelligence is employed:
      • Organized retail (for example, market basket analysis)
      • BFSI, i.e. banking, financial services, and insurance (for example, credit ratings)
      • IT security (for example, detecting network intrusion)
      • Online marketing (for example, assessing consumers' opinions)

Data Warehouse
  • A business intelligence exercise involves two parts:
      • Data warehouse
      • Data mining
  • A data warehouse compiles data from multiple sources and transforms it into a uniform collection
  • The uniformity is in the structure of the data (for example, the naming of columns)

Data Mining
  • Data mining involves analyzing the data collected in the data warehouse
  • Business applications include:
      • Forecasting
      • Segmentation
      • Market basket analysis
      • Intrusion detection (IT security related)
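
As one illustration of the applications listed above, market basket analysis can start from counting how often pairs of items are bought together. The sketch below is a minimal version of that idea; the transactions and the `pair_support` helper are hypothetical and not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set holds the items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "butter"},
]

def pair_support(baskets):
    """Return, for each item pair, the fraction of baskets containing both items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

support = pair_support(transactions)
for pair, value in sorted(support.items()):
    print(pair, value)
```

Pairs with high support are candidates for cross-selling or shelf placement decisions; a fuller analysis would also look at measures such as confidence and lift.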

Business Intelligence Process
  [Diagram with the components: Source Systems, ETL, Data Warehouse, Cubes, Analytics, Data Mining, Metadata]

Data Warehouse
  • A data warehouse has two kinds of tables: facts and dimensions
  • Facts
      • These tables record actual data
      • Most columns in fact tables can be subjected to mathematical operations
      • For example: actual sales, share prices, commodity prices, employee salaries
  • Dimensions
      • These tables provide a lens for examining the facts
      • These tables include descriptive and categorical data
      • A single dimension table should not reflect multiple dimensions
      • For example: product dimension, time dimension, demographic dimension
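
The relationship between the two kinds of tables can be sketched with plain Python structures. The table contents, column names, and the `total_by` helper below are assumptions made for illustration: the fact rows hold numeric measures, and the dimension rows supply the categories used to examine them.

```python
from collections import defaultdict

# Dimension table (hypothetical): descriptive, categorical attributes per product id.
dim_product = {
    1: {"name": "Running shoe", "category": "Footwear"},
    2: {"name": "Football", "category": "Equipment"},
    3: {"name": "Sandal", "category": "Footwear"},
}

# Fact table (hypothetical): numeric measures plus a key into the dimension.
fact_sales = [
    {"product_id": 1, "units": 3, "amount": 240.0},
    {"product_id": 2, "units": 1, "amount": 25.0},
    {"product_id": 3, "units": 2, "amount": 60.0},
    {"product_id": 1, "units": 1, "amount": 80.0},
]

def total_by(attribute):
    """Sum the sales amount, grouped by one attribute of the product dimension."""
    totals = defaultdict(float)
    for row in fact_sales:
        key = dim_product[row["product_id"]][attribute]
        totals[key] += row["amount"]
    return dict(totals)

by_category = total_by("category")
print(by_category)
```

The arithmetic happens only on the fact columns; the dimension decides how the facts are grouped.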

Developing a Data Warehouse
  • Developing a data warehouse is a tedious task
  • Common sub-tasks include:
      • Extracting data from source systems
      • Defining facts and dimensions
      • Transforming data, leading to the creation of facts and dimensions (the most cumbersome and costly step)
      • Relating fact and dimension tables
      • Loading data into the data warehouse
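
The sub-tasks above can be sketched end to end with Python's built-in `sqlite3` module. Everything specific here is an assumption for illustration: the two source systems, their column names, the `transform` helper, and the `dim_product` and `fact_sales` tables.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems
# that name the same columns differently.
store_rows = [{"prod": "Shoe", "qty": 2, "price": 50.0}]
web_rows = [
    {"ProductName": "shoe ", "Quantity": 1, "UnitPrice": 55.0},
    {"ProductName": "Ball", "Quantity": 3, "UnitPrice": 10.0},
]

def transform(row, mapping):
    """Rename columns to one nomenclature, clean values, and derive the amount."""
    out = {target: row[source] for source, target in mapping.items()}
    out["product"] = out["product"].strip().title()
    out["amount"] = out["quantity"] * out["unit_price"]
    return out

store_map = {"prod": "product", "qty": "quantity", "price": "unit_price"}
web_map = {"ProductName": "product", "Quantity": "quantity", "UnitPrice": "unit_price"}
uniform = [transform(r, store_map) for r in store_rows]
uniform += [transform(r, web_map) for r in web_rows]

# Define and relate: one dimension table and one fact table that references it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        quantity INTEGER,
        amount REAL
    );
""")

# Load: insert each product once into the dimension, then the fact row.
for row in uniform:
    conn.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (row["product"],))
    (product_id,) = conn.execute(
        "SELECT product_id FROM dim_product WHERE name = ?", (row["product"],)
    ).fetchone()
    conn.execute(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        (product_id, row["quantity"], row["amount"]),
    )

totals = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.name ORDER BY p.name
""").fetchall()
print(totals)
```

Even in this toy case most of the code sits in the transform step, which is consistent with the slide's remark that transformation is the most cumbersome part.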

Developing a Data Warehouse
  • Depending on how facts and dimensions are created and related, a data warehouse takes on a different look and feel

  Criteria                               Nomenclature
  One table per dimension                Star schema
  More than one table per dimension      Snowflake schema

Star Schema
  • Fact table: Sales
  • Dimension tables: Customer, Supplier, Product, Geography, Time

Snowflake Schema
  • Fact table: FactSales
  • Dimension tables: DimProd, DimProdCat, DimBrand

How to Decide on DW Structure?
  • The choice is performance versus robustness
  • Star schema involves one table per dimension
  • Snowflake schema involves multiple tables per dimension
  • Example: trend in sales of Nike products over the last year
      • Fact: Sales
      • Dimensions in the case of a star schema: DimProd, DimTime
      • Dimensions in the case of a snowflake schema: DimProd, DimBrand, DimProdCat
      • DimProd would be split into DimProdCat and DimBrand

How to Decide on DW Structure?
  • Star schema gives better performance because reports are built from fewer tables, so fewer joins are needed
  • Snowflake schema provides better data management because the data is normalized
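
The trade-off can be made concrete by running the Nike sales question against both layouts. The sketch below uses Python's built-in `sqlite3` module; the table names, columns, and sample rows are assumptions, and a `month` column on the fact table stands in for a full time dimension to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: a single denormalized table for the product dimension.
    CREATE TABLE star_dim_prod (prod_id INTEGER PRIMARY KEY, name TEXT, brand TEXT, category TEXT);

    -- Snowflake: the product dimension split into normalized tables.
    CREATE TABLE dim_brand (brand_id INTEGER PRIMARY KEY, brand TEXT);
    CREATE TABLE dim_prod_cat (cat_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE snow_dim_prod (prod_id INTEGER PRIMARY KEY, name TEXT, brand_id INTEGER, cat_id INTEGER);

    -- Fact table shared by both layouts (hypothetical data).
    CREATE TABLE fact_sales (prod_id INTEGER, month TEXT, amount REAL);

    INSERT INTO star_dim_prod VALUES (1, 'Runner', 'Nike', 'Shoes'), (2, 'Trainer', 'Other', 'Shoes');
    INSERT INTO dim_brand VALUES (1, 'Nike'), (2, 'Other');
    INSERT INTO dim_prod_cat VALUES (1, 'Shoes');
    INSERT INTO snow_dim_prod VALUES (1, 'Runner', 1, 1), (2, 'Trainer', 2, 1);
    INSERT INTO fact_sales VALUES
        (1, '2023-01', 100), (1, '2023-02', 150), (2, '2023-01', 70), (1, '2023-01', 20);
""")

# Star schema: one join from the fact table to the product dimension.
star_trend = conn.execute("""
    SELECT f.month, SUM(f.amount)
    FROM fact_sales f
    JOIN star_dim_prod p ON p.prod_id = f.prod_id
    WHERE p.brand = 'Nike'
    GROUP BY f.month ORDER BY f.month
""").fetchall()

# Snowflake schema: an extra join is needed to reach the brand name.
snowflake_trend = conn.execute("""
    SELECT f.month, SUM(f.amount)
    FROM fact_sales f
    JOIN snow_dim_prod p ON p.prod_id = f.prod_id
    JOIN dim_brand b ON b.brand_id = p.brand_id
    WHERE b.brand = 'Nike'
    GROUP BY f.month ORDER BY f.month
""").fetchall()

print(star_trend)
print(snowflake_trend)
```

Both queries return the same trend. The star version touches fewer tables, while the snowflake version stores each brand name once in `dim_brand`, so renaming a brand is a single-row update.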

OLAP Operations
  • On a cube, the following types of operations can be performed:
      • Roll up: aggregate data for given conditions
      • Drill down: dig deeper into a cube for given conditions
      • Slice and dice: cut the data for a given condition
      • Pivot: orient data along a certain attribute
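
The four operations can be sketched on a tiny in-memory cube. This is a simplified illustration in plain Python, not how an OLAP engine is implemented; the cell data and the `aggregate` and `select` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical cube cells: (year, quarter, region, product, sales).
DIMS = ("year", "quarter", "region", "product")
cells = [
    ("2023", "Q1", "North", "Shoes", 100),
    ("2023", "Q1", "South", "Shoes", 80),
    ("2023", "Q2", "North", "Shoes", 120),
    ("2023", "Q2", "North", "Bags", 40),
    ("2023", "Q2", "South", "Bags", 60),
]

def aggregate(rows, by):
    """Sum the sales measure, grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def select(rows, **conditions):
    """Keep only the cells whose dimensions match every given condition."""
    return [r for r in rows if all(r[DIMS.index(d)] == v for d, v in conditions.items())]

# Roll up: aggregate to a coarser level (year only).
roll_up = aggregate(cells, by=("year",))

# Drill down: add a finer level (quarter within year).
drill_down = aggregate(cells, by=("year", "quarter"))

# Slice: fix one dimension. Dice: fix several dimensions at once.
slice_north = select(cells, region="North")
dice_north_shoes = select(cells, region="North", product="Shoes")

# Pivot: orient the totals with regions as rows and products as columns.
pivot = {}
for (region, product), total in aggregate(cells, by=("region", "product")).items():
    pivot.setdefault(region, {})[product] = total

print(roll_up, drill_down, pivot, sep="\n")
```

Roll up and drill down are the same aggregation run at different levels of detail, which is why a single `aggregate` helper covers both.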

Additional Concepts
  • Metadata of the data warehouse
      • Useful in describing the structure of different components of the data warehouse (e.g. cubes, data marts, facts, dimensions)
      • Usually contains the following:
          • Description of the structure of the data warehouse
          • Operational metadata (When was the last data migration done? What is the performance of the warehouse systems?)
          • Summary generation (How was the data aggregated and summarized? What reports are generated?)
          • Mapping (What were the source systems? What transformations were executed on the data? How is access being managed?)
          • Business terminology (What are the important business terms? How are they defined?)

Additional Concepts
  • Data marts
      • Subsets of the data warehouse
      • For users belonging to a specific domain (marketing data mart, finance data mart, etc.)
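
One simple way to picture a data mart is as a view that exposes only one domain's slice of the warehouse. The sketch below assumes a hypothetical `fact_spend` table and a `marketing_mart` view in SQLite; real data marts are often separate stores rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table covering several departments.
    CREATE TABLE fact_spend (department TEXT, item TEXT, amount REAL);
    INSERT INTO fact_spend VALUES
        ('marketing', 'ad campaign', 500),
        ('finance', 'audit', 300),
        ('marketing', 'survey', 120);

    -- The marketing data mart: only the rows relevant to marketing users.
    CREATE VIEW marketing_mart AS
        SELECT item, amount FROM fact_spend WHERE department = 'marketing';
""")

marketing_rows = conn.execute(
    "SELECT item, amount FROM marketing_mart ORDER BY item"
).fetchall()
print(marketing_rows)
```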
 
