“Business Intelligence (BI) is the process of transforming data into
information, information into knowledge and, through iterative discoveries,
turning knowledge into intelligence.”
— Gartner group
Objective of Business Intelligence
BI can be defined as taking ‘Decisions based on Data’. The objective of BI is to transform large volumes of data into useful information.
Figure: pyramid with the levels Data, Information, Knowledge, and Intelligence, plotted against Volume and Value.
OLTP systems handle day-to-day transactions and operations of the business. They are
high performance, high throughput systems. They run mission critical applications.
OLTP systems store, update and retrieve Operational Data.
Operational Data is the data that runs the business.
Some of the Operational systems that we interact with are Net Banking system, Tax Accounting
system, Payroll package, Order-processing system, SAP, Airline reservation system etc.
Why are OLTP systems not suitable for analysis?
OLTP | Analytical Reporting
Supports day-to-day operations | Historical information to analyze
Data stored at transaction level | Data required at summary level
Islands of operational systems | Data needs to be integrated
Database design: Normalized | Database design: Dimensional
OLTP Versus Data Warehouse
Property | OLTP | Data Warehouse
Response Time | Sub-seconds to seconds | Seconds to hours
Operations | DML (data goes in) | Primarily read-only (data goes out)
Age of Data | Current; 30-60 days or 1-2 years | Historical; snapshots over time (quarter, month, etc.)
Data Organization | Application | Subject, time
Size | Small to large; few MB to GB | Large to very large; few GB to TB
OLTP Versus Data Warehouse
Property | OLTP | Data Warehouse
Data Sources | Operational, internal | Operational, internal, external
Activities | Processes | Analysis
Grain | Atomic (detail), transactional level, highest granularity | Atomic and/or summarized (aggregate), less granularity
No. of records | One record at a time | Thousands to millions of records
Database Design | Normalized | De-normalized, star schema
A data mart is designed for a single line of business (LOB) or functional area
such as sales, finance, or marketing.
Data Warehouses Versus Data Marts
Property | Data Warehouse | Data Mart
Scope | Enterprise | Department
Subjects | Multiple subjects | Single subject, LOB
Data Source | Many | Few
Implementation time | Months to years | Months
Size | 100 GB to > 1 TB | < 100 GB
Initial effort, cost, risk | Higher | Lower
Next level of migration | Data Mart | Data Warehouse
Approach | Top-down | Bottom-up
The hybrid approach tries to blend the best of both the
“top-down” and “bottom-up” approaches.
Starts by designing DW and DM models synchronously
Build out the first 2-3 DMs that are mutually exclusive and critical
Backfill a DW behind the DMs
Build the enterprise model and move atomic data to the DW
Federated Approach
This approach is referred to as “an architecture of architectures”.
It emphasizes the need to integrate new and existing heterogeneous BI environments.
Data Warehouse Components
Figure: the four stages are Source Systems, Staging Area, Presentation Area, and Access Tools. Elements shown: Operational, External, and Legacy sources; ODS; Data Warehouse; Data Marts; Metadata Repository.
“Effective data extract, transform and load (ETL) processes represent the number one success factor for your data warehouse project and can absorb up to 70 percent of the time spent on a typical data warehousing project.”
Remote Staging Model
Figure: two variants of the Extract, Transform, Load flow from the operational system to the warehouse. In one, the data staging area sits within the warehouse environment; in the other, the data staging area sits in its own independent environment.
CUSNUM | NAME | ADDRESS
90233479 | Oracle Limited | 100 N.E. 1st St.
90233489 | Oracle Computing | 15 Main Road, Ft. Lauderdale
90234889 | Oracle Corp. UK | 15 Main Road, Ft. Lauderdale, FLA
90345672 | Oracle Corp UK Ltd | 181 North Street, Key West, FLA
Single field: Mr. J. Smith,100 Main St., Bigtown, County Luth, 23565
Name: Mr. J. Smith
Street: 100 Main St.
Town: Bigtown
Country: County Luth
Code: 23565

Database 1
NAME | LOCATION
DIANNE ZIEFELD | N100
HARRY H. ENFIELD | M300

Database 2
NAME | LOCATION
ZIEFELD, DIANNE | 100
ENFIELD, HARRY H | 300
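The two transformations shown in these examples (splitting one field into several, and reconciling name formats across databases) can be sketched roughly as follows. The function names and the matching rule (drop periods, reorder "LAST, FIRST") are illustrative assumptions, not something the slides specify.

```python
def split_address_field(raw: str) -> dict:
    """Split one comma-separated field into the named fields from the slide."""
    labels = ["Name", "Street", "Town", "Country", "Code"]
    parts = [part.strip() for part in raw.split(",")]
    if len(parts) != len(labels):
        raise ValueError(f"expected {len(labels)} parts, got {len(parts)}")
    return dict(zip(labels, parts))


def name_key(name: str) -> str:
    """Build a comparison key so 'ENFIELD, HARRY H' matches 'HARRY H. ENFIELD'.

    Assumed rule: reorder 'LAST, FIRST' to 'FIRST LAST', drop periods,
    upper-case, and collapse repeated whitespace.
    """
    if "," in name:
        last, first = (piece.strip() for piece in name.split(",", 1))
        name = f"{first} {last}"
    return " ".join(name.replace(".", " ").upper().split())


record = split_address_field("Mr. J. Smith,100 Main St., Bigtown, County Luth, 23565")
print(record["Street"])                       # 100 Main St.
print(name_key("ENFIELD, HARRY H") == name_key("HARRY H. ENFIELD"))  # True
```

Real address data rarely splits this cleanly on commas, so a production transform would need validation and exception handling beyond this sketch.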
Design improves performance by reducing table joins.
The model is easy for users to understand.
Supports multidimensional analysis.
Provides an extensible design.
Primary keys represent a dimension.
Non-foreign key columns are values.
Facts are usually highly normalized.
Dimensions are completely de-normalized.
End users can express complex queries.
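As a rough illustration of these characteristics, the sketch below builds a tiny star schema in an in-memory SQLite database: one fact table whose foreign keys point at two de-normalized dimension tables, plus a query that joins the fact table directly to each dimension. All table names, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- De-normalized dimensions: region is stored on the store row itself.
CREATE TABLE store_dim (store_id INTEGER PRIMARY KEY, store_desc TEXT, region_desc TEXT);
CREATE TABLE month_dim (month_id INTEGER PRIMARY KEY, month_name TEXT);
-- Fact table: foreign keys to each dimension plus a numeric value.
CREATE TABLE sales_fact (
    store_id INTEGER REFERENCES store_dim(store_id),
    month_id INTEGER REFERENCES month_dim(month_id),
    amount   REAL
);
""")
conn.executemany("INSERT INTO store_dim VALUES (?, ?, ?)",
                 [(1, "Store 1", "Region 1"), (2, "Store 2", "Region 1"), (3, "Store 3", "Region 2")])
conn.executemany("INSERT INTO month_dim VALUES (?, ?)", [(5, "May"), (6, "June")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)",
                 [(1, 5, 100.0), (2, 5, 150.0), (3, 5, 200.0), (1, 6, 50.0)])

# One join per dimension is enough to analyze sales by region and month.
rows = conn.execute("""
    SELECT s.region_desc, m.month_name, SUM(f.amount)
    FROM sales_fact f
    JOIN store_dim s ON s.store_id = f.store_id
    JOIN month_dim m ON m.month_id = f.month_id
    GROUP BY s.region_desc, m.month_id, m.month_name
    ORDER BY s.region_desc, m.month_id
""").fetchall()
for row in rows:
    print(row)
```

Because region is carried on the store row, no extra join to a separate region table is needed, which is the join reduction the design aims for.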
Base and Derived Data
Payroll table (Salary and Comm are base data; Comp is derived data)
Emp_FK | Month_FK | Salary | Comm | Comp
101 | 05 | 1,000 | 0 | 1,000
102 | 05 | 1,500 | 100 | 1,600
103 | 05 | 1,000 | 200 | 1,200
104 | 05 | 1,500 | 1,000 | 2,500
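In the payroll rows above, Comp equals Salary plus Comm in every row, so the derived column can be computed from the base columns at load time. A minimal sketch, using the slide's values and a hypothetical helper name:

```python
# Base data as it appears in the payroll table.
payroll_base = [
    {"Emp_FK": 101, "Month_FK": "05", "Salary": 1000, "Comm": 0},
    {"Emp_FK": 102, "Month_FK": "05", "Salary": 1500, "Comm": 100},
    {"Emp_FK": 103, "Month_FK": "05", "Salary": 1000, "Comm": 200},
    {"Emp_FK": 104, "Month_FK": "05", "Salary": 1500, "Comm": 1000},
]


def add_derived_comp(rows):
    """Return new rows with the derived column Comp = Salary + Comm."""
    return [{**row, "Comp": row["Salary"] + row["Comm"]} for row in rows]


payroll = add_derived_comp(payroll_base)
for row in payroll:
    print(row["Emp_FK"], row["Comp"])
```

Storing the derived value trades some load-time work and storage for simpler, faster queries.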
Translating Business Measures into a Fact Table
Business measure | Fact | Type
Number of Items | Number of Items | Base
Amount | Item Amount | Base
Cost | Item Cost | Base
Profit | Profit | Derived
Create groups of values for attributes with many unique values, such as income ranges and age brackets.
Minimize the need for full table scans by pre-aggregating data.
Bracketing Dimensions
Figure: Customer dimension, Bracket dimension, and Income fact, related through the keys Customer_PK, Bracket_PK, and Bracket_FK.
Bracket_PK | Income (10Ks) | Marital Status | Gender | Age
1 | 60-90 | Single | Male | <21
2 | 60-90 | Single | Male | 21-35
3 | 60-90 | Single | Male | 35-55
4 | 60-90 | Single | Male | >55
5 | 60-90 | Single | Female | <21
6 | 60-90 | Single | Female | 21-35
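Assigning a customer to a bracket row could look roughly like the sketch below. The age labels and the six bracket rows come from the slide; the exact boundary handling (integer ages, with 35 falling in "21-35" and 55 in "35-55") is an assumption, since the slide's ranges overlap at their edges.

```python
from bisect import bisect_left

# Assumed inclusive upper bounds for the first three age brackets.
AGE_UPPER_BOUNDS = [20, 35, 55]
AGE_LABELS = ["<21", "21-35", "35-55", ">55"]

# The six bracket rows listed on the slide, keyed by their attribute values.
BRACKET_DIM = {
    ("60-90", "Single", "Male", "<21"): 1,
    ("60-90", "Single", "Male", "21-35"): 2,
    ("60-90", "Single", "Male", "35-55"): 3,
    ("60-90", "Single", "Male", ">55"): 4,
    ("60-90", "Single", "Female", "<21"): 5,
    ("60-90", "Single", "Female", "21-35"): 6,
}


def age_bracket(age: int) -> str:
    """Map an integer age to its bracket label."""
    return AGE_LABELS[bisect_left(AGE_UPPER_BOUNDS, age)]


def bracket_key(income_band: str, marital: str, gender: str, age: int) -> int:
    """Look up the Bracket_PK for a customer's attribute combination."""
    return BRACKET_DIM[(income_band, marital, gender, age_bracket(age))]


print(age_bracket(20), age_bracket(21), age_bracket(56))
print(bracket_key("60-90", "Single", "Female", 30))
```

Queries can then group on the small bracket dimension instead of scanning every distinct age or income value.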
Identifying Analytical Hierarchies
Business hierarchies describe organizational structure and logical parent-child relationships within the data.
Store dimension: Store ID, Store Desc, Location, Size, Type, District ID, District Desc, Region ID, Region Desc
Organization hierarchy: Region > District > Store
Multiple Hierarchies
Store dimension: Store ID, Store Desc, Location, Size, Type, District ID, District Desc, Region ID, Region Desc, City ID, City Desc, County ID, County Desc, State ID, State Desc
Organization hierarchy: Region > District > Store
Geography hierarchy: Region > District > Store
Multiple Time Hierarchies
Fiscal time hierarchy: Fiscal year > Fiscal quarter > Fiscal month > Fiscal week
Calendar time hierarchy: Calendar year > Calendar quarter > Calendar month > Calendar week
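A single date can be placed in both time hierarchies at once. The sketch below derives calendar and fiscal levels from a date; the April fiscal-year start, the convention of naming a fiscal year by the calendar year it starts in, and the use of ISO week numbers are all assumptions for illustration. Fiscal week is left out because its definition varies by organization.

```python
from datetime import date


def calendar_levels(d: date) -> dict:
    """Calendar year, quarter, month, and ISO week for a date."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "week": d.isocalendar()[1],
    }


def fiscal_levels(d: date, fiscal_start_month: int = 4) -> dict:
    """Fiscal year, quarter, and month, assuming the fiscal year starts
    on the first day of fiscal_start_month."""
    months_into_year = (d.month - fiscal_start_month) % 12
    fiscal_year = d.year if d.month >= fiscal_start_month else d.year - 1
    return {
        "fiscal_year": fiscal_year,
        "fiscal_quarter": months_into_year // 3 + 1,
        "fiscal_month": months_into_year + 1,
    }


day = date(2024, 2, 10)
print(calendar_levels(day))
print(fiscal_levels(day))
```

In a warehouse these levels are typically precomputed once per day and stored as columns of the time dimension, so both hierarchies can be queried without date arithmetic.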
Drilling Up and Drilling Down
Figure: Group Market hierarchy tree with Regions 1-2, Districts 1-4, and Stores 1-6.
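Drilling up aggregates values to the parent level of a hierarchy; drilling down expands one parent into its children. The sketch below shows both on a Region > District > Store hierarchy. The parent assignments and sales figures are hypothetical, since the slide's tree structure is not recoverable from the text.

```python
from collections import defaultdict

# Hypothetical parent-child assignments and store-level values.
STORE_TO_DISTRICT = {
    "Store 1": "District 1", "Store 2": "District 1", "Store 3": "District 2",
    "Store 4": "District 3", "Store 5": "District 4", "Store 6": "District 4",
}
DISTRICT_TO_REGION = {
    "District 1": "Region 1", "District 2": "Region 1",
    "District 3": "Region 2", "District 4": "Region 2",
}
store_sales = {"Store 1": 10, "Store 2": 20, "Store 3": 30,
               "Store 4": 40, "Store 5": 50, "Store 6": 60}


def drill_up(values: dict, parent_of: dict) -> dict:
    """Aggregate child-level values to the parent level."""
    totals = defaultdict(int)
    for child, value in values.items():
        totals[parent_of[child]] += value
    return dict(totals)


def drill_down(parent: str, values: dict, parent_of: dict) -> dict:
    """Show the child-level values that make up one parent."""
    return {child: value for child, value in values.items()
            if parent_of[child] == parent}


district_sales = drill_up(store_sales, STORE_TO_DISTRICT)
region_sales = drill_up(district_sales, DISTRICT_TO_REGION)
print(region_sales)
print(drill_down("District 4", store_sales, STORE_TO_DISTRICT))
```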
Drilling Across
Figure: the Group Market hierarchy (Region, District, Store) shown alongside the City hierarchy (City, Store), with the selection “Stores > 20,000 sq. ft.”