Benchmark: Beyond Aurora. Scale-out SQL databases for AWS. (Clustrix)
The document discusses options for scaling relational database management systems (RDBMS). It describes scale-up vs scale-out approaches, and compares solutions like master-slave replication, sharding, and using scale-out databases. It provides details on ClustrixDB's scale-out architecture with shared-nothing storage and automatic data distribution. Benchmark results show ClustrixDB outperforming Aurora for throughput and latency on OLTP workloads as nodes are added.
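To make the comparison concrete, here is a minimal, hedged Python sketch of the manual hash-based sharding that the summary contrasts with automatic data distribution. The shard names and keys are hypothetical, not taken from ClustrixDB or Aurora.

```python
# Minimal sketch of manual hash-based sharding (the approach the summary
# contrasts with automatic data distribution). Shard names are hypothetical.
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]

def shard_for(key: str) -> str:
    """Route a row to a shard by hashing its distribution key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

for user_id in ("alice", "bob", "carol"):
    print(user_id, "->", shard_for(user_id))
```

Note that adding a shard to this scheme remaps most keys, which is one reason the scale-out databases discussed here distribute and rebalance data automatically instead.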
Benchmark Showdown: Which Relational Database is the Fastest on AWS? (Clustrix)
Do you have a high-value, high throughput application running on AWS? Are you moving part or all of your infrastructure to AWS? Do you have a high-transaction workload that is only expected to grow as your company grows? Choosing the right database for your move to AWS can make you a hero or a goat. Be a hero!
Databases are the mission-critical lifeline of most businesses. For years MySQL has been the easy choice -- but the popularity of the cloud and new products like Aurora, RDS MySQL and ClustrixDB have given customers choices and options that can help them work smarter and more efficiently.
Enterprise Strategy Group (ESG) presents their findings from a recent performance benchmark test configured for high-transaction, low-latency workloads running on AWS.
In this webinar, you will learn:
How high-transaction, high-value database workloads perform when run on three popular database solutions running on AWS.
How key metrics like transactions per second (tps) and database response time (latency) can affect performance and customer satisfaction.
How the ability to scale both database reads and writes is the key to unlocking performance on AWS.
Scylla Summit 2022: Migrating SQL Schemas for ScyllaDB: Data Modeling Best Pr... (ScyllaDB)
To maximize the benefits of ScyllaDB, you must adapt the structure of your data. Data modeling for ScyllaDB should be query-driven based on your access patterns – a very different approach than normalization for SQL tables. In this session, you will learn how tools can help you migrate your existing SQL structures to accelerate your digital transformation and application modernization.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
Beyond Aurora. Scale-out SQL databases for AWS (Clustrix)
As enterprises move to AWS, they have great choices for MySQL compatible databases. Knowing the best database for the specific job can save you time and money. In this webinar, Lokesh Khosla will discuss high-performance databases for AWS and share his findings based on a benchmark test that simulates the workload of a high-transaction AWS-based solution.
If you work with high-transaction workloads and you need a relational database to keep track of economically valuable items like revenue, inventory and monetary transactions, you'll be interested in this discussion about the strengths and weaknesses of Aurora and other MySQL solutions for AWS.
Database Architecture & Scaling Strategies, in the Cloud & on the Rack (Clustrix)
Watch the recording here: https://www.youtube.com/watch?v=ZwERp38ynxQ&feature=youtu.be
In this webinar, Robbie Mihalyi, VP of Engineering at Clustrix, explores how to set up a SQL RDBMS architecture that scales out and is both elastic and consistent, while simultaneously delivering fault tolerance and ACID compliance.
He also covers how data gets distributed in this architecture, how the query processor works, how rebalancing happens and other architectural elements. Examples cited include cloud deployments and e-commerce use-cases.
In this webinar, you will learn:
1. Five RDBMS scaling strategies, along with their trade-offs
2. The importance of having no single point of failure for OLTP (fault tolerance)
3. The vagaries of the cloud and how they impact running an RDBMS there
Who should watch?
1. People interested in high performance, real-time database solutions
2. Companies who have MySQL in their infrastructure and are concerned that their growth will soon overwhelm MySQL’s single-box design
3. DBAs who implement ‘read slaves’, ‘multiple masters’ and ‘sharding’ for MySQL databases and want to learn about better ways to scale
Achieve new levels of performance for Magento e-commerce sites. (Clustrix)
If you run a Magento store that is impacted negatively by catalog updates or indexing/reindexing, listen in. Avoid catalog updates impacting your checkouts, site downtime and long page view load time with the ClustrixDB for Magento Bundle. Created exclusively for high-volume/complex-catalog retailers, this replacement backend is a proven upgrade for Magento sites.
Let us show you how this all works. Recently, at this year’s Magento Imagine, we ran a LIVE demo of the ClustrixDB for Magento Bundle. Stats from the demo:
System ran for 50 hours -- reindexing up to every 12 minutes.
2.6 million orders processed (14.6 per second)
147 million page views (816 per second)
Average response time of 267 ms
0% error rate, 100% checkout uptime
In this webinar, we’ll also cover:
How you can enable catalog updates to process in the background without affecting the normal operations of your site with the Clustrix Shadow (re)Indexer
A Magento-approved alternative database to MySQL that scales performance up/down as you add/subtract commodity nodes to the cluster. ClustrixDB has no read slaves, replication lag or sharding, and flexes up and down to deliver exactly the right amount of performance and cost every month of the year.
Customer Education Webcast: New Features in Data Integration and Streaming CDC (Precisely)
View our quarterly customer education webcast to learn about the new advancements in Syncsort DMX and DMX-h data integration software and DataFunnel - our new easy-to-use browser-based database onboarding application. Learn about DMX Change Data Capture and the advantages of true streaming over micro-batch.
View this webcast on-demand where you'll hear the latest news on:
• Improvements in Syncsort DMX and DMX-h
• What’s next in the new DataFunnel interface
• Streaming data in DMX Change Data Capture
• Hadoop 3 support in Syncsort Integrate products
Scalability truths and serverless architectures (Regunath B)
This document discusses scalability challenges with stateful, data-driven systems and event-driven architectures. It introduces concepts like scalability truths, serverless architectures, and maintaining state in event-driven systems. The document then discusses the Flux project, which is an open source, asynchronous, distributed, and reliable state machine-based orchestrator that can be used to build stateful event-driven applications and workflows in a serverless environment.
The document provides an introduction and overview of NoSQL databases. It discusses:
- How NoSQL databases are non-relational and differ from traditional relational databases by not requiring fixed schemas and supporting horizontal scaling.
- Examples of different types of NoSQL databases like document stores, key-value stores, and graph databases.
- The CAP theorem and eventual consistency of NoSQL databases, which allow high availability and partitioning at the cost of strong consistency.
- How NoSQL databases are used by large companies to store rapidly growing unstructured and unpredictable data more efficiently than relational databases.
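To illustrate the eventual-consistency trade-off mentioned above, here is a toy Python sketch in which two replicas accept writes independently and later converge by last-write-wins on a timestamp. This is a teaching sketch under simplified assumptions, not how any particular NoSQL store resolves conflicts.

```python
# Toy eventual consistency: replicas diverge, then converge via
# last-write-wins (LWW) during an anti-entropy exchange.
from dataclasses import dataclass, field

@dataclass
class Replica:
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, ts):
        self.store[key] = (ts, value)

    def merge(self, other: "Replica"):
        """Converge with another replica: keep the newer write per key."""
        for key, (ts, value) in other.store.items():
            if key not in self.store or ts > self.store[key][0]:
                self.store[key] = (ts, value)

a, b = Replica(), Replica()
a.write("cart:42", ["book"], ts=1)         # write accepted on replica A
b.write("cart:42", ["book", "pen"], ts=2)  # concurrent write on replica B
a.merge(b); b.merge(a)                     # anti-entropy exchange
assert a.store == b.store                  # replicas have converged
```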
This document provides an overview of Azure SQL Data Warehouse (SQL DWH), a cloud data warehouse service. It discusses SQL DWH's massively parallel processing (MPP) architecture that allows independent scaling of compute and storage. The document demonstrates how to create a SQL DWH, load data using PolyBase, and use common tools. It is intended to help users understand what SQL DWH is, how it works, and common scenarios it can be used for, such as processing large volumes of data without needing to purchase and manage hardware.
VoltDB is an in-memory database designed for high throughput transactional workloads. It partitions data across multiple servers and executes transactions in single threads to avoid locking and improve performance. VoltDB uses stored procedures and an asynchronous client model. It is optimized for high throughput over latency and supports SQL, full ACID compliance, and automatic recovery through snapshotting.
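The single-threaded-per-partition idea described above can be sketched in a few lines: each partition's transactions run serially on one worker, so no row locks are needed within a partition. The sketch below is purely illustrative and does not use VoltDB's actual API.

```python
# Per-partition serial execution: one worker thread per partition,
# transactions routed by hashing the account key. Illustrative only.
import queue, threading

NUM_PARTITIONS = 2
queues = [queue.Queue() for _ in range(NUM_PARTITIONS)]
balances = [{} for _ in range(NUM_PARTITIONS)]  # one state dict per partition

def worker(pid: int):
    while True:
        txn = queues[pid].get()
        if txn is None:          # sentinel: shut down
            break
        account, delta = txn
        # Serial execution within the partition: no locking required.
        balances[pid][account] = balances[pid].get(account, 0) + delta

threads = [threading.Thread(target=worker, args=(p,)) for p in range(NUM_PARTITIONS)]
for t in threads: t.start()

def submit(account: str, delta: int):
    queues[hash(account) % NUM_PARTITIONS].put((account, delta))

submit("acct-1", 100); submit("acct-1", -30); submit("acct-2", 50)
for q in queues: q.put(None)
for t in threads: t.join()
print(balances)
```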
Cassandra is a distributed database designed to handle large amounts of structured data across commodity servers. It provides linear scalability, fault tolerance, and high availability. Cassandra's architecture is masterless with all nodes equal, allowing it to scale out easily. Data is replicated across multiple nodes according to the replication strategy and factor for redundancy. Cassandra supports flexible and dynamic data modeling and tunable consistency levels. It is commonly used for applications requiring high throughput and availability, such as social media, IoT, and retail.
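The tunable consistency mentioned above reduces to one well-known inequality: with replication factor N, reads that touch R replicas and writes that touch W replicas are guaranteed to overlap (and thus see the latest write) whenever R + W > N. A minimal check, for illustration only:

```python
# Quorum overlap rule for tunable consistency: R + W > N.
def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    return r + w > n

print(is_strongly_consistent(n=3, r=2, w=2))  # True: QUORUM reads + QUORUM writes
print(is_strongly_consistent(n=3, r=1, w=1))  # False: ONE/ONE is eventually consistent
```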
The way we store and manage data is changing. In the old days, there were only a handful of file formats and databases. Now there are countless databases and numerous file formats. The methods by which we access the data have also increased in number. As R users, we often access and analyze data in highly inefficient ways. Big Data tech has solved some of those problems.
This presentation will take attendees on a quick tour of the various relevant Big Data technologies. I’ll explain how these technologies fit together to form a stack for various data analysis use cases. We’ll talk about what these technologies mean for the future of analyzing data with R.
Even if you work with “small data” this presentation will still be of interest because some Big Data tech has a small data use case.
In the past few years, the term "data lake" has leaked into our lexicon. But what exactly IS a data lake? Some IT managers confuse data lakes with data warehouses. Some people think data lakes replace data warehouses. Both of these conclusions are false. There is room in your data architecture for both data lakes and data warehouses. They have different use cases, and those use cases can be complementary.
Todd Reichmuth, Solutions Engineer with Snowflake Computing, has spent the past 18 years in the world of Data Warehousing and Big Data. He spent that time at Netezza and then later at IBM Data, before making the jump to the cloud at Snowflake Computing earlier in 2018.
Mike Myer, Sales Director with Snowflake Computing, has spent the past 6 years in the world of Security and is looking to drive awareness of the better Data Warehousing and Big Data solutions available. He was previously at local tech companies FireMon and Lockpath and decided to join Snowflake due to the disruptive technology that's truly helping folks in the Big Data world on a day-to-day basis.
The document provides an overview of the Google Cloud Platform (GCP) Data Engineer certification exam, including the content breakdown and question format. It then details several big data technologies in the GCP ecosystem such as Apache Pig, Hive, Spark, and Beam. Finally, it covers various GCP storage options including Cloud Storage, Cloud SQL, Datastore, BigTable, and BigQuery, outlining their key features, performance characteristics, data models, and use cases.
This document provides an overview of Apache Cassandra, including:
- Cassandra is an open source distributed database designed to handle large amounts of data across commodity servers.
- It was originally created at Facebook and is influenced by Amazon Dynamo and Google Bigtable.
- Cassandra uses a peer-to-peer distributed architecture with no single point of failure and supports replication across multiple data centers.
- It uses a column-oriented data model with tunable consistency levels and supports the Cassandra Query Language (CQL) which is similar to SQL.
- Major companies that use Cassandra include Facebook, Netflix, Twitter, IBM and more for its scalability, availability and flexibility.
CCV: migrating our payment processing system to MariaDB (MariaDB plc)
CCV is a Dutch payment processor and loyalty provider. CCV's current payment processing platform is built on top of Microsoft SQL Server, but they are currently in the process of migrating it to MariaDB. This migration project is in progress and first production transactions are expected to run in 2020. In this session, Ernst Wernicke and Harry Dijkstra of CCV share how they are using MariaDB to meet critical high availability requirements, including geographic replication, zero data-loss, zero downtime (both planned and unplanned) and no single point of failure anywhere.
ClustrixDB: how distributed databases scale out (MariaDB plc)
ClustrixDB, now part of MariaDB, is a fully distributed and transactional RDBMS for applications with the highest scalability requirements. In this session Robbie Mihalyi, VP of Engineering for ClustrixDB, provides an introduction to ClustrixDB, followed by an in-depth technical overview of its architecture, with a focus on distributed storage, transactions and query processing – and its unique approach to index partitioning.
Snowflake concepts & hands-on expertise to help get you started on implementing data warehouses using Snowflake, plus the necessary information and skills that will help you master Snowflake essentials.
This document presents an introduction to NoSQL databases. It begins with an overview comparing SQL and NoSQL databases, describing the architecture of NoSQL databases. Examples of different types of NoSQL databases are provided, including key-value stores, column family stores, document databases and graph databases. MapReduce programming is also introduced. Popular NoSQL databases like Cassandra, MongoDB, HBase, and CouchDB are described. The document concludes that NoSQL is well-suited for large, highly distributed data problems.
What is Change Data Capture (CDC) and Why is it Important? (FlyData Inc.)
Check out what Change Data Capture (CDC) is and why it is becoming ever more important. Slides also include useful tips on how to design your CDC implementation.
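One common (if basic) CDC design is polling a change table kept by triggers and shipping rows past a checkpoint downstream. The sketch below assumes a hypothetical audit table `orders_changes(change_id, op, row_json)`; production CDC tools typically read the database's transaction log instead, which is the streaming approach discussed elsewhere on this page.

```python
# Minimal CDC polling sketch against a hypothetical change table.
import json, sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_changes (change_id INTEGER PRIMARY KEY, op TEXT, row_json TEXT)")
conn.execute("INSERT INTO orders_changes (op, row_json) VALUES (?, ?)",
             ("INSERT", json.dumps({"id": 1})))

last_seen = 0  # checkpoint: last change_id already shipped downstream

def poll_changes():
    global last_seen
    rows = conn.execute(
        "SELECT change_id, op, row_json FROM orders_changes "
        "WHERE change_id > ? ORDER BY change_id",
        (last_seen,),
    ).fetchall()
    for change_id, op, row_json in rows:
        print(op, json.loads(row_json))  # apply/ship the change downstream
        last_seen = change_id            # advance the checkpoint

poll_changes()
```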
Data warehouses are time-variant in the sense that they maintain both historical and (nearly) current data. Operational databases, in contrast, contain only the most current, up-to-date data values, and they generally maintain this information for no more than a year. Data warehouses, by comparison, are generally loaded from the operational databases daily, weekly, or monthly, and the data is then typically maintained for a long period.
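A small sketch of what "time variant" means in practice: the warehouse accumulates a dated row per load, while the operational table holds only the current value. Table and column names here are invented for illustration.

```python
# Periodic warehouse load: operational state is overwritten, warehouse
# keeps dated history. Names are illustrative only.
import datetime as dt

operational = {"cust-1": {"tier": "gold"}}  # current state only
warehouse = []                              # full dated history

def nightly_load(as_of: dt.date):
    """Append a dated snapshot of each operational row to the warehouse."""
    for key, attrs in operational.items():
        warehouse.append({"key": key, "load_date": as_of, **attrs})

nightly_load(dt.date(2024, 1, 1))
operational["cust-1"]["tier"] = "platinum"  # OLTP update overwrites in place
nightly_load(dt.date(2024, 1, 2))
print(warehouse)  # both the old and the new tier survive in the warehouse
```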
Stretch Database allows migrating historical transactional data from an on-premises SQL Server database transparently to Microsoft Azure cloud storage. It enables seamless queries of data regardless of its location. Some limitations include the inability to enforce uniqueness on stretched tables and restrictions on allowed actions. Performance can degrade due to the additional overhead of query translation and data movement between on-premises and cloud locations. Remote data files provide an alternative method of archiving to cloud storage without changes to table structures; their only overhead is additional latency.
Delivering rapid-fire Analytics with Snowflake and Tableau (Harald Erb)
Until recently, advancements in data warehousing and analytics were largely incremental. Small innovations in database design would herald a new data warehouse every
2-3 years, which would quickly become overwhelmed with rapidly increasing data volumes. Knowledge workers struggled to access those databases with development intensive BI tools designed for reporting, rather than exploration and sharing. Both databases and BI tools were strained in locally hosted environments that were inflexible to growth or change.
Snowflake and Tableau represent a fundamentally different approach. Snowflake’s multi-cluster shared data architecture was designed for the cloud and to handle orders-of-magnitude larger data volumes at blazing speed. Tableau was made to foster an interactive approach to analytics, freeing knowledge workers to use the speed of Snowflake to their greatest advantage.
This document provides a curriculum vitae for Vincent Fiorilli, who has over 30 years of experience as a database administrator (DBA) specializing in Netezza and DB2. He has extensive experience performing tasks such as performance tuning, data modeling, database design, implementation, and maintenance. He has worked on projects involving migration between database platforms and large-scale data warehousing. His technical skills include Netezza, DB2, SQL, Linux, and several hardware and software platforms. He has held consulting roles providing DBA and architecture services to several government and private organizations in Canada.
This document summarizes the key points from a presentation on SQL Server 2016. It discusses in-memory and columnstore features, including performance gains from processing data in memory instead of on disk. New capabilities for real-time operational analytics are presented that allow analytics queries to run concurrently with OLTP workloads using the same data schema. Maintaining a columnstore index for analytics queries is suggested to improve performance.
How Alibaba Cloud scaled ApsaraDB with MariaDB MaxScale (MariaDB plc)
ApsaraDB is the leading cloud database in China, with millions of database instances running on it. However, the diversity and complexity of the mission-critical applications using it brought a huge challenge to ApsaraDB: scalability, a long-time pain point. To solve the problem, in mid-2018 and after a careful evaluation, an elegant solution was found in MariaDB MaxScale. So far, the deep synergy of MariaDB MaxScale and ApsaraDB has proved very successful, as thousands of high-demand ApsaraDB customers are benefiting from a much-improved experience. In this presentation, we are going to share the following topics:
- How ApsaraDB is using MariaDB MaxScale
- Best practices when leveraging MariaDB MaxScale with ApsaraDB
- Next steps and future plans for MariaDB MaxScale and ApsaraDB
Christian Coté is an ETL architect and developer with experience using tools like DTS/SSIS, Hummingbird Genio, Informatica, and Datastage. He has worked in domains including pharmaceuticals, finance, insurance, and manufacturing. He specializes in data warehousing and business intelligence and is a Microsoft MVP for SQL Server.
2 years ago if someone had claimed they could stand up a petabyte scale data warehouse in under an hour and then have a non-technical business user querying it live 30 minutes later without knowing any SQL or coding language, they would have been laughed out of the room. These days, that’s called taking advantage of disruptive technology. Amazon Web Services and Tableau Software have shifted the entire paradigm by which organizations not only store and access their data, but ultimately how they innovate with it. The fast, scalable, and inexpensive services that AWS provides for housing data combined with Tableau’s unbelievably flexible and user friendly visual analytic solution means that within hours an organization can securely put the power of their massive data assets into the hands of their domain experts without expensive overhead or lengthy ramp-up time. Attend this webinar to learn how Amazon Web Services and Tableau Software are leveraged together every day to:
• Empower visual ad-hoc data discovery against big data
• Revolutionize corporate reporting and dashboards
• Promote data driven decision making at every level
The presentation will include:
• A live demonstration of AWS and Tableau working together
• A real customer case study focused on fraud detection and online video metrics
• Live Q&A and an opportunity to trial both solutions
Pass chapter meeting dec 2013 - compression a hidden gem for io heavy databas... (Charley Hanania)
Compression: a hidden Gem for IO heavy Databases
The limiting factor in most database systems is the ability to read and write data to the IO subsystem.
We're still using storage layouts and methodologies in SQL Server that are a reflection of old spinning media in times gone by.
Until major changes are made to the internal storage layouts, we have "some" hope with options such as data compression, sparse columns and filtered indexes, which not only save space on disk, but also reflect a saving in memory.
In this session we will go over the IO savings technologies presented in SQL Server, and discuss how implementing some of these will assist in your operational performance goals.
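To make the IO argument concrete, here is a toy Python run-length encoding of a repetitive, low-cardinality column. This illustrates why compression helps an IO-bound database (fewer pages read from disk, fewer pages in the buffer pool); SQL Server's actual ROW/PAGE compression uses different, richer encodings.

```python
# Toy run-length encoding of a low-cardinality column, to show how
# repetitive data collapses. Not SQL Server's real compression format.
from itertools import groupby

column = ["DE"] * 500 + ["CH"] * 300 + ["AT"] * 200

def rle(values):
    return [(v, len(list(g))) for v, g in groupby(values)]

encoded = rle(column)
print("raw entries:", len(column))    # 1000
print("encoded runs:", len(encoded))  # 3 runs instead of 1000 entries
```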
Presenter: Charley Hanania, MVP
Charley is Principal Consultant at QS2 AG in Switzerland and has consulted to organisations of all sizes during his extensive career in Database and Platform Consulting.
He's been focussed on SQL Server since v4.2 on OS/2 and with over 15 years of experience in IT he's supported companies in the areas of DB training, development, architecture & administration throughout Europe, America & Australasia.
Communities are Charley's passion and he became active in database communities in the mid 90's, participating in heterogeneous database user groups in Australia. He continues to lead an active role through community events such as Database Days, the European PASS Conference, PASS & the Swiss PASS Chapter.
Webinar How to Achieve True Scalability in SaaS Applications (Techcello)
This document summarizes a webinar on achieving true scalability in SaaS applications. It discusses key factors demanding scalability like increased user concurrency. It covers best practices for scaling the web application and data tiers, such as using auto-scaling, queues, and databases like DynamoDB. It also discusses leveraging cloud services for scalability and provides examples of scaling on AWS. Speaker profiles are included for experts from AWS and Techcello discussing scalability strategies.
MySQL is an open-source relational database management system that works on many platforms. It provides multi-user access, supports many storage engines, and is backed by Oracle. SQL is the core of a relational database and is used for accessing and managing the data. The different subsets of SQL are DDL, DML, DCL, and TCL. MySQL has many features, including ease of management, robust transactional support, high performance, low total cost of ownership, and scalability.
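The four SQL subsets can be shown in one small, runnable example. Here sqlite3 stands in for MySQL; the DCL statement is shown only as a comment in MySQL syntax, since SQLite has no user accounts.

```python
# DDL, DML, TCL and (as a comment) DCL, using sqlite3 as a stand-in.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")  # DDL
conn.execute("INSERT INTO users (name) VALUES ('ada')")                 # DML
conn.commit()                                                           # TCL
# DCL example (MySQL syntax): GRANT SELECT ON app.users TO 'reader'@'%';
print(conn.execute("SELECT id, name FROM users").fetchall())            # DML
```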
The document discusses various techniques for managing performance and concurrency in SQL Server databases. It covers new features in SQL Server 2008/R2 such as read committed snapshot isolation, partition-level lock escalation, filtered indexes, and bulk loading. It also discusses tools for monitoring performance like the Utility Control Point and Performance Monitor. The document uses case studies to demonstrate how these techniques can be applied.
Maintenance plans provide a way to automate database maintenance tasks such as integrity checks, index maintenance, and backups. They can be created using the Maintenance Plan Wizard or Maintenance Plan Designer. Common tasks include checking database integrity with DBCC CHECKDB, reorganizing or rebuilding indexes, updating statistics, and performing full, differential or transaction log backups. Care must be taken to choose the right tasks and schedule to maintain performance and protect the database.
SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. This event will be held Jun 13 2015 at Hochschule Bonn-Rhein-Sieg, Grantham-Allee 20, St. Augustin, Rheinland, 53757, Germany. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
Maintenance Plans for Beginners (but not only) | Every experienced administrator has used, to some extent, what are called Maintenance Plans. During this session, I'd like to discuss how they can be useful, what functionality they provide when we use them, and what to look out for. A level 200-300 session, with an open discussion.
Achieving Cost and Resource efficiency within OpenStack through Trove Database-As-A-Service (DBaaS)
Trove is an OpenStack DBaaS that allows organizations to leverage their OpenStack infrastructure in a cost-effective way to deploy solutions built upon traditional databases. Trove provides a unified solution for all database types and can provide cost and resource savings through reduced complexity. It allows rapid provisioning of database instances, standardized infrastructure, and self-service capabilities for database management. Trove is integrated with OpenStack and supports both relational and non-relational databases to provide a flexible database solution.
The document is a resume for Mostafa El-Masry, who is seeking a career as a senior database administrator or database analyst. It outlines his extensive experience over 15 years in database administration, including roles at the Ministry of Social Affairs and Ministry of Higher Education in Saudi Arabia. It also lists his technical skills and qualifications, such as being a Microsoft Certified IT Professional in SQL Server 2008 Database Administration.
Best Practices for Supercharging Cloud Analytics on Amazon Redshift (SnapLogic)
In this webinar, we discuss how the secret sauce of your business analytics strategy remains rooted in your approach, methodologies and the amount of data incorporated into this critical exercise. We also address best practices to supercharge your cloud analytics initiatives, and tips and tricks on designing the right information architecture, data models and other tactical optimizations.
To learn more, visit: http://www.snaplogic.com/redshift-trial
Data warehouse 2.0 and SQL Server architecture and vision (Klaudiia Jacome)
The document discusses the evolution of data warehousing architectures from DW 1.0 to DW 2.0. It summarizes how SQL Server has also evolved its architecture to support the needs of advanced data warehouses aligned with DW 2.0, including features like sequential data access for analytics, easy migration from data marts to enterprise data warehouses, and distributed processing to reduce costs for large volumes of data.
RightScale Webinar: So you want to move to the cloud... but you’re not sure what that means, or where you would even start. Or you want to get your feet wet with a proof-of-concept project before you bring out the big guns. We asked Brian Adler, our Professional Services Architect who works directly with customers on cloud projects every single day, to select five cloud projects that you can get started with (and complete!) quickly. In this webinar, Brian and Rafael Saavedra, our VP of Engineering, will walk you through those five projects and will help you demonstrate success in the cloud now.
The document discusses data warehousing concepts including:
1) A data warehouse is a subject-oriented, integrated, and non-volatile collection of data used for decision making. It stores historical and current data from multiple sources.
2) The architecture of a data warehouse is typically three-tiered, with an operational data tier, data warehouse/data mart tier for storage, and client access tier. OLAP servers allow analysis of stored data.
3) ROLAP and MOLAP refer to relational and multidimensional approaches for OLAP. ROLAP dynamically generates data cubes from relational databases, while MOLAP pre-calculates and stores aggregated data in multidimensional structures.
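What a ROLAP engine does under the hood can be sketched as aggregating relational rows on the fly along chosen dimensions, rather than reading a precomputed MOLAP cube. The fact rows and dimension names below are invented for illustration.

```python
# Computing "cube cells" dynamically from relational fact rows (ROLAP-style).
from collections import defaultdict

sales = [  # (region, product, amount) -- illustrative fact rows
    ("EU", "book", 10), ("EU", "pen", 4), ("US", "book", 7), ("US", "book", 5),
]

def rollup(rows, dims):
    """Aggregate the amount measure over the requested dimension columns."""
    cells = defaultdict(int)
    for region, product, amount in rows:
        key = tuple(v for v, d in ((region, "region"), (product, "product")) if d in dims)
        cells[key] += amount
    return dict(cells)

print(rollup(sales, {"region"}))             # {('EU',): 14, ('US',): 12}
print(rollup(sales, {"region", "product"}))  # per (region, product) cells
```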
AWS Redshift Introduction - Big Data Analytics (Keeyong Han)
Redshift is a scalable SQL database in AWS that can store up to 1.6PB of data across multiple servers. It uses a columnar data storage model that makes adding or removing columns fast. Data is uploaded from S3 using SQL COPY commands and queried using standard SQL. The document provides recommendations for getting started with Redshift, such as performing daily full refreshes initially and then implementing incremental update mechanisms to enable more frequent updates.
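A hedged sketch of the load pattern described above: a daily full refresh via COPY from S3. The `connect()` factory is hypothetical (it could be backed by psycopg2), and the bucket, table, and IAM role names are placeholders, not real resources.

```python
# Daily full refresh into Redshift: truncate, then bulk COPY from S3.
# connect() is a hypothetical factory returning a DB-API connection.
COPY_SQL = """
COPY analytics.events
FROM 's3://my-bucket/events/2024-06-01/'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
FORMAT AS CSV;
"""

def full_refresh(connect):
    conn = connect()
    with conn, conn.cursor() as cur:
        cur.execute("TRUNCATE analytics.events")  # daily full refresh...
        cur.execute(COPY_SQL)                     # ...then bulk load from S3
```

An incremental variant, as the summary suggests, would COPY only new S3 prefixes into a staging table and merge them, enabling more frequent updates.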
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen... (Continuent)
Marketo uses Continuent Tungsten to solve key data management challenges at scale. Tungsten provides high availability, online maintenance, and parallel replication to allow Marketo to process over 600 million MySQL transactions per day across more than 7TB of data without downtime. Tungsten's innovative caching and sharding techniques help replicas keep up with Marketo's high transaction volumes and uneven tenant sizes. The solution has enabled fast failover, rolling maintenance, and scaling to thousands of customers.
OLAP (online analytical processing) allows users to easily extract and analyze data from different perspectives. It stores data in multidimensional databases to allow for complex queries. There are three main types of OLAP - relational, multidimensional, and hybrid. OLAP is used with data warehouses to enable analytics like data mining and decision making. It provides benefits over transactional systems by facilitating flexible analysis of integrated data over time.
Embarking on building a modern data warehouse in the cloud can be an overwhelming experience due to the sheer number of products that can be used, especially when the use cases for many products overlap others. In this talk I will cover the use cases of many of the Microsoft products that you can use when building a modern data warehouse, broken down into four areas: ingest, store, prep, and model & serve. It’s a complicated story that I will try to simplify, giving blunt opinions of when to use what products and the pros/cons of each.
Choosing technologies for a big data solution in the cloud (James Serra)
Has your company been building data warehouses for years using SQL Server? And are you now tasked with creating or moving your data warehouse to the cloud and modernizing it to support “Big Data”? What technologies and tools should use? That is what this presentation will help you answer. First we will cover what questions to ask concerning data (type, size, frequency), reporting, performance needs, on-prem vs cloud, staff technology skills, OSS requirements, cost, and MDM needs. Then we will show you common big data architecture solutions and help you to answer questions such as: Where do I store the data? Should I use a data lake? Do I still need a cube? What about Hadoop/NoSQL? Do I need the power of MPP? Should I build a "logical data warehouse"? What is this lambda architecture? Can I use Hadoop for my DW? Finally, we’ll show some architectures of real-world customer big data solutions. Come to this session to get started down the path to making the proper technology choices in moving to the cloud.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring & observability to ops, infra and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
Full-RAG: A modern architecture for hyper-personalization (Zilliz)
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Enhancing adoption of Open Source Libraries. A case study on Albumentations.AI (Vladimir Iglovikov, Ph.D.)
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Building RAG with self-deployed Milvus vector database and Snowpark Container... (Zilliz)
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
What do a Lego brick and the XZ backdoor have in common?Speck&Tech
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only the fact that they are both building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case share much more than that.
Join the presentation to dive into a story of interoperability, standards, and open formats, and then discuss the important role contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia association, where she was involved in several LibreOffice-related events, migrations, and training sessions. Previously she worked on LibreOffice migrations and training courses for several public administrations and private organizations. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager; when not pursuing her passion for computers and for Geeko, she cultivates her curiosity about astronomy (the origin of her nickname, deneb_alpha).
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize our carbon footprint, but we can also have a carbon handprint: a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
A Comprehensive 20-Point Checklist for Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
3. Shehap El-Nagar
Shehap is an MVP, MCTS, and MCITP in SQL Server, and a database consultant and architect for many banking, telecom, ministry, and governmental organizations across the Gulf. He has deep knowledge of T-SQL performance, hardware performance issues, data warehousing solutions, SQL Server replication, clustering solutions, and database design for many kinds of systems.
He founded the biggest SQL Server community in the Middle East, http://sqlserver-performance-tuning.net/; you can watch its success stories at http://www.youtube.com/user/ShehapElNagar.
He is a moderator and author at http://www.sql-server-performance.com and the first SQL Server author at MSDN Arabia (http://msdn.microsoft.com/ar-sa/library/jj149119.aspx).
He speaks at SQL Saturday events worldwide, at local events in Saudi Arabia, and at many online events, and has produced more than 90 video tutorials plus many private sessions for .NET developers and database administrators.
He is also an active participant in the Microsoft SQL Server forums at http://social.technet.microsoft.com.
More about him can be found on the Microsoft MVP site: http://mvp.microsoft.com/en-us/mvp/Shehap%20El-Nagar5000188.
Contact: idgdirector@yahoo.com; mobile: 00966560700733.
4. Agenda and Overview:
First: Definitions and benefits of DWH
•Definition of data warehousing solutions
•Benefits of DWH solutions
•Why is RTDWH (Real-Time Data Warehousing) so necessary?
•Data warehouse vs. data mart
•Relational DB vs. dimensional DB
•Dimensional database vs. multidimensional database
•Star schema vs. snowflake schema
•Techniques of DWH solutions
Second: RTDWH for online reporting
•Technique and concepts
•Demo
Third: DWH for online archiving
•Technique and concepts
•Demo
Fourth: DWH for online ETL
•Technique and concepts
•Demo
6. Definition of Data Warehousing
[Diagram: four relational databases feed an optimized loader; data passes through data cleansing and de-normalization into the data warehouse engine, which is backed by a metadata repository.]
7. Benefits of Data Warehousing:
•Data consolidation and organization
•Data standardization for different attributes, such as collation
•Flexible support for numerous RDBMS sources, like SQL Server, Oracle, Teradata, Informix, SAP BI, Sybase, Access, CSV files, Excel, etc.
•Scaling up reports, either SSRS or SSAS (OLAP) reports
•Speeding up report performance
8. Why Real-Time Data Warehousing?
•Active decision support
•Business activity monitoring (BAM)
•Alerting
•Efficiently execute business strategy
9. Relational DB vs. Dimensional DB:
•A relational DB is a normalized DB for OLTP transaction purposes. More normalization >>> fewer columns per table >>> fewer indexes needed >>> less I/O cost on the clustered indexes used for OLTP inserts, updates, and deletes.
•A dimensional DB is a de-normalized DB for OLAP purposes. More interrelated columns in one table >>> fewer joins needed >>> more covering compound indexes (a small index sketch follows).
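To make the covering-index point concrete, here is a minimal T-SQL sketch; the dbo.SalesFact table and its columns are illustrative assumptions, not objects from the deck.

-- On a wide, de-normalized table, one compound covering index can
-- satisfy a typical report query without extra joins or lookups;
-- the same index would be costly to maintain under OLTP write rates.
CREATE NONCLUSTERED INDEX IX_SalesFact_Report
ON dbo.SalesFact (SaleDate, Region)
INCLUDE (CustNo, ProdNo, CityName, Amount);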
10. Data Warehouse vs. Data Mart:
•A data warehouse is a global repository for a wide scale of business.
•A data mart is a smaller repository for a specific business scope.
Therefore, we could say a data mart solution is a subset of a bigger data warehousing solution.
12. Dimensional Database vs. Multidimensional Database:
•Dimensional DBs can be used as staging DBs for SSAS reports, or directly for SSRS reports.
•A multidimensional DB is an SSAS DB composed of cubes, which are formed basically of:
 •Fact tables, which contain the business-process core, where the aggregative columns called measures are found.
 •Dimension tables, which contain the lookup details relevant to that aggregative data.
[Diagram: a dimensional DB (the DWH DB, under the DB service) feeds a multidimensional DB (the OLAP DB, under the OLAP service), which a decision support client consumes at the presentation layer.]
13. Star Schema vs. Snowflake Schema:
•A snowflake schema closely follows the star schema design, but breaks the design down further into smaller tables to avoid more redundancy of columns.
•A snowflake schema is not recommended for either OLAP or OLTP transactions.
15. Snowflake Schema
[Diagram: a fact table (date, custno, prodno, cityname, region, ...) linked to name, telephone, gender, and marital-status attributes, with gender and marital status broken out into separate lookup tables.]
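To make the layout concrete, here is a minimal T-SQL sketch of the snowflake design from the slide; any table or column name beyond what the slide shows is an illustrative assumption.

-- Small lookup tables broken out of the customer dimension
-- (the "snowflaking" that avoids repeating their columns).
CREATE TABLE dbo.DimGender (
    GenderKey     tinyint     NOT NULL PRIMARY KEY,
    GenderName    varchar(10) NOT NULL
);

CREATE TABLE dbo.DimMaritalStatus (
    MaritalKey    tinyint     NOT NULL PRIMARY KEY,
    MaritalStatus varchar(20) NOT NULL
);

-- The customer dimension references the lookup tables
-- instead of storing their columns redundantly.
CREATE TABLE dbo.DimCustomer (
    CustNo     int          NOT NULL PRIMARY KEY,
    Name       varchar(100) NOT NULL,
    Telephone  varchar(20)  NULL,
    GenderKey  tinyint      NOT NULL REFERENCES dbo.DimGender (GenderKey),
    MaritalKey tinyint      NOT NULL REFERENCES dbo.DimMaritalStatus (MaritalKey)
);

-- The fact table keeps the grain shown on the slide:
-- date, customer, product, city/region.
CREATE TABLE dbo.FactSales (
    SaleDate date        NOT NULL,
    CustNo   int         NOT NULL REFERENCES dbo.DimCustomer (CustNo),
    ProdNo   int         NOT NULL,
    CityName varchar(50) NOT NULL,
    Region   varchar(50) NOT NULL
);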
16. Data Warehousing Techniques
•Old SQL Server 2005-era commands (SELECT/INSERT/UPDATE/DELETE)
•The SQL Server 2008 MERGE command, which can replace all of the above commands more efficiently in one statement (see the sketch below)
•DTS (Data Transformation Services) and SSIS packages
•Enterprise platform solutions for LDWH (Large DWH)
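As a rough illustration of how one MERGE statement stands in for the separate INSERT/UPDATE/DELETE passes, here is a minimal T-SQL sketch; the dbo.DwhOrders and dbo.StagingOrders tables and their columns are illustrative assumptions, not objects from the deck.

-- One MERGE replaces the older separate INSERT, UPDATE, and DELETE
-- passes against the warehouse table.
MERGE dbo.DwhOrders AS target
USING dbo.StagingOrders AS source
    ON target.OrderID = source.OrderID
WHEN MATCHED AND (target.Amount <> source.Amount
               OR target.Status <> source.Status) THEN
    UPDATE SET target.Amount = source.Amount,
               target.Status = source.Status
WHEN NOT MATCHED BY TARGET THEN
    INSERT (OrderID, Amount, Status)
    VALUES (source.OrderID, source.Amount, source.Status)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;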
17. Enterprise Platform Solutions
Fast data-tracking solutions:
•Sybase IQ
•Red Brick Warehouse
•IBM DB2 MVS / Universal Server
•IBM Data Warehousing
•Teradata
•Informix Online Dynamic Server, XPS (Extended Parallel Server), and Universal Server for object-relational applications
19. Technique of DWH Solutions Used for Online Reporting
•Create two tables (one temp table; the second is the DWH table itself).
•Make all DML transactions on the temp table.
•Then compare the temp table's results with the DWH table.
•If any record or column does not match, bulk-merge from the temp table to the DWH table.
•You can now use this DWH table for your online reports (a sketch of the flow follows).
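A minimal sketch of the two-table flow described above; the dbo.TempSales and dbo.DwhSales names are illustrative assumptions, not objects from the deck.

-- All DML goes to a staging/temp table; mismatches are then
-- merged in bulk into the DWH table that serves the reports.
CREATE TABLE dbo.TempSales (SaleID int PRIMARY KEY, Amount money NOT NULL);
CREATE TABLE dbo.DwhSales  (SaleID int PRIMARY KEY, Amount money NOT NULL);

-- 1) DML transactions hit the temp table only.
INSERT INTO dbo.TempSales (SaleID, Amount) VALUES (1, 100.00), (2, 250.00);

-- 2) Compare, and bulk-merge only where a record or column differs.
MERGE dbo.DwhSales AS d
USING dbo.TempSales AS t
    ON d.SaleID = t.SaleID
WHEN MATCHED AND d.Amount <> t.Amount THEN
    UPDATE SET d.Amount = t.Amount
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SaleID, Amount) VALUES (t.SaleID, t.Amount);

-- 3) dbo.DwhSales is now current and can serve the online reports.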
20. Concepts of DWH Solutions Used for Online Reporting
1- SET XACT_ABORT ON: ensures the strongest transactional behavior for the group of DML statements, committing all of them if all succeed and rolling all of them back if any one fails.
2- SET NOCOUNT ON: speeds up queries by not returning the count of affected records on each run.
3- SET DEADLOCK_PRIORITY LOW: avoids any impact on end-user transactions while this online data warehousing runs.
4- TRY/CATCH commands: capture any possible errors and report them by mail.
5- BULK_LOGGED recovery mode: saves storage capacity efficiently during the bulk merge.
6- READ COMMITTED SNAPSHOT isolation (row versioning) is recommended to avoid heavy locks and deadlocks.
These settings combine into the skeleton sketched below.
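A hedged sketch of how the six concepts fit together; the DwhAlerts mail profile, recipient address, and DwhDb database name are illustrative assumptions, and the merge body is elided.

SET XACT_ABORT ON;          -- commit all or roll back all of the DML group
SET NOCOUNT ON;             -- suppress per-statement rowcount messages
SET DEADLOCK_PRIORITY LOW;  -- yield to end-user transactions if deadlocked

BEGIN TRY
    BEGIN TRANSACTION;
    -- ... bulk MERGE from the temp table into the DWH table ...
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    DECLARE @err nvarchar(4000) = ERROR_MESSAGE();
    EXEC msdb.dbo.sp_send_dbmail        -- report the failure by mail
         @profile_name = N'DwhAlerts',
         @recipients   = N'dba@example.com',
         @subject      = N'Online DWH merge failed',
         @body         = @err;
END CATCH;

-- BULK_LOGGED recovery and READ COMMITTED SNAPSHOT are database-level
-- options, set once per database rather than per session:
-- ALTER DATABASE DwhDb SET RECOVERY BULK_LOGGED;
-- ALTER DATABASE DwhDb SET READ_COMMITTED_SNAPSHOT ON;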
23. Techniques of DWH Solutions for Archiving
•Bulk-insert the old data from the source table into an archive table.
•Bulk-delete from the source table after the first step succeeds.
•The bulk delete should be split into smaller batches with a small number of records (e.g., 1,000) and a delay of 5-30 seconds between one batch and the next, to avoid any tangible locks or deadlocks (see the sketch below).
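A sketch of the batched archive flow; the dbo.Sales and dbo.SalesArchive tables, the cutoff date, the 1,000-row batch size, and the 5-second pause are illustrative assumptions within the slide's 5-30 second guidance.

-- Step 1: bulk-copy the old rows into the archive table.
INSERT INTO dbo.SalesArchive (SaleID, SaleDate, Amount)
SELECT SaleID, SaleDate, Amount
FROM   dbo.Sales
WHERE  SaleDate < '2013-01-01';

-- Step 2: delete from the source in small batches, pausing between
-- batches so locks never escalate into tangible blocking.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (1000) FROM dbo.Sales
    WHERE SaleDate < '2013-01-01';

    SET @rows = @@ROWCOUNT;      -- stop once no old rows remain

    WAITFOR DELAY '00:00:05';    -- pause between batches
END;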
24. Concepts of DWH Solutions Used for Archiving
1- BULK_LOGGED recovery mode: saves storage capacity efficiently during the bulk operations, as the next workshops will show.
2- WAITFOR DELAY '00:00:30' is not a risky wait here; it is just a normal wait command, like a Service Broker wait.
3- The bulk-insert and bulk-delete phases can be run in different transactions at different time intervals without any risk.
4- You can also validate the results using the OUTPUT command (sketched below).
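For the OUTPUT-based validation in point 4, a sketch along these lines (table and column names are illustrative assumptions):

-- Capture every deleted row so the delete can be reconciled
-- against what actually landed in the archive table.
DECLARE @deleted TABLE (SaleID int, SaleDate date, Amount money);

DELETE TOP (1000) FROM dbo.Sales
OUTPUT deleted.SaleID, deleted.SaleDate, deleted.Amount
INTO   @deleted
WHERE  SaleDate < '2013-01-01';

-- Compare this count (and the rows themselves) with the archive.
SELECT COUNT(*) AS DeletedRows FROM @deleted;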
27. Technique of DWH Solutions Used for ETL
•Run your ETL process in parallel with end-user activity, but against a different table rather than the online tables.
•Once it finishes, scan for all mismatches between the two tables using the three data warehousing statements.
28. Concepts of DWH Solutions Used for ETL
•Scan for any newly inserted data in the source tables, to be inserted into the target tables.
•Scan for any updated data by finding records that share PK values between the two tables but differ in any other column.
•Scan for any deleted data using EXCEPT commands.
•The three phases can be undertaken asynchronously without any risk at all (see the sketch below).
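A sketch of the three scans; the dbo.EtlOrders (ETL staging) and dbo.Orders (online target) tables, keyed on OrderID, are illustrative assumptions.

-- 1) New rows: keys in the ETL table that the target does not have.
SELECT s.OrderID
FROM   dbo.EtlOrders AS s
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Orders AS t
                   WHERE t.OrderID = s.OrderID);

-- 2) Updated rows: same PK but some other column differs. EXCEPT
--    also returns brand-new keys, so join back to the target to
--    keep only keys that already exist there.
SELECT d.OrderID, d.Amount, d.Status
FROM (
    SELECT OrderID, Amount, Status FROM dbo.EtlOrders
    EXCEPT
    SELECT OrderID, Amount, Status FROM dbo.Orders
) AS d
WHERE EXISTS (SELECT 1 FROM dbo.Orders AS t
              WHERE t.OrderID = d.OrderID);

-- 3) Deleted rows: keys still in the target but gone from the source.
SELECT OrderID FROM dbo.Orders
EXCEPT
SELECT OrderID FROM dbo.EtlOrders;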