BI ARCHITECTURE
What is asserted without proof can be denied without proof.
(Euclid)
BI Architecture – March 2013 - Author: Thierry de Spirlet
Why Architecture
• Architecture makes it possible to organize systems and information
• Incorporates best practices
• Defines the hardware, software and environmental components needed to build end-to-end solutions that meet specific business needs
• Identifies building blocks
• Spans all industries and all solution areas
• Provides a common language and facilitates collaboration
• The BI Reference Architecture is a framework for developing BI solutions.
• BI solutions exist only because there are business questions to answer.
BI Architecture components
• Models
• Processes
• Scheduling
• Monitoring
• Project organization
Models
• A model is an abstraction and reflection of the real world.
• Modeling gives us the ability to visualize what we cannot yet realize.
• Several forms of models exist:
  • Data models to organize data
  • Business models to organize business activities
  • Process models to organize interactions
• The primary aim of a data model is to make sure that all data objects required by the business are accurately and fully represented.
Models used in BI solutions
• OLTP models:
  • Models optimized to support the OLTP applications used in production. They must support operations, and they describe the tables, columns and keys of a database that stores operational data.
• E/R model:
  • An entity-relationship diagram that represents real-world entities with all of their relationships. It should be in third normal form (3NF).
• Dimensional model:
  • Represents information as facts and dimensions at the lowest level of granularity.
• Datamart model:
  • Built from the dimensional models to optimize data access and presentation.
• Data Vault model:
  • A particular approach to structuring an enterprise data warehouse (EDW).
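The difference between a dimensional model at the lowest granularity and a datamart model built on top of it can be sketched as follows. This is a minimal illustration, not the author's implementation: the `fact_sales` rows, the column names and the `build_datamart` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows at the lowest granularity: one row per date, product and store.
fact_sales = [
    {"date": "2013-03-01", "product": "P1", "store": "S1", "amount": 10.0},
    {"date": "2013-03-01", "product": "P1", "store": "S2", "amount": 5.0},
    {"date": "2013-03-02", "product": "P2", "store": "S1", "amount": 7.5},
]


def build_datamart(facts, dimensions):
    """Aggregate the detailed facts along the dimensions a given usage needs."""
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)


# A datamart that only exposes sales per product (assumed usage).
sales_by_product = build_datamart(fact_sales, ["product"])
```

The detailed facts stay available in the dimensional model, so other datamarts (per store, per date) can be derived from the same rows.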
BI Layers
• Each BI layer organizes data in the way best suited to a particular usage, and prepares the information for the next layer.
BI Processes
• BI processes are all the processes used to extract, transform and load data.
• They also include complete processes for:
  • Data optimization
  • Data correction
• Business processes that run calculation models on the data, such as:
  • Costing models
  • Pricing models
  • Operational models
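The chain of processes listed above can be sketched as a small pipeline: extract, correct, enrich with a business model, then load. This is a minimal sketch under stated assumptions. The source rows, the default cost, and the cost-plus-margin pricing model are all hypothetical and only stand in for whatever rules a real project defines.

```python
# Hypothetical rows as they might arrive from a source system.
SOURCE_ROWS = [
    {"product": " widget ", "unit_cost": "4.00"},
    {"product": "GADGET", "unit_cost": ""},  # missing cost
]

DEFAULT_COST = 0.0  # assumption: default value used when the cost is missing
MARGIN = 0.25       # assumption: illustrative pricing model, cost plus 25 %


def extract(source):
    """Pure extraction: copy the source rows without changing them."""
    return [dict(row) for row in source]


def correct(rows):
    """Data correction: trim and upper-case names, fill in missing costs."""
    corrected = []
    for row in rows:
        cost_text = row["unit_cost"].strip()
        corrected.append({
            "product": row["product"].strip().upper(),
            "unit_cost": float(cost_text) if cost_text else DEFAULT_COST,
        })
    return corrected


def apply_pricing_model(rows, margin=MARGIN):
    """Business process: derive a unit price from the unit cost."""
    return [
        {**row, "unit_price": round(row["unit_cost"] * (1 + margin), 2)}
        for row in rows
    ]


def load(rows, target):
    """Load the prepared rows into the target and report how many were loaded."""
    target.extend(rows)
    return len(rows)


warehouse = []
loaded = load(apply_pricing_model(correct(extract(SOURCE_ROWS))), warehouse)
```

Keeping each step as its own function mirrors the idea that every process prepares the data for the next one.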
[Figure: BI Processes. Data flows from the source systems through the layers listed below. In the original diagram each layer is flagged as required or optional ("= Optional" legend) for large, medium and small enterprises.]
• Source Systems: databases, flat files, web services, …; various models
• Extract Area
• Conceptualisation Area: integrates concepts; ERD model
• Data Warehouse: organizes facts and dimensions; dimensional model or Data Vault (DV) model
• Datamarts: optimize data access; dimensional model with aggregation
• Abstraction Layer: presentation model
• Exploitation Area
• Supporting areas and processes: Cleansing Area, LT Corporate Storage Area, Business Enrichment Processes
[Figure: internal structure of each layer]
• Extract Area: pure extraction; basic transformation; output area
• Conceptualisation Area: working areas; cleansing & DQ; output area
• Data Warehouse: working areas; cleansing & DQ; output area
• Datamarts: working areas; cleansing & DQ; output area
Explaining Model content
• OLTP models:
  • Data is organized to optimize transactions; various models exist.
• ERD models:
  • Data is organized around conceptual entities.
  • Achieve processing and data storage efficiency by reducing data redundancy (storing each data element once)
  • Provide flexibility and ease of maintenance
  • Protect the integrity of data by storing it once
  • If this level exists, it must integrate natural key substitution, 3NF, and default/dummy values.
• Dimensional models:
  • Data is organized around the concepts of facts and dimensions.
  • If not already done upstream, must integrate natural key substitution, 3NF, and default/dummy values, plus surrogate keys on dimensions (surrogate keys are never used in an ERD model).
• Presentation models:
  • Data is organized to optimize the exploitation of the organized data.
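One possible reading of "surrogate keys on dimensions" together with "default/dummy values" is sketched below: each natural key from the source gets an integer surrogate key in the dimension, and facts whose natural key is unknown point to a reserved dummy member. The `Dimension` class, the reserved key `0` and the sample data are assumptions made for illustration only.

```python
UNKNOWN_KEY = 0  # assumption: surrogate key 0 is reserved for the dummy member


class Dimension:
    """Minimal dimension that maps natural keys to integer surrogate keys."""

    def __init__(self):
        self._keys = {}
        # The dummy member exists from the start, so facts never hold a dangling key.
        self.rows = {UNKNOWN_KEY: {"natural_key": None, "label": "UNKNOWN"}}

    def add(self, natural_key, label):
        """Register a member and return its surrogate key (idempotent)."""
        if natural_key not in self._keys:
            surrogate_key = len(self.rows)  # next free integer; 0 is already taken
            self._keys[natural_key] = surrogate_key
            self.rows[surrogate_key] = {"natural_key": natural_key, "label": label}
        return self._keys[natural_key]

    def lookup(self, natural_key):
        """Return the surrogate key, or the dummy member if the key is unknown."""
        return self._keys.get(natural_key, UNKNOWN_KEY)


customers = Dimension()
customers.add("CUST-A", "Alice")
customers.add("CUST-B", "Bob")

# Hypothetical source facts identified by natural key; CUST-Z is not in the dimension.
source_sales = [("CUST-A", 100.0), ("CUST-B", 40.0), ("CUST-Z", 5.0)]
fact_sales = [
    {"customer_sk": customers.lookup(natural_key), "amount": amount}
    for natural_key, amount in source_sales
]
```

The fact rows carry only surrogate keys, so a change of natural key in a source system does not ripple into the fact table.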
Explaining Layers
• Extract Area: where all data is initially loaded; it reduces the load on source systems.
• Conceptualisation Area: where data is organized around conceptual entities and their relationships.
  • Achieves processing and data storage efficiency by reducing data redundancy (storing each data element once)
  • Provides flexibility and ease of maintenance
  • Protects the integrity of data by storing it once
  • Implements basic data rules (Caps, Trim, …)
  • Implements business rules on the data
• Data Warehouse: where data is organized around the concepts of facts and dimensions for the whole enterprise; calculations and transformations are done at the lowest level of granularity (for a multidimensional model).
  • Implements basic data rules (Caps, Trim, …) if there is no conceptual level
  • Implements business rules on the data if there is no conceptual level
  • Implements classical data warehouse concepts: facts and dimensions
• Datamarts: where data is stored to optimize its final processing; a datamart can restrict the set of data exposed.
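The basic data rules mentioned above (Caps, Trim) and the choice of the layer that applies them can be sketched as follows. This is an illustrative sketch: the rule list, the function names and the layer identifiers are assumptions, not part of the original architecture.

```python
from typing import Callable, Dict, List

Rule = Callable[[str], str]

# "Trim" then "Caps", expressed as plain string functions.
BASIC_RULES: List[Rule] = [str.strip, str.upper]


def apply_rules(value: str, rules: List[Rule]) -> str:
    """Apply each rule in order to a single value."""
    for rule in rules:
        value = rule(value)
    return value


def clean_record(record: Dict[str, str], rules: List[Rule] = BASIC_RULES) -> Dict[str, str]:
    """Apply the basic data rules to every field of a record."""
    return {field: apply_rules(value, rules) for field, value in record.items()}


def layer_applying_rules(has_conceptual_level: bool) -> str:
    """The conceptual level applies the rules when it exists; otherwise the warehouse does."""
    return "conceptualisation_area" if has_conceptual_level else "data_warehouse"


cleaned = clean_record({"city": "  brussels ", "country": "be"})
```

Applying the rules in exactly one layer avoids cleaning the same data twice with slightly different logic.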
Risks associated with ERD Models
• End users cannot understand or remember an ERD model.
• End users cannot navigate an ERD model.
  • There is no graphical user interface (GUI) that takes a general ER model and makes it usable by end users.
• Software cannot usefully query a general ERD model:
  • Cost-based optimizers that attempt to do this are notorious for making the wrong choices, with disastrous consequences for performance.
• Use of the ERD modeling technique defeats the basic allure of data warehousing, namely intuitive and high-performance retrieval of data.
• ERD models are time-consuming to build:
  • Building this level amounts to a conceptual reverse-engineering of the source applications, and it is tightly coupled to business concepts.
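The navigation problem can be made concrete with a small comparison. The sketch below uses Python's built-in `sqlite3` module and a hypothetical schema (all table and column names are invented): the same question, sales per country, needs three joins through the normalized model and one join in a star-shaped model.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Normalized (3NF-style) model: each concept is stored once.
CREATE TABLE country (country_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE city (city_id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city_id INTEGER);
CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO country VALUES (1, 'Belgium');
INSERT INTO city VALUES (1, 'Brussels', 1);
INSERT INTO city VALUES (2, 'Ghent', 1);
INSERT INTO customer VALUES (1, 'Alice', 1);
INSERT INTO customer VALUES (2, 'Bob', 2);
INSERT INTO sale VALUES (1, 1, 100.0);
INSERT INTO sale VALUES (2, 2, 40.0);

-- Dimensional model: the customer dimension is flattened.
CREATE TABLE dim_customer (customer_sk INTEGER PRIMARY KEY, name TEXT, city TEXT, country TEXT);
CREATE TABLE fact_sale (customer_sk INTEGER, amount REAL);
INSERT INTO dim_customer VALUES (1, 'Alice', 'Brussels', 'Belgium');
INSERT INTO dim_customer VALUES (2, 'Bob', 'Ghent', 'Belgium');
INSERT INTO fact_sale VALUES (1, 100.0);
INSERT INTO fact_sale VALUES (2, 40.0);
""")

# Three joins are needed to walk from a sale to its country.
NORMALIZED_QUERY = """
SELECT co.name, SUM(s.amount)
FROM sale s
JOIN customer cu ON cu.customer_id = s.customer_id
JOIN city ci ON ci.city_id = cu.city_id
JOIN country co ON co.country_id = ci.country_id
GROUP BY co.name
"""

# One join is enough in the star-shaped model.
STAR_QUERY = """
SELECT d.country, SUM(f.amount)
FROM fact_sale f
JOIN dim_customer d ON d.customer_sk = f.customer_sk
GROUP BY d.country
"""

normalized_result = con.execute(NORMALIZED_QUERY).fetchall()
star_result = con.execute(STAR_QUERY).fetchall()
```

Both queries return the same total; the difference lies in how many relationships the user or the query tool has to know about.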
Risks associated with Data Vault Models
• Not appropriate for end users
  • End users cannot navigate a Data Vault model.
• Software cannot usefully query a general Data Vault model because of the large number of tables
• Data Vault models may be time-consuming to build
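The growth in the number of tables can be illustrated with a rough count. The sketch assumes the usual Data Vault decomposition into hubs, links and satellites, which the slides do not detail, and simplifies further by giving each hub exactly one satellite. The entity names and the `data_vault_tables` helper are hypothetical.

```python
def data_vault_tables(entities, relationships):
    """List the tables of a simplified Data Vault for the given business concepts.

    Assumptions: one hub and one satellite per entity, one link per relationship.
    """
    hubs = [f"hub_{entity}" for entity in entities]
    links = [f"link_{left}_{right}" for left, right in relationships]
    satellites = [f"sat_{entity}" for entity in entities]
    return hubs + links + satellites


tables = data_vault_tables(
    ["customer", "order", "product"],
    [("customer", "order"), ("order", "product")],
)
```

Three business concepts and two relationships already produce eight tables here, and real models usually carry several satellites per hub.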
Techniques used for Models
• OLTP models:
  • Denormalization
• ERD models:
  • Normalization; natural key substitution; 3NF; historization without update propagation (see later)
• Dimensional models:
  • Facts with business logic (e.g. distribution, allocation, aggregation, …); dimensions (with associated techniques such as SCDx and surrogate keys); mini-dimensions; …
• Presentation models:
  • Data is organized to optimize the exploitation of the organized data.
It is essential to associate the right model with the right layer.
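Among the dimension techniques grouped under "SCDx", a type 2 slowly changing dimension is a common choice: instead of overwriting an attribute, the current row is closed and a new version is added. The sketch below is a minimal illustration under that assumption; the `CustomerVersion` class, its fields and the sample dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerVersion:
    surrogate_key: int
    natural_key: str
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: List[CustomerVersion], natural_key: str,
               new_city: str, change_date: date) -> CustomerVersion:
    """Record a change as a new version and close the previous one (SCD type 2)."""
    current = next(
        (v for v in history if v.natural_key == natural_key and v.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return current  # nothing changed, keep the current version
        current.valid_to = change_date  # close the old version
    new_version = CustomerVersion(len(history) + 1, natural_key, new_city, change_date)
    history.append(new_version)
    return new_version


history: List[CustomerVersion] = []
apply_scd2(history, "C1", "Brussels", date(2013, 1, 1))
apply_scd2(history, "C1", "Ghent", date(2013, 3, 1))
```

Each version gets its own surrogate key, so facts loaded before the change keep pointing to the city that was valid at the time.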
In conclusion
• In summary, choosing the right BI landscape is essential from the very beginning.
• Implementing the right model in the right place is mandatory.
• Revamping an existing BI landscape is extremely costly and time-consuming, so it is fundamental to design it well from the start.
• The architecture defines how things are done and should be customisable for different situations.
