Dimensional Fact Model
Stuttgart, 9/3/2016
Stefano Cazzella @StefanoCazzella
http://caccio.blogdns.net
http://bimodeler.com
stefano.cazzella{at}gmail.com
BI ACADEMY Conference - Stuttgart, 9/3/2016 - Stefano Cazzella
My Professional Timeline
Timeline: 2001 - 2003 - 2005 - 2007 - 2009 - 2011 - 2013 - 2015
• Education: Master degree in Software Engineering, University of Rome « La Sapienza »
• Roles: Business Intelligence Specialist; Project Manager; Business Consultant; Delivery Manager
• Focus: Methodology; Industrialization of the delivery phase
• Companies: Datamat S.p.A. (a Finmeccanica company); Sopra Steria Group (Consulting – IT Services – Software Solutions)
BI Trends
[Chart: Business Value (vertical axis) against Time (horizontal axis)]
• Stages: Data Integration, Descriptive, Predictive, Prescriptive, Deep learning
• Labels: Business Intelligence, Data Warehouse, Simulation & forecasting, Optimization & automation, Semantic & AI
Digital transformation of every market
Data explosion: exponential growth of digital data
Disruptive scenario
Innovative technologies
•Internet of Things
•Big Data
•Distributed computing
•In-Memory systems
•Cloud
•Mobile
•… more …
Complex architectures
•Data federation
•Data store
•NoSQL
•Distributed file system
•Appliances
•Real-time data integration
•… more …
Business transformations
•Frenetic time-to-market
•API / service economy
•Data-driven company
•Business process automation
•… more …
New processes? Roles?
Waterfall process: Business, Design, Build
Iterative process: Business, Design, Build
Business Analyst - Engineer - Technician
Data Scientist - Business Analyst - Engineer - Technician
Project Layers for Data Mart
Business
•Dimensional Fact Model
Design
•Relational model
Build
•DBMS specific DDL
Why Dimensional Fact Model?
1. Formal language → well-specified syntax and an unequivocal interpretation (semantics) based on a sound algebraic definition
2. Simple and effective graphical notation (representation)
3. Does not imply any technical/implementation choice
4. Specifically designed to represent multi-dimensional models
Multi-dimensional model
The SALES event: on Nov. 25th, 2014, Store 2 sold 10 pieces of Product X for a total revenue of € 220
[Figure: 3-dimensional SALES hyper-space]
• Axes: Product (Product X, Product Y, Product Z), Store (Store 1, Store 2, Store 3), Day
• Highlighted cell: Units sold: 10 pieces; Revenue: € 220
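The hyper-space idea can be sketched in a few lines of Python: each SALES event is a cell addressed by its (Product, Store, Day) coordinates and holding the measures. This is only an illustration under assumptions; the names SalesCell, sales_cube and roll_up_by_store are hypothetical and do not come from the slides.

```python
from datetime import date
from typing import NamedTuple


class SalesCell(NamedTuple):
    """Measures stored in one cell of the SALES hyper-space."""
    units_sold: int
    revenue_eur: float


# Coordinates are (product, store, day); the single cell is the event on the slide.
sales_cube = {
    ("Product X", "Store 2", date(2014, 11, 25)): SalesCell(10, 220.0),
}


def roll_up_by_store(cube):
    """Aggregate away Product and Day, leaving additive totals per Store."""
    totals = {}
    for (_product, store, _day), cell in cube.items():
        units, revenue = totals.get(store, (0, 0.0))
        totals[store] = (units + cell.units_sold, revenue + cell.revenue_eur)
    return totals
```

Cells that are absent from the dictionary simply mean that no event occurred at those coordinates, which mirrors the sparsity of a real fact table.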
DFM Notation Compendium
Hierarchy
Dimension
Dimensional attribute
Non-dimensional attribute
Measure
Fact schema SALES
Dependency
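As a rough aid to reading the notation, the elements listed above could be held in a small data structure like the one below. It is a sketch under assumptions, not the formal DFM definition: the class and field names are hypothetical, and the Month and Year attributes are invented to show a hierarchy (only Product, Store, Day, Units sold and Revenue appear in the slides).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """A node of a hierarchy; dimensional=False marks a non-dimensional attribute."""
    name: str
    dimensional: bool = True
    # Dependencies: coarser attributes this one rolls up to.
    rolls_up_to: List["Attribute"] = field(default_factory=list)


@dataclass
class Measure:
    name: str


@dataclass
class FactSchema:
    name: str
    dimensions: List[Attribute]  # the root attribute of each hierarchy
    measures: List[Measure]


def attribute_names(attr):
    """List a hierarchy depth-first, starting from its dimension."""
    names = [attr.name]
    for coarser in attr.rolls_up_to:
        names.extend(attribute_names(coarser))
    return names


# Hypothetical Day -> Month -> Year hierarchy for the SALES fact schema.
year = Attribute("Year")
month = Attribute("Month", rolls_up_to=[year])
day = Attribute("Day", rolls_up_to=[month])
sales = FactSchema(
    name="SALES",
    dimensions=[Attribute("Product"), Attribute("Store"), day],
    measures=[Measure("Units sold"), Measure("Revenue")],
)
```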
Data Mart building process
Business user’s needs
→ Requirements definition
→ Multidimensional data model (Dimensional Fact Model)
→ Model transformation
→ Logical data model (Relational model: tables, columns, etc.)
→ Model transformation
→ Physical data model (DDL with indexes, partitions, etc.)
→ Deployment
→ Data Mart
Inputs to the model transformations: Implementation strategy, Technical knowledge
Business - From requisite to DFM
• Context: weblog analytics - the analysis of the visits to several web sites belonging to different domains (e.g. Google Analytics)
• Requisite: monitoring and analyzing the number of visits and their monthly and daily average duration for each page of the websites, or each domain, distributed by the geographic region of the IP of the visitors.
+ Domain definition
+ Aggregation rules
+ Optional dependencies
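To make the requisite concrete, here is a minimal sketch of the two measures it asks for, computed at two different grains. Everything in it is an assumption for illustration: the sample rows, the tuple layout and the function name aggregate are hypothetical. It also hints at why aggregation rules matter: the visit count is additive, while the average duration has to be derived from a total and a count rather than averaged again.

```python
from collections import defaultdict
from datetime import date

# Made-up sample rows: (page, domain, region, day, duration in seconds).
visits = [
    ("/home", "example.org", "Lazio", date(2016, 3, 1), 30),
    ("/home", "example.org", "Lazio", date(2016, 3, 1), 90),
    ("/docs", "example.org", "Bayern", date(2016, 3, 2), 60),
]


def aggregate(rows, key):
    """Group rows by key(row) and return visit count and average duration."""
    acc = defaultdict(lambda: [0, 0])  # key -> [visit count, total duration]
    for row in rows:
        bucket = acc[key(row)]
        bucket[0] += 1
        bucket[1] += row[4]
    return {
        k: {"visits": count, "avg_duration": total / count}
        for k, (count, total) in acc.items()
    }


# Daily figures for each page.
daily_by_page = aggregate(visits, key=lambda r: (r[0], r[3]))
# Monthly figures for each domain, distributed by region.
monthly_by_domain_region = aggregate(
    visits, key=lambda r: (r[1], r[2], (r[3].year, r[3].month))
)
```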
Design choice
Reference ROLAP model:
•Star-schema (denormalized dimension table)
•Snow-flake (hierarchies implemented by tables in 3NF)
Hierarchy implementation strategy (for every dimension)
•Use natural key (the dimension attribute → PK column)
•Use surrogate key (add a new column with no business meaning)
•Use slowly changing dimension (SCD) of type 2
•Use implicit dimension (no dimension table, only a column in the fact table)
Domain → Data type association
•Text → VARCHAR(250) ; Currency → NUMBER(9,2) ; etc.
Standard naming conventions and abbreviations
•Table name prefix (D for Dimensions, F for Facts) ; Number → NBR ; etc.
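The last two design choices lend themselves to a small lookup-table sketch. The mappings below repeat the examples from the slide (Text, Currency, the D and F prefixes, Number → NBR); the function names, the upper-casing and the underscore separator are assumptions made for illustration.

```python
# Domain -> data type association (examples taken from the slide).
DOMAIN_TYPES = {"Text": "VARCHAR(250)", "Currency": "NUMBER(9,2)"}

# Standard abbreviations applied word by word (example taken from the slide).
ABBREVIATIONS = {"NUMBER": "NBR"}

# Table name prefixes: D for Dimensions, F for Facts.
TABLE_PREFIXES = {"dimension": "D", "fact": "F"}


def physical_name(logical_name):
    """Upper-case a logical name, abbreviate known words, join with underscores."""
    words = logical_name.upper().split()
    return "_".join(ABBREVIATIONS.get(word, word) for word in words)


def table_name(kind, logical_name):
    """Prefix the physical name according to the kind of table."""
    return f"{TABLE_PREFIXES[kind]}_{physical_name(logical_name)}"


def column_definition(logical_name, domain):
    """Render one column as '<NAME> <TYPE>' using the domain association."""
    return f"{physical_name(logical_name)} {DOMAIN_TYPES[domain]}"
```

For example, table_name("fact", "Visits") yields F_VISITS, and column_definition("Page title", "Text") yields PAGE_TITLE VARCHAR(250).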
Transform DFM into a Relational Model
Technical design choices:
• Reference ROLAP model → star-schema
• Hierarchy Viewer → use surrogate key
• Hierarchy Page → SCD – Type 2
• Hierarchy Time → denormalized with natural key
[Figure: model transformation from the DFM to the relational model, annotated with: Fact grain, Surrogate key, SCD-2 (Start date, End date)]
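The SCD type 2 choice for the Page hierarchy can be illustrated with a short sketch: a change closes the current version of a row and appends a new version with a fresh surrogate key and its own start and end dates. The row layout, the OPEN_END sentinel, the half-open validity interval and the function name apply_scd2 are all assumptions, not details from the slides.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed sentinel meaning "still valid"


def apply_scd2(rows, natural_key, new_attributes, change_date):
    """Record a new version of a dimension member and return its surrogate key.

    rows is a list of dicts with keys: sk, nk, attrs, start_date, end_date.
    A version is valid from start_date (inclusive) to end_date (exclusive).
    """
    current = next(
        (r for r in rows if r["nk"] == natural_key and r["end_date"] == OPEN_END),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attributes:
            return current["sk"]  # nothing changed, keep the current version
        current["end_date"] = change_date  # close the old version
    new_sk = max((r["sk"] for r in rows), default=0) + 1
    rows.append({
        "sk": new_sk,
        "nk": natural_key,
        "attrs": dict(new_attributes),
        "start_date": change_date,
        "end_date": OPEN_END,
    })
    return new_sk
```

Fact rows would then reference the surrogate key that was current when the event happened, so history is preserved when a page's attributes change.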
Build choice
Choose the DBMS
•Microsoft SQL Server – Oracle DBMS – SAP HANA – Apache Hive / Hadoop
Generate constraints?
•Generate unique keys / primary keys / integrity constraints (foreign keys)
Add specific indexes
•Add clustered indexes / column-store indexes / bitmap indexes / etc.
Define table partitions
•Organize fact tables in partitions (by hash, value, range, etc.)
Distribute data over multiple volumes
•Define file groups / tablespaces for tables, partitions, indexes
SAP HANA Architecture
[Figure: In-Memory Computing Engine]
• Session management
• Request Processing / Execution Control: SQL Parser, SQL Script, MDX, Calc. Engine
• Transaction Manager, Metadata Manager, Authorization Manager
• Relational Engines: Row Store, Column Store
• Persistence Layer: Page Management, Logger
• Disk Storage: Data Volumes, Log Volumes
Design notes:
• Row tables versus Column tables
• Partitioning by HASH, RANGE, ROUNDROBIN
• Use extended tables for warm data
Physical model and DDL
Implementation choices & best practice:
• DBMS → SAP HANA
• All tables are Column-tables
• Fact F_VISITS partitioned by HASH on DAY
• Fact F_VISITS indexed by PAGE
[Figure: generated DDL, annotated with: Create a column table; Partition by HASH; BTREE index; Unload priority for memory optimization; Preload columns for performance optimization]
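A DDL generator for these choices might look roughly like the sketch below. It is not the output of BI Modeler: the function name, the column names DAY_ID and PAGE_ID, the data types, the partition count and the unload priority value are hypothetical, and the statement shapes reflect my reading of SAP HANA SQL, so they should be checked against the SAP HANA SQL reference for the target release.

```python
def hana_fact_ddl(table, columns, hash_column, partitions, index_column,
                  unload_priority):
    """Return DDL statements for a hash-partitioned, indexed column table.

    columns is a list of (name, sql_type) pairs. The statement shapes are an
    assumption to verify against the SAP HANA SQL reference.
    """
    column_list = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return [
        # Column table, partitioned by HASH on the chosen column.
        f"CREATE COLUMN TABLE {table} (\n  {column_list}\n) "
        f"PARTITION BY HASH ({hash_column}) PARTITIONS {partitions};",
        # BTREE index on the chosen column.
        f"CREATE BTREE INDEX IX_{table}_{index_column} "
        f"ON {table} ({index_column});",
        # Unload priority for memory optimization.
        f"ALTER TABLE {table} UNLOAD PRIORITY {unload_priority};",
        # Preload columns for performance optimization.
        f"ALTER TABLE {table} PRELOAD ALL;",
    ]


# Hypothetical column layout for the F_VISITS fact table.
statements = hana_fact_ddl(
    table="F_VISITS",
    columns=[("DAY_ID", "DATE"), ("PAGE_ID", "INTEGER"), ("VISITS_NBR", "INTEGER")],
    hash_column="DAY_ID",
    partitions=4,
    index_column="PAGE_ID",
    unload_priority=5,
)
```

Printing "\n".join(statements) gives a script that can be reviewed by hand before it is run against a database.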
BI Modeler
• In order to apply a model-driven approach, BI project teams need a software tool to:
  - Manage (draw) all the models - DFM, relational, etc.
  - Support (and drive) the model transformation process
• There were (and still are) not many tools able to do that, so in 2006 I started working on the development of …
http://bimodeler.com
DEMO
Create a DFM from scratch
• Define the fact schema and its measures
• Add some dimensions / hierarchies
• Define and associate domains to attributes and measures
Transform a DFM into a relational data model
• Define an implementation strategy for Hierarchies
• Associate Data type to domains
• Apply a naming convention
Add physical properties to the relational model
• Choose a DBMS
• Create partitions
• Create indexes
• Generate DDL script

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 

Dimensional Fact Model @ BI Academy - 2016

  • 1. Dimensional Fact Model. Stuttgart, 9/3/2016. Stefano Cazzella (@StefanoCazzella), http://caccio.blogdns.net, http://bimodeler.com, stefano.cazzella{at}gmail.com. BI ACADEMY Conference - Stuttgart, 9/3/2016 - Stefano Cazzella
  • 2. My Professional Timeline. Time axis: 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015. Education: Master degree in Software Engineering; University of Rome « La Sapienza ». Roles: Business Intelligence Specialist, Business Consultant, Project Manager, Delivery Manager, Methodology, Industrialization of the delivery phase. Companies: Datamat S.p.A. (a Finmeccanica company); Sopra Steria Group (Consulting – IT Services – Software Solutions).
  • 3. BI Trends. Chart of Business Value over Time. Stages: Data Integration, Descriptive, Predictive, Prescriptive, Deep learning. Labels: Business Intelligence, Data Warehouse, Simulation & forecasting, Optimization & automation, Semantic & AI. Drivers: digital transformation of every market; data explosion (exponential growth of digital data).
  • 4. Disruptive scenario. Innovative technologies: Internet of Things; Big Data; Distributed computing; In-memory systems; Cloud; Mobile; … more … Complex architectures: Data federation; Data store; NoSQL; Distributed file system; Appliances; Real-time data integration; … more … Business transformations: Frenetic time-to-market; API / service economy; Data-driven company; Business process automation; … more …
  • 5. New processes? Roles? Waterfall process: Business, Design, Build. Iterative process: Business, Design, Build. Roles shown: Business Analyst, Engineer, Technician; Data Scientist, Business Analyst, Engineer, Technician.
  • 6. Project Layers for Data Mart. Business: Dimensional Fact Model. Design: Relational model. Build: DBMS-specific DDL.
  • 7. Why Dimensional Fact Model? (1) Formal language → well-specified syntax and an unequivocal interpretation (semantics) based on a sound algebraic definition. (2) Simple and effective graphical notation (representation). (3) Does not imply any technical/implementation choice. (4) Specifically designed to represent multi-dimensional models.
  • 8. Multi-dimensional model. The SALES event: on Nov. 25th, 2014, Store 2 sold 10 pieces of Product X for a total revenue of € 220. Figure: 3-dimensional SALES hyper-space with axes Product (Product X, Product Y, Product Z), Store (Store 1, Store 2, Store 3) and Day; the highlighted cell carries Units sold: 10 pieces and Revenue: € 220.
  • 9. DFM Notation Compendium. Figure: the SALES fact schema annotated with the notation elements: Fact schema, Measure, Dimension, Hierarchy, Dimensional attribute, Non-dimensional attribute, Dependency.
  • 10. Data Mart building process. Flow: Business user’s needs → Requirements definition → Multidimensional data model (Dimensional Fact Model) → Model transformation → Logical data model (Relational model: tables, columns, etc.) → Model transformation → Physical data model (DDL with indexes, partitions, etc.) → Deployment → Data Mart. Inputs to the model transformations: Implementation strategy, Technical knowledge.
  • 11. Data Mart building process. Flow: Business user’s needs → Requirements definition → Multidimensional data model (Dimensional Fact Model) → Model transformation → Logical data model (Relational model: tables, columns, etc.) → Model transformation → Physical data model (DDL with indexes, partitions, etc.) → Deployment → Data Mart. Inputs to the model transformations: Implementation strategy, Technical knowledge.
  • 12. Business - From requirement to DFM. Context: weblog analytics, i.e. the analysis of the visits to several websites belonging to different domains (e.g. Google Analytics). Requirement: monitor and analyze the number of visits and their monthly and daily average duration for each page of the websites, or for each domain, broken down by the geographic region of the visitors' IP addresses. Additional DFM elements: Domain definition, Aggregation rules, Optional dependencies.
  • 13. Design choice. Reference ROLAP model: star schema (denormalized dimension tables) or snowflake (hierarchies implemented by tables in 3NF). Hierarchy implementation strategy (for every dimension): use a natural key (the dimension attribute → PK column); use a surrogate key (add a new column with no business meaning); use a slowly changing dimension (SCD) of type 2; use an implicit dimension (no dimension table, only a column in the fact table). Domain → Data type association: Text → VARCHAR(250); Currency → NUMBER(9,2); etc. Standard naming conventions and abbreviations: table name prefix (D for Dimensions, F for Facts); Number → NBR; etc.
  • 14. Transform DFM into a Relational Model. Technical design choices: Reference ROLAP model → star schema; Hierarchy Viewer → use surrogate key; Hierarchy Page → SCD Type 2; Hierarchy Time → denormalized with natural key. Figure callouts: Model transformation, Fact grain, Surrogate key, SCD-2 Start date, End date.
  • 15. Build choice. Choose the DBMS: Microsoft SQL Server, Oracle DBMS, SAP HANA, Apache Hive / Hadoop. Generate constraints? Generate unique keys / primary keys / integrity constraints (foreign keys). Add specific indexes: clustered indexes / column-store indexes / bitmap indexes / etc. Define table partitions: organize fact tables in partitions (by hash, value, range, etc.). Distribute data over multiple volumes: define file groups / tablespaces for tables, partitions, indexes.
  • 16. SAP HANA Architecture. In-Memory Computing Engine components: Session management; Request Processing / Execution Control; SQL Parser; SQL Script; Calc. Engine; MDX; Transaction Manager; Metadata Manager; Authorization Manager; Relational Engines (Row Store, Column Store); Persistence Layer (Page Management, Logger); Disk Storage (Data Volumes, Log Volumes). Design points: row tables versus column tables; partitioning by HASH, RANGE, ROUNDROBIN; use extended tables for warm data.
  • 17. Physical model and DDL. Implementation choices & best practice: DBMS → SAP HANA; all tables are column tables; fact F_VISITS partitioned by HASH on DAY; fact F_VISITS indexed by PAGE. DDL callouts: Create a column table; Partition by HASH; BTREE index; Unload priority for memory optimization; Preload columns for performance optimization.
  • 18. BI Modeler. To apply a model-driven approach, BI project teams need a software tool to: manage (draw) all the models (DFM, relational, etc.); support (and drive) the model transformation process. There were (and still are) not many tools able to do that, so in 2006 I started working on the development of … http://bimodeler.com
  • 19. DEMO. Create a DFM from scratch: define the fact schema and its measures; add some dimensions / hierarchies; define and associate domains to attributes and measures. Transform a DFM into a relational data model: define an implementation strategy for hierarchies; associate data types to domains; apply a naming convention. Add physical properties to the relational model: choose a DBMS; create partitions; create indexes. Generate DDL script.
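
The multi-dimensional model of slide 8 can be illustrated with a small Python sketch. It assumes a cube stored as a dictionary keyed by (product, store, day) coordinates; only the first event comes from the slide, the other two rows and the `roll_up` helper are hypothetical and serve only to show aggregation along the dimensions.

```python
from collections import defaultdict

DIMENSIONS = ("product", "store", "day")

# Each fact is a point in the 3-dimensional SALES space and carries the measures.
sales = {
    ("Product X", "Store 2", "2014-11-25"): {"units_sold": 10, "revenue": 220.0},  # slide 8
    ("Product X", "Store 1", "2014-11-25"): {"units_sold": 4, "revenue": 88.0},    # hypothetical
    ("Product Y", "Store 2", "2014-11-26"): {"units_sold": 3, "revenue": 45.0},    # hypothetical
}

def roll_up(cube, keep):
    """Sum the additive measures, keeping only the dimensions listed in `keep`."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(lambda: defaultdict(float))
    for coordinates, measures in cube.items():
        key = tuple(coordinates[i] for i in positions)
        for name, value in measures.items():
            totals[key][name] += value
    return {key: dict(measures) for key, measures in totals.items()}

by_product = roll_up(sales, keep=("product",))
by_store_day = roll_up(sales, keep=("store", "day"))
```

Summation is only appropriate for additive measures such as units sold and revenue; a measure like an average duration would need a different aggregation rule.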
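
The SCD Type 2 strategy chosen for the Page hierarchy in slide 14 (surrogate key, start date, end date) could look roughly like the following Python sketch. The row layout, the column names and the `scd2_apply` helper are assumptions made for illustration, not the structures produced by the tool.

```python
from datetime import date

def scd2_apply(rows, page, attrs, today):
    """Record a new version of `page` when its descriptive attributes change.

    The current version is the row with end_date None. On a change it is
    closed with `today` and a new row with a fresh surrogate key is appended.
    """
    current = next(
        (r for r in rows if r["page"] == page and r["end_date"] is None), None
    )
    if current is not None and current["attrs"] == attrs:
        return rows  # no change, keep the current version open
    if current is not None:
        current["end_date"] = today
    next_id = max((r["page_id"] for r in rows), default=0) + 1
    rows.append({
        "page_id": next_id,        # surrogate key, no business meaning
        "page": page,              # natural key
        "attrs": dict(attrs),
        "start_date": today,
        "end_date": None,
    })
    return rows

# Hypothetical history: the page moves from one website to another.
d_page = []
scd2_apply(d_page, "/home", {"website": "a.example"}, date(2016, 1, 1))
scd2_apply(d_page, "/home", {"website": "b.example"}, date(2016, 3, 9))
scd2_apply(d_page, "/home", {"website": "b.example"}, date(2016, 4, 1))  # unchanged
```

Fact rows reference `page_id` rather than the natural key, so each visit stays attached to the version of the page that was valid when it happened.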
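
The demo steps above (define a fact schema, attach domains, choose a key strategy per hierarchy, apply naming conventions, generate DDL) can be sketched in a few lines of Python. This is a minimal illustration, not how BI Modeler works: the classes, the attribute names of the VISITS example and the `Integer` and `Date` type mappings are assumptions, and the output is generic SQL-like DDL that has not been validated against any specific DBMS. SCD Type 2 columns, constraints between tables, partitions and indexes are left out.

```python
from dataclasses import dataclass

# Domain -> data type association. Text and Currency follow slide 13.
DATA_TYPES = {
    "Text": "VARCHAR(250)",
    "Currency": "NUMBER(9,2)",
    "Integer": "INTEGER",  # assumption
    "Date": "DATE",        # assumption
}

@dataclass
class Attribute:
    name: str
    domain: str

@dataclass
class Hierarchy:
    name: str
    attributes: list        # the first attribute is the dimension (finest level)
    key: str = "surrogate"  # "surrogate" or "natural"

@dataclass
class Fact:
    name: str
    measures: list
    hierarchies: list

def column(attr):
    return f"{attr.name.upper()} {DATA_TYPES[attr.domain]}"

def key_column(h):
    """Column that identifies a dimension row and is repeated in the fact table."""
    if h.key == "surrogate":
        return f"{h.name.upper()}_ID INTEGER"
    return column(h.attributes[0])

def create_table(name, cols):
    return f"CREATE TABLE {name} (\n  " + ",\n  ".join(cols) + "\n);"

def dimension_ddl(h):
    """Star schema: one denormalized table per hierarchy, prefixed with D_."""
    cols = [column(a) for a in h.attributes]
    if h.key == "surrogate":
        cols.insert(0, key_column(h))
    cols.append(f"PRIMARY KEY ({key_column(h).split()[0]})")
    return create_table(f"D_{h.name.upper()}", cols)

def fact_ddl(f):
    """Fact table prefixed with F_: one key column per hierarchy, then the measures."""
    cols = [key_column(h) for h in f.hierarchies] + [column(m) for m in f.measures]
    return create_table(f"F_{f.name.upper()}", cols)

# Hypothetical VISITS schema loosely based on the weblog example of slide 12.
page = Hierarchy("page", [Attribute("page", "Text"), Attribute("website", "Text"),
                          Attribute("web_domain", "Text")])
time = Hierarchy("time", [Attribute("day", "Date"), Attribute("month", "Text")],
                 key="natural")
viewer = Hierarchy("viewer", [Attribute("ip", "Text"), Attribute("region", "Text")])
visits = Fact("visits",
              [Attribute("visits_nbr", "Integer"), Attribute("duration_sec", "Integer")],
              [page, time, viewer])

script = "\n\n".join([dimension_ddl(h) for h in visits.hierarchies] + [fact_ddl(visits)])
```

Printing `script` gives three `D_` tables and one `F_VISITS` table. Keeping the domain-to-type mapping and the naming rules in one place means a design choice changes in a single spot and the DDL is regenerated, which is the point of the model-driven approach described in the slides.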