Data Warehouse Design
Ines.beltaief.54
Inesbeltaief8@gmail.com
Outline
1. Context
2. Introduction
3. Problem Statement
4. Papers
5. Comparison
6. Discussion
7. Conclusion
Context
• In this work, I present the data warehouse concept, different modeling methods, and the advantages of each proposed method.
• For this, I studied three papers covering X-META (a methodology for data warehouse design with metadata management) and conceptual data models for DW design.
Introduction
• In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis.
• Data warehousing is a powerful tool for supporting decision-making processes in modern corporations. However, developing a data warehouse (DW) is a complex and costly activity. It requires strategies that are specific to the characteristics and needs of the organization where it will be introduced.
Problem Statement
How do you design a data warehouse?
Which methods have been studied?
What are the steps to design a DW?
Papers
• A Data Warehouse Conceptual Data Model: Enrico Franconi and Anand Kamble (Faculty of Computer Science, Free University of Bozen-Bolzano, Italy)
• X-META: A Methodology for Data Warehouse Design with Metadata Management: Liane Carneiro and Angelo Brayner (University of Fortaleza – UNIFOR, Fortaleza, Ceará, Brazil)
• A Comprehensive Method for Data Warehouse Design: Sergio Luján-Mora and Juan Trujillo (Department of Software and Computing Systems, University of Alicante, Spain)
Paper 1: X-META
Abstract:
The contributions of this paper are twofold. First, it proposes a methodology that addresses the development of DW projects. Second, it presents a strategy for handling metadata management integrated into the DW development process.
Related Works:
Existing methodologies normally target generic environments. Thus, they do not address DW development in organizations that need a methodology adapted to their characteristics and expectations.
Metadata Creation & Management
Metadata is "data that provides information about other data". Two types of metadata exist: structural metadata and descriptive metadata.
• Structural metadata is data about the containers of data.
• Descriptive metadata is data about individual instances of application data, that is, the data content.
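The distinction between the two metadata types can be illustrated with a minimal Python sketch. The `calls` table, its columns, and the batch fields are hypothetical and do not come from the papers; treating a loaded batch as the "instance of application data" is one possible reading, not the only one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralMetadata:
    """Describes a container of data (here, one table column)."""
    table: str
    column: str
    data_type: str
    nullable: bool


@dataclass(frozen=True)
class DescriptiveMetadata:
    """Describes one instance of application data (here, one loaded batch)."""
    table: str
    batch_id: str
    source_system: str
    row_count: int


# Hypothetical entries for an illustrative "calls" table.
structural = [
    StructuralMetadata("calls", "call_date", "DATE", nullable=False),
    StructuralMetadata("calls", "duration_s", "INTEGER", nullable=False),
]
descriptive = DescriptiveMetadata("calls", "batch-001", "billing", row_count=3)

column_names = [m.column for m in structural]
print(column_names, descriptive.source_system)
```

Structural entries stay stable as long as the schema does, while descriptive entries accumulate with every load.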
Metadata should exist throughout the development process and during the entire useful life of the DW. Metadata therefore plays a key role in a DW project.
• Managing it is a big challenge for many DW projects, because the tools and products for creating and managing metadata in a data warehousing environment are highly heterogeneous.
X-META: A METHODOLOGY FOR DW DESIGN
• The key idea of the proposed methodology is to start with a DW pilot project, in order to give the internal team experience in constructing a DW and to prove the viability and importance of the DW to the organization.
• The overall methodology lifecycle is divided into 5 major phases, as depicted in figure 1.
The X-META methodology defines three distinct iteration types, each with its own specific purpose:
(1) First prototype: its main goal is to introduce DW technology into the organization. This iteration, which is executed only once in the DW development process, uses only the Introduction phase in the development of the first pilot project in the organization.
(2) Pilot project: it allows the incremental and evolutionary development of pilot projects, each with its own purpose, for instance testing products, acquiring experience, or developing metadata repositories. Each pilot project represents an iteration, which begins in the Planning phase and uses one or more inputs from the DM/DW Construction Cycle phase, depending on the evolutionary level and on the project goal.
(3) DM/DW project: at this level, many pilot projects may have already been developed and many uncertainties have already been eliminated. The development process in this iteration type, which corresponds to DM/DW projects, uses the same methodology as the pilot project iteration type.
• An example of possible projects that an organization could develop using the methodology presented in the X-META paper. Each spiral represents a project, and each project uses a specific iteration type.
Metadata modeling
• This module has the goal of defining and controlling the activities related to metadata, supplying corporate guidelines for its creation, use, and maintenance.
• The module lists activities whose goal is to produce a metadata model that will be used during the construction of a metadata repository and during the metadata management phase:
• Define the metadata management team and its responsibilities;
• Define and classify the metadata types that will be stored;
• Define the user types of the repository and their corresponding access levels;
• Determine the metadata sources in the organization;
• Define and construct the metadata model and metadata flow.
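As a rough illustration of what such a metadata model might record, here is a small Python sketch. The class names, the access levels, and the sample entries are all assumptions made for this example; X-META does not prescribe them.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AccessLevel(IntEnum):
    """Illustrative access levels; higher values include lower ones."""
    READ = 1
    WRITE = 2
    ADMIN = 3


@dataclass
class MetadataEntry:
    name: str
    metadata_type: str  # classification, e.g. "technical" or "business"
    source: str         # where in the organization the metadata comes from
    min_level: AccessLevel = AccessLevel.READ


@dataclass
class MetadataRepository:
    user_levels: dict[str, AccessLevel]  # user type -> access level
    entries: list[MetadataEntry] = field(default_factory=list)

    def visible_to(self, user_type: str) -> list[str]:
        """Names of the entries a given user type is allowed to see."""
        level = self.user_levels[user_type]
        return [e.name for e in self.entries if level >= e.min_level]


repo = MetadataRepository(
    user_levels={"analyst": AccessLevel.READ, "dw_admin": AccessLevel.ADMIN},
    entries=[
        MetadataEntry("sales.amount definition", "business", "data dictionary"),
        MetadataEntry("ETL job schedule", "technical", "ETL tool",
                      min_level=AccessLevel.ADMIN),
    ],
)
print(repo.visible_to("analyst"))
print(repo.visible_to("dw_admin"))
```

Each field corresponds to one activity in the list: metadata types, user types with access levels, and metadata sources.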
Metadata infrastructure
The metadata repository is an infrastructure component of the DW environment that is useful to all other environment components, working as a tool to support integration. The goal of this module is to define the general metadata infrastructure in the organization.
• The activities belonging to this module are:
• Construction of an architecture for the metadata (centralized, decentralized, or distributed);
• Evaluation of metadata tools in order to select the best one to achieve the project goals;
• Physical implementation of the metadata repository;
• Definition of security procedures.
Paper 2: A Data Warehouse Conceptual Data Model
Abstract:
• The goal of the work is to extend the standard ER conceptual data model, as defined in database textbooks, with constructs that allow the modelling of multi-dimensional aggregated entities together with their interrelationships with the other parts of the conceptual schema.
• This work is also based on similar preliminary work done by one of the authors on the use of Description Logics as a means to give precise semantics to a data warehouse conceptual data model and to study its computational properties.
A Data Warehouse Conceptual Schema
Explanation
• The entity Calls represents a basic cube whose dimensions are Date, Destination, and Source, which are restricted to the basic levels Day, Point, and again Point, respectively.
• A first extension of the language can be seen in the simple aggregated entities (non-dimensional aggregations) Weekday and Customertype, which represent dimensional levels built from the basic dimensional entities Day and Point, respectively.
• A simple aggregation aggregates the collections of objects that are in the extension of the aggregated entities. So, in our example, since the entities Mon, ..., Sun form a partition of the entity Day, the Weekday entity denotes exactly seven objects: one for all the Mondays, one for all the Tuesdays, etc.
• On the other hand, the aggregated entity Customertype denotes exactly two objects, consumer and business. In this way, by interleaving partitioning and simple aggregations, we are able to construct level hierarchies starting from some basic dimensional level.
• A second extension of the language is the multi-dimensional aggregated entity, illustrated by the entity Calls-by-Weekday-and-Customertype. This entity denotes all the cells of a cube whose coordinates are the weekdays of the dates of the calls and the customer types of the originators of the calls.
• A multi-dimensional aggregated entity is itself an entity in the ER diagram; it can have attributes and can take part in further relationships or constraints.
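The Calls example can be mimicked with a few lines of Python. This is only an informal analogy to the ER extension, not the paper's formal semantics. The sample calls, the point identifiers, and the choice of "number of calls" as the cell value are assumptions for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical basic cube "Calls": one row per call, with the Date dimension
# at level Day and the Source dimension at level Point.
calls = [
    {"day": date(2024, 1, 1), "source": "p1"},  # a Monday
    {"day": date(2024, 1, 8), "source": "p2"},  # a Monday
    {"day": date(2024, 1, 2), "source": "p1"},  # a Tuesday
]

# Simple aggregation Customertype: each Point belongs to one customer type.
customer_type = {"p1": "consumer", "p2": "business"}

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_of(day: date) -> str:
    """Simple aggregation Weekday: maps every Day to one of seven objects."""
    return WEEKDAYS[day.weekday()]


# Multi-dimensional aggregated entity: one cell per (weekday, customer type).
calls_by_weekday_and_customertype = Counter(
    (weekday_of(c["day"]), customer_type[c["source"]]) for c in calls
)
print(calls_by_weekday_and_customertype)
```

Both Mondays fall into the same Weekday object, which is what "one object for all the Mondays" means in practice.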
Paper 3: A Comprehensive Method for DW Design
Abstract:
Various methods and approaches have been presented for designing different parts of DWs, but no general and standard method exists to date for dealing with the whole design of a DW. In this paper we fill this gap by presenting a method based on UML that allows the user to tackle all DW design phases and steps, from the operational data sources to the final implementation, including the definition of the ETL processes.
Related Works:
Different case studies of data marts (DM) are presented. The multidimensional (MD) modeling is based on the use of the star schema and its different variations. More recently, a method for DW design has been proposed which is based on an MD model called IDEA and which proposes a set of steps to address the conceptual, logical, and physical design of a DW.
Overview of the method
• We have adopted the OO paradigm because it is semantically richer than others and it offers numerous advantages.
• The design of a DW is a joint effort of DW developers (technical users) and final users (users who are only interested in the business content).
• Therefore, a powerful method with the corresponding models is needed, and we believe the OO paradigm is the best approach for DW design.
Design for a data warehouse
The architecture of a DW is depicted as various layers (every layer is derived from the previous layer).
• Following this consideration, we consider that the development of a DW can be structured into an integrated model with four different schemas:
o Operational Data Schema (ODS): defines the structure of the operational and external data sources.
o DW Conceptual Schema (DWCS): defines the conceptual schema of the DW.
o DW Storage Schema (DWSS): defines the physical storage of the DW depending on the target platform.
o Business Model (BM): defines the different ways or views of accessing the DW from the final user's point of view.
• Two schema mappings are also needed in order to obtain a global and integrated DW design approach that covers the necessary schemas:
o ETL Process: defines the mapping between the ODS and the DWCS.
o Exportation Process: defines the mapping between the DWCS and the DWSS.
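To make the idea of a schema mapping concrete, the sketch below expresses an ETL mapping as a table of "target attribute, transformation of a source row" pairs. The paper models these mappings in UML; this Python rendering, the order rows, and the attribute names are assumptions for illustration only.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical operational rows (ODS side); field names are illustrative.
ods_orders = [
    {"order_id": "A-1", "placed_at": "2024-03-05T10:15:00", "amount": "19.90"},
    {"order_id": "A-2", "placed_at": "2024-03-05T18:40:00", "amount": "5.10"},
]

# ETL mapping: target DWCS attribute -> function of one ODS row.
etl_mapping = {
    "order_key": lambda row: row["order_id"],
    "order_day": lambda row: datetime.fromisoformat(row["placed_at"]).date(),
    "amount_cents": lambda row: int(Decimal(row["amount"]) * 100),
}


def run_etl(rows, mapping):
    """Apply the mapping to every source row, yielding conceptual-level facts."""
    return [{target: fn(row) for target, fn in mapping.items()} for row in rows]


facts = run_etl(ods_orders, etl_mapping)
print(facts[0])
```

An exportation mapping from the DWCS to the DWSS could be written in the same style, with storage-level column names as targets.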
• This method covers each of the schemas and mappings in an integrated manner: we use a modeling notation based on UML.
• Data warehouse conceptual schema (DWCS): the most important feature of the MD paradigm is dividing data into facts and dimensions. To provide data at a suitable level of granularity, hierarchies are defined on the dimensions.
[Figure: packages Star Schema 1 and Star Schema 2; Dimension 1, Dimension 2, and Fact 1; and the classes Dimension 2 Level 1, Dimension 2 Level 2, and Dimension 2 Level 3.]
• The UML profile includes the use of UML packages. In this way, when modeling complex and large DW systems, we are not restricted to flat UML class diagrams and, therefore, cluttered diagrams are avoided.
• Level 1: Model definition. A package represents a star schema of a conceptual MD model. A dependency between two packages indicates that the star schemas share at least one dimension.
• Level 2: Star schema definition. A package represents a fact or a dimension of a star schema. A dependency between two dimension packages indicates that the packages share at least one level of a dimension hierarchy.
• Level 3: Dimension/fact definition. A package from the second level is exploded into a set of classes that represent the hierarchy levels in a dimension package, or the whole star schema in the case of the fact package.
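The level-1 dependency rule (two star schema packages depend on each other when they share at least one dimension) can be checked mechanically. The sketch below assumes a hypothetical model with three star schemas; the schema and dimension names are invented for the example.

```python
from itertools import combinations

# Hypothetical level-1 model: each star schema package lists its dimensions.
star_schemas = {
    "Sales": {"Time", "Product", "Store"},
    "Inventory": {"Time", "Product", "Warehouse"},
    "Staffing": {"Employee"},
}


def package_dependencies(schemas):
    """Return pairs of star schemas that share at least one dimension."""
    deps = {}
    for a, b in combinations(sorted(schemas), 2):
        shared = schemas[a] & schemas[b]
        if shared:
            deps[(a, b)] = shared
    return deps


deps = package_dependencies(star_schemas)
print(deps)
```

The same function would apply at level 2 if the sets held hierarchy levels of dimension packages instead of dimensions.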
How to apply the Method
We propose a set of steps to guide the design of a DW following our approach.
The UML diagram is divided into two swimlanes depending on who leads the activities: DW final users (final users guide the work of DW designers and administrators) and DW designers & administrators (they do not need the help of final users because all the needed information has been previously gathered).
[Figure legend: activities where the schemas are created; activities where the mappings between schemas are created.]
1) Analysis:
• Different substeps can be carried out during requirements gathering; specifically, the designer has to:
* Determine the desired data format, level, and elements that users want.
* Classify the different summaries; some are common, whereas others are less common.
* Define access control and security rules.
* Help the final users to understand what they do not know they need.
2) Design:
• DWCS: two "extreme" strategies can be adopted in this activity: top-down (definition of the DWCS based on the final user requirements) and bottom-up (definition of the DWCS based on the available data sources). We suggest adopting a combined solution: the DW is designed from the final users' requirements, but bearing in mind the available data sources.
• ETL process: it is defined as a mapping between the data sources and the DWCS. This activity and the previous one form a cycle, because during the definition of the ETL process some errors in the DWCS can be detected and, therefore, the DWCS may be modified.
• Define DMs (BM): the different models in the BM are defined from the final users' initial requirements and the DWCS. The BM can be implemented as real or virtual DMs.
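One way to picture the combined strategy is as a comparison between what users ask for (top-down) and what the sources can supply (bottom-up). The following sketch is an assumption about how such a check might look, with invented attribute names; the paper does not define this procedure.

```python
# Hypothetical inputs: attributes the final users ask for (top-down) and
# attributes actually present in the operational sources (bottom-up).
required = {"customer_region", "order_day", "amount", "sales_channel"}
available = {"order_day", "amount", "customer_region", "warehouse_id"}


def combined_design(required, available):
    """Split the requirements into what the sources can and cannot supply."""
    supported = sorted(required & available)
    unsupported = sorted(required - available)
    return supported, unsupported


supported, unsupported = combined_design(required, available)
print("model now:", supported)
print("revisit with users or find a source:", unsupported)
```

Unsupported attributes are the kind of discrepancy that sends the designer back from the ETL definition to the DWCS.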
3) Implementation:
• Define storage (DWSS): the target platform is selected (relational, OO, etc.) and the corresponding logical schema (DWSS) is defined. Query performance can be improved by simplifying the data schema (so that it only contains the essential data) or by defining summaries (aggregates) based on the final users' requirements.
• Define exportation process: the mappings between the DWCS and the DWSS are defined manually or automatically.
• Implement reports & queries: these are requested by the final users and implemented in the query tool used.
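The effect of a summary (aggregate) on a relational target platform can be sketched with Python's built-in `sqlite3` module. SQLite, the table names, and the sample rows are assumptions chosen to keep the example self-contained; the paper does not name a specific platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (day TEXT, product TEXT, amount INTEGER);
    INSERT INTO fact_sales VALUES
        ('2024-03-01', 'pen', 3),
        ('2024-03-01', 'ink', 7),
        ('2024-03-02', 'pen', 5);
    -- Summary (aggregate) table: totals per day, precomputed once.
    CREATE TABLE agg_sales_by_day AS
        SELECT day, SUM(amount) AS total FROM fact_sales GROUP BY day;
""")

# Reports read the small summary table instead of scanning the fact table.
rows = conn.execute(
    "SELECT day, total FROM agg_sales_by_day ORDER BY day"
).fetchall()
print(rows)
conn.close()
```

The trade-off is that the summary table must be refreshed whenever new facts are loaded.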
4) Test:
• Validate DW: the solution obtained (the DW built) is checked against the existing problem (the final user requirements). If any discrepancy exists, corrective actions can be taken and the process can return to one of the previous activities.
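A validation step of this kind can be approximated by reconciling DW results with the source data. The sketch below assumes a single hypothetical requirement ("total amount per day must match the source"), and the values are invented, with one total deliberately wrong.

```python
# Hypothetical requirement: the DW must report the total amount per day and
# agree with the operational source. Values are illustrative.
source_rows = [("2024-03-01", 3), ("2024-03-01", 7), ("2024-03-02", 5)]
dw_totals = {"2024-03-01": 10, "2024-03-02": 4}  # one total is wrong on purpose


def validate(source_rows, dw_totals):
    """Return the days whose DW total disagrees with the source data."""
    expected = {}
    for day, amount in source_rows:
        expected[day] = expected.get(day, 0) + amount
    return {
        day: (expected.get(day), dw_totals.get(day))
        for day in sorted(expected.keys() | dw_totals.keys())
        if expected.get(day) != dw_totals.get(day)
    }


discrepancies = validate(source_rows, dw_totals)
print(discrepancies)
```

A non-empty result is the signal to take corrective action and return to an earlier activity, such as the ETL definition.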
Comparison
• A Comprehensive Method for DW Design: this work presents a global data warehouse design method that is based on UML for designing the different DW schemas and the corresponding transformations in an integrated manner. Remarks: no practical application yet; based on UML modeling; a tutorial is available online.
• A Data Warehouse Conceptual Data Model: the goal of the work is to extend the standard ER conceptual data model, as defined in database textbooks, with constructs that allow the modelling of multi-dimensional aggregated entities together with their interrelationships with the other parts of the conceptual schema. Remarks: uses multiple extended EER or UML diagrams together.
• X-META: it addresses the problem of developing a first DW project in corporations that do not have staff with practical experience in such development. Furthermore, it integrates metadata creation and management into the DW development process. Remarks: the proposed methodology is being used to construct a DW for a department of the Fortaleza City Hall; new concept.
Discussion
Complicated notions
Brief explanation with many missing parts
Complete and clear work
Conclusion
• Data warehousing is a business analyst's dream: all the information about the organization's activities gathered in one place, open to a single set of analytical tools. But how do you make the dream a reality?
• You have to plan your data warehouse system. You must understand what questions users will ask it, because the purpose of a data warehouse system is to provide decision-makers with the accurate, timely information they need to make the right choices.
Data warehouse design

More Related Content

What's hot

Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse FundamentalsRashmi Bhat
 
Online analytical processing
Online analytical processingOnline analytical processing
Online analytical processingSamraiz Tejani
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modelingvivekjv
 
Big Data and Classification
Big Data and ClassificationBig Data and Classification
Big Data and Classification303Computing
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecturepcherukumalla
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional ModelingSunita Sahu
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentalsrjain51
 
Introduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data WarehousingIntroduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data WarehousingKamal Acharya
 
Batch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing DifferenceBatch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing Differencejeetendra mandal
 
Historical Evolution of RDBMS
Historical Evolution of RDBMSHistorical Evolution of RDBMS
Historical Evolution of RDBMSShailesh Pachori
 
9. Object Relational Databases in DBMS
9. Object Relational Databases in DBMS9. Object Relational Databases in DBMS
9. Object Relational Databases in DBMSkoolkampus
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesIvo Andreev
 
Lecture6 introduction to data streams
Lecture6 introduction to data streamsLecture6 introduction to data streams
Lecture6 introduction to data streamshktripathy
 

What's hot (20)

OLAP v/s OLTP
OLAP v/s OLTPOLAP v/s OLTP
OLAP v/s OLTP
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
 
Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse Fundamentals
 
Online analytical processing
Online analytical processingOnline analytical processing
Online analytical processing
 
Metadata ppt
Metadata pptMetadata ppt
Metadata ppt
 
Big data unit i
Big data unit iBig data unit i
Big data unit i
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
 
Data mining notes
Data mining notesData mining notes
Data mining notes
 
2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
 
Big Data and Classification
Big Data and ClassificationBig Data and Classification
Big Data and Classification
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
 
Introduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data WarehousingIntroduction to Data Mining and Data Warehousing
Introduction to Data Mining and Data Warehousing
 
Batch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing DifferenceBatch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing Difference
 
Historical Evolution of RDBMS
Historical Evolution of RDBMSHistorical Evolution of RDBMS
Historical Evolution of RDBMS
 
File organization
File organizationFile organization
File organization
 
9. Object Relational Databases in DBMS
9. Object Relational Databases in DBMS9. Object Relational Databases in DBMS
9. Object Relational Databases in DBMS
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
Lecture6 introduction to data streams
Lecture6 introduction to data streamsLecture6 introduction to data streams
Lecture6 introduction to data streams
 

Viewers also liked

Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
Enterprise system architecture togaf
Enterprise system architecture togafEnterprise system architecture togaf
Enterprise system architecture togafines beltaief
 
Mobile Cloud Comuting
Mobile Cloud Comuting Mobile Cloud Comuting
Mobile Cloud Comuting ines beltaief
 
Face recognition
Face recognition Face recognition
Face recognition Chandan A V
 
Face Recognition Human Computer Interaction
Face Recognition Human Computer InteractionFace Recognition Human Computer Interaction
Face Recognition Human Computer Interactionines beltaief
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouseStephen Alex
 
A First Look at San Francisco’s New ETL Job Platform
A First Look at San Francisco’s New ETL Job PlatformA First Look at San Francisco’s New ETL Job Platform
A First Look at San Francisco’s New ETL Job PlatformSafe Software
 
Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Andreas Buckenhofer
 
Supporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationSupporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationDenodo
 
Introduction to ETL process
Introduction to ETL process Introduction to ETL process
Introduction to ETL process Omid Vahdaty
 
Automate data warehouse etl testing and migration testing the agile way
Automate data warehouse etl testing and migration testing the agile wayAutomate data warehouse etl testing and migration testing the agile way
Automate data warehouse etl testing and migration testing the agile wayTorana, Inc.
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingDunn Solutions Group
 
Operating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsOperating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsDataWorks Summit/Hadoop Summit
 
Data Warehouse Design & Dimensional Modeling
Data Warehouse Design & Dimensional ModelingData Warehouse Design & Dimensional Modeling
Data Warehouse Design & Dimensional ModelingCode Mastery
 
Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Andreas Buckenhofer
 
Designing and implementing_an_etl_framework
Designing and implementing_an_etl_frameworkDesigning and implementing_an_etl_framework
Designing and implementing_an_etl_frameworkBharat Vadlamudi
 
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...Denodo
 
Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Andreas Buckenhofer
 
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieCDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieAndreas Buckenhofer
 
Data Warehouse Programme Notes
Data Warehouse Programme NotesData Warehouse Programme Notes
Data Warehouse Programme NotesAlan McSweeney
 

Viewers also liked (20)

Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Enterprise system architecture togaf
Enterprise system architecture togafEnterprise system architecture togaf
Enterprise system architecture togaf
 
Mobile Cloud Comuting
Mobile Cloud Comuting Mobile Cloud Comuting
Mobile Cloud Comuting
 
Face recognition
Face recognition Face recognition
Face recognition
 
Face Recognition Human Computer Interaction
Face Recognition Human Computer InteractionFace Recognition Human Computer Interaction
Face Recognition Human Computer Interaction
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
A First Look at San Francisco’s New ETL Job Platform
A First Look at San Francisco’s New ETL Job PlatformA First Look at San Francisco’s New ETL Job Platform
A First Look at San Francisco’s New ETL Job Platform
 
Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
 
Supporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationSupporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data Virtualization
 
Introduction to ETL process
Introduction to ETL process Introduction to ETL process
Introduction to ETL process
 
Automate data warehouse etl testing and migration testing the agile way
Automate data warehouse etl testing and migration testing the agile wayAutomate data warehouse etl testing and migration testing the agile way
Automate data warehouse etl testing and migration testing the agile way
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional Modeling
 
Operating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsOperating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and Improvements
 
Data Warehouse Design & Dimensional Modeling
Data Warehouse Design & Dimensional ModelingData Warehouse Design & Dimensional Modeling
Data Warehouse Design & Dimensional Modeling
 
Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
 
Designing and implementing_an_etl_framework
Designing and implementing_an_etl_frameworkDesigning and implementing_an_etl_framework
Designing and implementing_an_etl_framework
 
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
 
Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
Part 3 - Data Warehousing Lecture at BW Cooperative State University (DHBW)
 
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieCDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
 
Data Warehouse Programme Notes
Data Warehouse Programme NotesData Warehouse Programme Notes
Data Warehouse Programme Notes
 

Similar to Data warehouse design

1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...JOHNLEAK1
 
Object Oriented Approach for Software Development
Object Oriented Approach for Software DevelopmentObject Oriented Approach for Software Development
Object Oriented Approach for Software DevelopmentRishabh Soni
 
Software_Engineering_Presentation (1).pptx
Software_Engineering_Presentation (1).pptxSoftware_Engineering_Presentation (1).pptx
Software_Engineering_Presentation (1).pptxArifaMehreen1
 
Exploring Data Modeling Techniques in Modern Data Warehouses
Exploring Data Modeling Techniques in Modern Data WarehousesExploring Data Modeling Techniques in Modern Data Warehouses
Exploring Data Modeling Techniques in Modern Data Warehousespriyanka rajput
 
DATA WAREHOUSE AND BIG DATA INTEGRATION
DATA WAREHOUSE AND BIG DATA INTEGRATIONDATA WAREHOUSE AND BIG DATA INTEGRATION
DATA WAREHOUSE AND BIG DATA INTEGRATIONijcsit
 
DESIGN AND DEVELOPMENT OF BUSINESS RULES MANAGEMENT SYSTEM (BRMS) USING ATLAN...
DESIGN AND DEVELOPMENT OF BUSINESS RULES MANAGEMENT SYSTEM (BRMS) USING ATLAN...DESIGN AND DEVELOPMENT OF BUSINESS RULES MANAGEMENT SYSTEM (BRMS) USING ATLAN...

Data warehouse design

  • 3. Context
      - In this work, I present the data warehouse concept, different modeling methods, and the advantages of each proposed method.
      - To do so, I worked on three papers that cover X-META (a methodology for data warehouse design with metadata management) and conceptual data models for DW design.
  • 4. Introduction
      - In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis.
      - Data warehousing is a powerful tool for supporting decision-making processes in modern corporations. However, developing a data warehouse is a complex and costly activity. It requires strategies that are specific to the characteristics and needs of the organization where it will be introduced.
  • 5. Problematic
      - How do we design a data warehouse?
      - Which methods are studied?
      - What are the steps to design a DW?
  • 6. Papers
      - A Data Warehouse Conceptual Data Model: Enrico Franconi and Anand Kamble (Faculty of Computer Science, Free University of Bozen-Bolzano, Italy)
      - X-META: A Methodology for Data Warehouse Design with Metadata Management: Liane Carneiro and Angelo Brayner (University of Fortaleza, UNIFOR, Fortaleza, Ceará, Brazil)
      - A Comprehensive Method for Data Warehouse Design: Sergio Luján-Mora and Juan Trujillo (Department of Software and Computing Systems, University of Alicante, Spain)
  • 7. Paper 1: X-META
      - Abstract: The contributions of this paper are twofold. First, it proposes a methodology that addresses the development of DW projects. Second, it presents a strategy for handling metadata management as an integrated part of the DW development process.
      - Related works: Existing methodologies normally target generic environments. They therefore do not address DW development in organizations that need a methodology adapted to their characteristics and expectations.
  • 8. Metadata Creation & Management
      - Metadata is "data that provides information about other data". Two types of metadata exist: structural metadata and descriptive metadata.
      - Structural metadata is data about the containers of data.
      - Descriptive metadata uses individual instances of application data, that is, the data content.
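The distinction between the two metadata types can be illustrated with a small Python sketch. It is only an illustration under assumptions: the table name `calls`, its columns, and the two helper functions are hypothetical and do not come from the X-META paper. Structural metadata is read off the container (column names and types), while descriptive metadata is computed from the data content itself.

```python
# Hypothetical sample rows; the table and column names are illustrative only.
rows = [
    {"call_id": 1, "customer_type": "consumer"},
    {"call_id": 2, "customer_type": "business"},
    {"call_id": 3, "customer_type": "consumer"},
]


def structural_metadata(table_name, rows):
    """Describe the container: the table name and the type of each column."""
    first_row = rows[0]
    columns = {name: type(value).__name__ for name, value in first_row.items()}
    return {"table": table_name, "columns": columns}


def descriptive_metadata(rows, column):
    """Describe the content: facts derived from the stored instances."""
    values = [row[column] for row in rows]
    return {
        "column": column,
        "row_count": len(values),
        "distinct_values": sorted(set(values)),
    }


structure = structural_metadata("calls", rows)
content = descriptive_metadata(rows, "customer_type")
```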
  • 9. Metadata Creation & Management
      - Metadata should exist throughout the development process and for the whole useful life of the DW. Metadata therefore plays a key role in a DW project.
      - Managing it is a big challenge for many DW projects, because the tools and products for creating and managing metadata in a data warehousing environment are highly heterogeneous.
  • 10. X-META: A METHODOLOGY FOR DW DESIGN
      - The key idea of the proposed methodology is to start with a DW pilot project, in order to give the internal team experience in constructing a DW and to demonstrate the viability and importance of the DW to the organization.
      - The overall methodology lifecycle is divided into 5 major phases, as depicted in figure 1.
  • 11. X-META: A METHODOLOGY FOR DW DESIGN
    The X-Meta methodology defines three distinct iteration types, each with its own specific purpose:
      (1) First prototype: its main goal is to allow the insertion of DW technology into the organization. This iteration, which is executed only once in the DW development process, uses only the Introduction phase in the development of the first pilot project in the organization.
      (2) Pilot project: it allows the incremental and evolutionary development of pilot projects, each with its own purpose, for instance testing products, acquiring experience, or developing metadata repositories. Each pilot project represents an iteration, which begins in the Planning phase and uses one or more inputs from the DM/DW Construction Cycle phase, depending on the evolutionary level and on the project goal.
      (3) DM/DW project: at this level, many pilot projects may already have been developed and many uncertainties have already been eliminated. The development process in this iteration type, which corresponds to DM/DW projects, uses the same methodology as the pilot project iteration type.
  • 12. X-META: A METHODOLOGY FOR DW DESIGN
      - An example of possible projects that an organization could develop using the methodology presented in the paper. Each spiral represents a project, and each project uses a specific iteration type.
  • 13. Metadata modeling
      - Its goal is to define and control the activities related to metadata, supplying corporate guidelines for its creation, use, and maintenance.
      - This module lists activities whose goal is to produce a metadata model that will be used during the construction of a metadata repository and during the metadata management phase:
          - Define the metadata management team and its responsibilities.
          - Define and classify the metadata types that will be stored.
          - Define the user types of the repository and their corresponding access levels.
          - Determine the metadata sources in the organization.
          - Define and construct the metadata model and metadata flow.
  • 14. Metadata infrastructure
      - The metadata repository is an infrastructure component of the DW environment that is useful to all other environment components, working as a tool to help with integration. The goal of this module is to define the general metadata infrastructure in the organization.
      - The activities belonging to this module are:
          - Construction of an architecture for the metadata (centralized, decentralized, or distributed).
          - Evaluation of metadata tools in order to select the best one to achieve the project goals.
          - Physical implementation of the metadata repository.
          - Definition of security procedures.
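As a rough illustration of the repository, user types, and access levels mentioned in the metadata modules, here is a minimal in-memory Python sketch. It assumes a centralized architecture, and every name in it (`AccessLevel`, `MetadataRepository`, the `designer` and `analyst` user types) is hypothetical; the X-META paper does not prescribe this design or any particular tool.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    READ = 1
    WRITE = 2


class MetadataRepository:
    """A centralized, in-memory stand-in for a metadata repository."""

    def __init__(self):
        self._entries = {}      # key -> {"type": ..., "value": ...}
        self._user_levels = {}  # user type -> AccessLevel

    def register_user_type(self, user_type, level):
        self._user_levels[user_type] = level

    def _require(self, user_type, needed):
        # Security procedure: unknown or under-privileged user types are rejected.
        level = self._user_levels.get(user_type)
        if level is None or level < needed:
            raise PermissionError(f"{user_type!r} lacks {needed.name} access")

    def put(self, user_type, key, metadata_type, value):
        self._require(user_type, AccessLevel.WRITE)
        self._entries[key] = {"type": metadata_type, "value": value}

    def get(self, user_type, key):
        self._require(user_type, AccessLevel.READ)
        return self._entries[key]
```

A designer registered with `AccessLevel.WRITE` could store an entry such as `repo.put("designer", "sales_fact.amount", "structural", {"type": "REAL"})`, while an analyst registered with `AccessLevel.READ` could only call `get`.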
  • 15. Paper 2: A Data Warehouse Conceptual Data Model
    Abstract:
      - The goal of the work is to extend the standard ER conceptual data model, as defined in database textbooks, with constructs that allow the modelling of multidimensional aggregated entities together with their interrelationships with the other parts of the conceptual schema.
      - The work also builds on similar preliminary work by one of the authors on the use of Description Logics as a means to give precise semantics to a data warehouse conceptual data model and to study its computational properties.
  • 16. A Data Warehouse Conceptual Schema
  • 17. Explanation
      - The entity Calls represents a basic cube whose dimensions are Date, Destination, and Source, which are restricted to the basic levels Day, Point, and again Point, respectively.
      - A first extension of the language can be seen in the simple aggregated entities (non-dimensional aggregations) Weekday and Customertype, which represent dimensional levels built from the basic dimensional entities Day and Point, respectively.
      - A simple aggregation aggregates the collections of objects that are in the extension of the aggregated entities. In our example, since the entities Mon, ..., Sun form a partition of the entity Day, the Weekday entity denotes exactly seven objects: one for all the Mondays, one for all the Tuesdays, and so on.
  • 18. Explanation
      - Similarly, the aggregated entity Customertype denotes exactly two objects, consumer and business. In this way, by interleaving partitioning and simple aggregations, we can construct level hierarchies starting from some basic dimensional level.
      - A second extension of the language is the multidimensional aggregated entity, illustrated by the entity Calls-by-Weekday-and-Customertype. This entity denotes all the cells of a cube whose coordinates are the weekdays of the dates of the calls and the customer types of the originators of the calls.
      - A multidimensional aggregated entity is itself an entity in the ER diagram; it can have attributes and can take part in further relationships or constraints.
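The Calls-by-Weekday-and-Customertype entity can be mimicked with a short Python sketch. This only illustrates the idea of cube cells indexed by (weekday, customer type); it is not the paper's formal ER or Description Logics semantics, and the sample calls and field names are assumptions.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical basic cube: each call has a Day and the customer type of its source Point.
calls = [
    {"date": date(2024, 1, 1), "source_customertype": "consumer"},  # a Monday
    {"date": date(2024, 1, 8), "source_customertype": "consumer"},  # another Monday
    {"date": date(2024, 1, 2), "source_customertype": "business"},  # a Tuesday
]


def calls_by_weekday_and_customertype(calls):
    """Count the calls falling into each (weekday, customer type) cell."""
    cube = Counter()
    for call in calls:
        # Simple aggregation: every Day is mapped to one of the seven Weekday objects.
        weekday = WEEKDAYS[call["date"].weekday()]
        cube[(weekday, call["source_customertype"])] += 1
    return cube


cube = calls_by_weekday_and_customertype(calls)
```

Both Mondays collapse into the single coordinate `Mon`, which mirrors how the Weekday entity denotes one object for all the Mondays.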
  • 19. Paper 3: A Comprehensive Method for DW Design
      - Abstract: Various methods and approaches have been presented for designing different parts of a DW, but no general and standard method exists to date for dealing with the whole design of a DW. The paper fills this gap by presenting a method based on UML that allows the user to tackle all DW design phases and steps, from the operational data sources to the final implementation, including the definition of ETL processes.
      - Related works: Different case studies of data marts (DM) are presented. MD modeling is based on the use of the star schema and its different variations. More recently, a method for DW design has been proposed that is based on an MD model called IDEA and that proposes a set of steps to address the conceptual, logical, and physical design of a DW.
  • 20. Overview of the method
      - We have adopted the OO paradigm because it is semantically richer than others and offers numerous advantages.
      - The design of a DW is a joint effort of DW developers (technical users) and final users (users who are only interested in the business content).
      - Therefore, a powerful method with the corresponding models is needed, and we believe the OO paradigm is the best approach for DW design.
  • 21. Overview of the method: design for a data warehouse
  • 22. Overview of the method: design for a data warehouse
      - The architecture of a DW is depicted as various layers, where every layer is derived from the previous one.
      - Following this consideration, the development of a DW can be structured into an integrated model with four different schemas:
          - Operational Data Schema (ODS): defines the structure of the operational and external data sources.
          - DW Conceptual Schema (DWCS): defines the conceptual schema of the DW.
          - DW Storage Schema (DWSS): defines the physical storage of the DW, depending on the target platform.
          - Business Model (BM): defines the different ways or views of accessing the DW from the final user's point of view.
  • 23. Overview of the method: design for a data warehouse
      - Two schema mappings are also needed in order to obtain a global and integrated DW design approach that covers the necessary schemas:
          - ETL process: defines the mapping between the ODS and the DWCS.
          - Exportation process: defines the mapping between the DWCS and the DWSS.
      - The method accomplishes each of the schemas and mappings in an integrated manner, using a modeling notation based on UML.
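The chain of schemas and mappings (ODS, ETL process, DWCS, exportation process, DWSS) can be sketched in Python. This is a toy illustration under assumptions, not the UML-based method of the paper: the source field names, the `sales_fact` table, and the choice of an in-memory SQLite database as the relational target platform are all hypothetical.

```python
import sqlite3
from datetime import date

# ODS: raw operational rows with source-specific names and string values.
ods_rows = [
    {"cust": "C1", "amt": "10.50", "day": "2024-01-01"},
    {"cust": "C2", "amt": "4.50", "day": "2024-01-02"},
]


def etl_process(rows):
    """ETL mapping (ODS -> DWCS): rename fields and convert values to typed facts."""
    return [
        {
            "customer": row["cust"].strip(),
            "amount": float(row["amt"]),
            "sale_date": date.fromisoformat(row["day"]),
        }
        for row in rows
    ]


def exportation_process(facts, connection):
    """Exportation mapping (DWCS -> DWSS): store the facts in a relational table."""
    connection.execute(
        "CREATE TABLE sales_fact (customer TEXT, amount REAL, sale_date TEXT)"
    )
    connection.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?)",
        [(f["customer"], f["amount"], f["sale_date"].isoformat()) for f in facts],
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
facts = etl_process(ods_rows)
exportation_process(facts, connection)
total = connection.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
```

Keeping the two mappings as separate functions reflects the layering described on the slides: the conceptual facts do not depend on the storage platform, so only `exportation_process` would change for a different target.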
  • 24. Overview of the method: design for a data warehouse
      - Data warehouse conceptual schema (DWCS): the most important feature of the MD paradigm is dividing data into facts and dimensions. To provide data at a suitable level of granularity, hierarchies are defined on the dimensions.
      - The UML profile includes the use of UML packages. In this way, when modeling large and complex DW systems, we are not restricted to flat UML class diagrams, and cluttered diagrams are avoided.
      - [Figure: package diagram showing Star Schema 1 and Star Schema 2; Dimension 1, Dimension 2, and Fact 1; Dimension 2 Level 1, Level 2, and Level 3]
  • 25. Overview of the method: design for a data warehouse
      - Level 1: Model definition. A package represents a star schema of a conceptual MD model. A dependency between two packages indicates that the star schemas share at least one dimension.
      - Level 2: Star schema definition. A package represents a fact or a dimension of a star schema. A dependency between two dimension packages indicates that the packages share at least one level of a dimension hierarchy.
      - Level 3: Dimension/fact definition. A package from the second level is exploded into a set of classes that represent the hierarchy levels in a dimension package, or the whole star schema in the case of the fact package.
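The Level 1 rule, where two star schema packages depend on each other when they share at least one dimension, can be checked mechanically. The Python sketch below assumes hypothetical star schemas (`Sales`, `Inventory`, `Staffing`) and represents each package simply as a set of dimension names; it is not part of the UML profile itself.

```python
from itertools import combinations

# Hypothetical Level 1 packages: star schema name -> names of its dimensions.
star_schemas = {
    "Sales": {"Time", "Product", "Store"},
    "Inventory": {"Time", "Product", "Warehouse"},
    "Staffing": {"Employee"},
}


def package_dependencies(star_schemas):
    """Return the shared dimensions for every pair of star schemas that has any."""
    dependencies = {}
    for first, second in combinations(sorted(star_schemas), 2):
        shared = star_schemas[first] & star_schemas[second]
        if shared:
            dependencies[(first, second)] = shared
    return dependencies


dependencies = package_dependencies(star_schemas)
```

The same function could be reused at Level 2 by passing dimension packages mapped to their hierarchy levels, since that rule has the same shape (a dependency exists when at least one level is shared).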
  • 26. How to apply the method
      - We propose a set of steps to guide the design of a DW following our approach. The UML diagram is divided into two swimlanes depending on who leads the activities: DW final users (final users guide the work of DW designers and administrators) and DW designers and administrators (they do not need the help of final users, because all the needed information has been gathered previously).
      - [Figure legend: activities where the schemas are created; activities where the mappings between schemas are created]
  • 27. How to apply the method: 1) Analysis
      - Different substeps can be carried out during requirements gathering; specifically, the designer has to:
          - Determine the data format, level, and elements that users want.
          - Classify the different summaries.
          - Define access control and security rules.
          - Help the final users understand what they do not know they need whereas others are less common.
  • 28. How to apply the method: 2) Design
      - DWCS: two "extreme" strategies can be adopted in this activity: top-down (definition of the DWCS based on the final user requirements) and bottom-up (definition of the DWCS based on the available data sources). We suggest adopting a combined solution: the DW is designed from the final users' requirements, but bearing in mind the available data sources.
      - ETL process: it is defined as a mapping between the data sources and the DWCS.
      - This activity and the previous one form a cycle, because some errors in the DWCS can be detected during the definition of the ETL process, and the DWCS may therefore be modified.
      - Define DMs (BM): the different models in the BM are defined from the final users' initial requirements and the DWCS. The BM can be implemented as real or virtual DMs.
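One way to picture the combined top-down and bottom-up strategy is to compare what final users require with what the data sources can actually supply. The Python sketch below is a deliberately simple assumption: it treats requirements and sources as flat sets of attribute names, and the function name `plan_dwcs` and the sample attributes are hypothetical.

```python
def plan_dwcs(required_attributes, available_attributes):
    """Split attributes into those the DWCS can include now and those needing follow-up."""
    required = set(required_attributes)
    available = set(available_attributes)
    return {
        # Required by users and present in the sources: candidates for the DWCS.
        "covered": sorted(required & available),
        # Required but not available: revisit the requirements or find new sources.
        "missing": sorted(required - available),
        # Available but not requested: left out unless the users ask for them.
        "unused": sorted(available - required),
    }


plan = plan_dwcs(
    required_attributes=["sale_date", "product", "amount", "customer_segment"],
    available_attributes=["sale_date", "product", "amount", "cashier_id"],
)
```

A non-empty `missing` list is the kind of finding that would send the designer back around the DWCS and ETL cycle described above.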
  • 29. How to apply the method: 3) Implementation
      - Define storage (DWSS): the target platform is selected (relational, OO, ...) and the corresponding logical schema (DWSS) is defined. Query performance can be improved by simplifying the data schema (so that it only contains the essential data) or by defining summaries (aggregates) based on the final users' requirements.
      - Define exportation process: the mappings between the DWCS and the DWSS are defined manually or automatically.
      - Implement reports and queries: those requested by the final users are implemented in the query tool used.
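The idea of improving query performance through summaries (aggregates) can be sketched as a precomputed lookup table. This is a minimal Python illustration with hypothetical fact rows and column names; in a real DWSS the summary would typically be a table or materialized view on the chosen platform.

```python
from collections import defaultdict

# Hypothetical detailed fact rows stored in the DW.
fact_rows = [
    {"month": "2024-01", "product": "A", "amount": 10.0},
    {"month": "2024-01", "product": "A", "amount": 5.0},
    {"month": "2024-01", "product": "B", "amount": 2.5},
    {"month": "2024-02", "product": "A", "amount": 4.0},
]


def build_summary(rows, group_by, measure):
    """Precompute the total of `measure` for each combination of `group_by` values."""
    summary = defaultdict(float)
    for row in rows:
        key = tuple(row[column] for column in group_by)
        summary[key] += row[measure]
    return dict(summary)


# Built once, for example after each load; reports then read the summary
# instead of scanning every fact row.
monthly_product_sales = build_summary(fact_rows, ["month", "product"], "amount")
january_a_total = monthly_product_sales[("2024-01", "A")]
```

The trade-off is the usual one for aggregates: faster reads for the queries users actually ask, at the cost of extra storage and a refresh step whenever new facts are loaded.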
  • 30. How to apply the method: 4) Test
      - Validate DW: the solution obtained (the DW built) is checked against the existing problem (the final user requirements). If any discrepancy exists, corrective actions can be taken and the process can return to one of the previous activities.
  • 31. Comparison
      - No practical application yet.
      - Based on UML modeling.
      - A tutorial is available online. Uses multiple extended EER or UML diagrams together.
      - The proposed methodology is being used to construct a DW for a department of the Fortaleza City Hall.
      - New concept.
  • 32. Discussion
      - This work presents a global data warehouse design method that is based on UML for designing the different DW schemas and the corresponding transformations in an integrated manner.
      - The goal of the work is to extend the standard ER conceptual data model, as defined in database textbooks, with constructs that allow the modelling of multidimensional aggregated entities together with their interrelationships with the other parts of the conceptual schema.
      - X-Meta addresses the problem of developing a first DW project in corporations that do not have staff with practical experience in such development. Furthermore, it integrates metadata creation and management into the DW development process.
      - Complicated notions; brief explanation with many missing parts; complete and clear work.
  • 33. Conclusion
      - Data warehousing is a business analyst's dream: all the information about the organization's activities gathered in one place, open to a single set of analytical tools. But how do you make the dream a reality?
      - You have to plan your data warehouse system. You must understand what questions users will ask it, because the purpose of a data warehouse system is to give decision-makers the accurate, timely information they need to make the right choices.