welcome
Presented By, 
Neenu C. Paul(12120051) 
CS B, S7 
SOE, CUSAT 
Guided By, 
Dr. Sudheep Elayidom 
Division of Computer Science 
SOE, CUSAT
CONTENTS 
• What is a data warehouse? 
• What is data warehousing? 
• Database vs Data warehouse 
• OLTP & OLAP 
• Data warehouse architecture 
• Multidimensional data model 
• Data Mart 
• ETL 
• Advantages of data warehouse 
• Disadvantages of data warehouse 
• S/W Solutions of data warehouse 
• Conclusion 
• References
A producer wants to know… 
• Which are our lowest/highest margin customers? 
• Who are my customers and what products are they buying? 
• What is the most effective distribution channel? 
• What product promotions have the biggest impact on revenue? 
• What impact will new products/services have on revenue and margins? 
• Which customers are most likely to go to the competition?
What is a Data Warehouse? 
• A data warehouse is a system for storing, analyzing, and reporting on data. 
• Central database that includes information from several different 
sources. 
• Keeps current as well as historical data. 
• Used to produce reports to assist in decision-making and management.
“A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.” - W. H. Inmon 
[Figure: a data warehouse and its four characteristics: subject-oriented, integrated, time-variant, non-volatile]
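Two of the characteristics in the definition, time-variant and non-volatile, can be illustrated with a minimal sketch. The table name `customer_history`, the helper `load_snapshot`, and all sample rows are hypothetical and chosen only for illustration; the sketch assumes an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical history table: every row carries the date of its snapshot.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_history (customer TEXT, city TEXT, snapshot_date TEXT)"
)

def load_snapshot(conn, snapshot_date, rows):
    """Append a dated snapshot; earlier rows are never updated or deleted."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO customer_history VALUES (?, ?, ?)",
            [(customer, city, snapshot_date) for customer, city in rows],
        )

# Invented sample data: the same customer observed at two points in time.
load_snapshot(conn, "2014-09-30", [("alice", "Kochi")])
load_snapshot(conn, "2014-10-31", [("alice", "Chennai")])

history = conn.execute(
    "SELECT city, snapshot_date FROM customer_history "
    "WHERE customer = 'alice' ORDER BY snapshot_date"
).fetchall()
print(history)
```

Because rows are only appended, both the old and the new city remain queryable, which is what lets the warehouse answer questions about the past.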
What is Data Warehousing? 
A process of transforming data into information and making it available to users in a timely enough manner to make a difference. 
[Figure: Data -> Information]
Database vs Data Warehouse 
Database 
• Transaction oriented 
• Stores online transaction data 
• Designed using E-R modeling techniques 
• Captures data 
• Holds real-time (current) information 
Data Warehouse 
• Subject oriented 
• Stores historical data 
• Designed using data modeling techniques 
• Analyzes data 
• Holds the entire information base over all time
Data Processing Technologies 
• OLTP (On-Line Transaction Processing) 
- Its major task is to perform on-line transaction and query processing, covering most of the day-to-day operations of an organization. 
• OLAP (On-Line Analytical Processing) 
- Serves knowledge workers (users) in the role of data analysis and decision making. 
- Organizes and presents data in various formats to accommodate the diverse needs of different users.
OLTP vs OLAP 
                      OLTP                               OLAP 
users                 clerk, IT professional             knowledge worker 
function              day-to-day operations              decision support 
DB design             application-oriented               subject-oriented 
data                  current, up-to-date, detailed,     historical, summarized, multidimensional, 
                      flat relational, isolated          integrated, consolidated 
usage                 repetitive                         ad hoc 
access                read/write, dozens of records      millions of records read 
unit of work          short, simple transaction          complex query 
# records accessed    tens                               millions 
# users               thousands                          hundreds 
DB size               100 MB-GB                          100 GB-TB
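The "unit of work" row of the comparison can be made concrete with a small sketch. It assumes an in-memory SQLite database; the `sales` table, the two helper functions, and the sample rows are hypothetical. In practice OLTP and OLAP workloads usually run on separate systems, so putting both in one database here is only a simplification.

```python
import sqlite3

# Hypothetical sales table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)

def record_sale(conn, region, year, amount):
    """OLTP-style unit of work: a short transaction that writes one record."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
            (region, year, amount),
        )

def revenue_by_region_and_year(conn):
    """OLAP-style query: read-only aggregation over the whole history."""
    rows = conn.execute(
        "SELECT region, year, SUM(amount) FROM sales "
        "GROUP BY region, year ORDER BY region, year"
    )
    return rows.fetchall()

# Invented sample transactions.
for sale in [("North", 2013, 100.0), ("North", 2014, 150.0),
             ("South", 2014, 80.0), ("North", 2014, 50.0)]:
    record_sale(conn, *sale)

summary = revenue_by_region_and_year(conn)
print(summary)
```

Each call to `record_sale` touches a single row, while the summary query scans every row to produce a consolidated view.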
October 31, 2014 
To summarize ... 
• OLTP systems are used to “run” a business 
• The data warehouse helps to “optimize” the business
Typical DW Architecture 
Data Sources -> ETL -> Data Store -> Data Access -> Presentation 
• Data Sources: System A, System B, System C, System D 
• ETL: Extract, Transform, Load 
• Data Store: The Data Warehouse 
• Data Access: Business Model 
• Presentation: Dashboards, Prompted Views, Scorecards, Ad-Hoc Reporting, Self Serve
Multidimensional data model 
• Developed for implementing data warehouses and data marts. 
• Provides both a mechanism to store data and a way to perform business analysis. 
• An alternative to the entity-relationship (E/R) model 
TYPES OF MULTIDIMENSIONAL DATA MODEL 
• Data cube model 
• Star schema model 
• Snowflake schema model 
• Fact constellations
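Of the model types listed, the star schema is the easiest to sketch: one central fact table holds the measures and points at surrounding dimension tables. The table names (`fact_sales`, `dim_product`, `dim_date`), columns, and rows below are hypothetical, and an in-memory SQLite database is assumed purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the categories used for analysis.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
-- The fact table holds the measures plus one key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    units INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
INSERT INTO dim_date VALUES (1, 2014, 1), (2, 2014, 2);
INSERT INTO fact_sales VALUES (1, 1, 10, 20.0), (1, 2, 5, 10.0), (2, 2, 1, 30.0);
""")

# A typical star join: aggregate a measure, grouped by dimension attributes.
star_query = """
SELECT p.category, d.quarter, SUM(f.revenue)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY p.category, d.quarter
ORDER BY p.category, d.quarter
"""
result = conn.execute(star_query).fetchall()
print(result)
```

A snowflake schema would further normalize the dimension tables (for example, moving `category` into its own table), and a fact constellation would add more fact tables sharing these dimensions.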
Data cubes 
• A data warehouse is based on a multidimensional data model which views data in 
the form of a data cube. 
• Three important concepts are associated with data cubes 
- Slicing 
- Dicing 
- Rotating 
• The cube below holds the results of the 1991 Canadian Census. Ethnic origin, age group and geography are the dimensions of the cube, while 174 is a measure. A dimension is a category of data, and each dimension includes different levels of categories. The measures are the actual data values that occupy the cells defined by the selected dimensions. 
[Figure: data cube of the 1991 Canadian Census]
Slicing the Data Cube 
• Figure 2 illustrates slicing on the ethnic origin Chinese. Slicing the cube this way yields the data for Chinese origin across the geography and age group dimensions. 
• The data contained in the cube has effectively been filtered to display only the measures associated with the Chinese ethnic origin. 
• From an end-user perspective, the term slice most often refers to a two-dimensional page selected from the cube.
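Slicing can be sketched with a cube stored as a plain dictionary keyed by one value per dimension. Only the dimension names follow the census example; every count below is invented, and the helper `slice_cube` is a hypothetical name, not part of any library.

```python
# Cells are keyed by (ethnic_origin, age_group, geography); counts are invented.
DIMENSIONS = ("ethnic_origin", "age_group", "geography")
cube = {
    ("Chinese", "0-14", "Ontario"): 10,
    ("Chinese", "15-24", "Ontario"): 20,
    ("Chinese", "0-14", "Quebec"): 5,
    ("Italian", "0-14", "Ontario"): 8,
    ("Italian", "15-24", "Quebec"): 6,
}

def slice_cube(cube, dimension, value):
    """Fix one dimension to a single value and drop it from the cell keys."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }

chinese_page = slice_cube(cube, "ethnic_origin", "Chinese")
print(chinese_page)
```

The result is a two-dimensional page keyed by (age group, geography), matching the end-user sense of a slice.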
Dicing and Rotating 
• Dicing is an operation related to slicing in which a sub-cube of the original space is defined. 
• Dicing provides the user with the smallest available slice of data, enabling you to examine each sub-cube in greater detail. 
• Rotating, sometimes called pivoting, changes the dimensional orientation of the report or page display from the cube data. Rotating may consist of swapping the rows and columns, or moving one of the row dimensions into the column dimension.
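Dicing and rotating can be sketched in the same spirit, again with a cube stored as a dictionary. All counts are invented and the helpers `dice` and `rotate` are hypothetical names used only for this illustration.

```python
# Cells are keyed by (ethnic_origin, age_group, geography); counts are invented.
cube = {
    ("Chinese", "0-14", "Ontario"): 10,
    ("Chinese", "15-24", "Ontario"): 20,
    ("Chinese", "0-14", "Quebec"): 5,
    ("Italian", "0-14", "Ontario"): 8,
    ("Italian", "15-24", "Quebec"): 6,
}

def dice(cube, ethnic_origins, age_groups, geographies):
    """Keep the sub-cube whose coordinates fall inside the given value sets."""
    return {
        key: measure
        for key, measure in cube.items()
        if key[0] in ethnic_origins and key[1] in age_groups and key[2] in geographies
    }

def rotate(page):
    """Pivot a two-dimensional page by swapping its row and column keys."""
    return {(col, row): measure for (row, col), measure in page.items()}

sub_cube = dice(cube, {"Chinese"}, {"0-14", "15-24"}, {"Ontario"})
page = {("0-14", "Ontario"): 10, ("15-24", "Ontario"): 20}
rotated = rotate(page)
print(sub_cube)
print(rotated)
```

Unlike a slice, the dice keeps all three dimensions and only restricts the values allowed on each; rotating changes nothing but the orientation of the page.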
Data Mart 
• Contains a subset of the data stored in the data warehouse that is of 
interest to a specific business community, department, or set of users. 
• E.g.: marketing promotions, finance, or account collections. 
• Data marts are small slices of the data warehouse. 
• Data marts improve end-user response time by allowing users to have 
access to the specific type of data they need to view. 
• A data mart is basically a condensed and more focused version of a 
data warehouse.
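The idea of a data mart as a focused subset can be sketched as a view over a warehouse table. This is a simplification under stated assumptions: real data marts are often separate physical stores, and the names `warehouse_facts` and `marketing_mart` and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical warehouse table covering several subject areas.
CREATE TABLE warehouse_facts (subject_area TEXT, item TEXT, amount REAL);
INSERT INTO warehouse_facts VALUES
    ('finance', 'invoice', 500.0),
    ('marketing', 'promotion', 120.0),
    ('marketing', 'campaign', 300.0);
-- The mart exposes only the rows one department cares about.
CREATE VIEW marketing_mart AS
    SELECT item, amount FROM warehouse_facts WHERE subject_area = 'marketing';
""")

mart_rows = conn.execute(
    "SELECT item, amount FROM marketing_mart ORDER BY item"
).fetchall()
print(mart_rows)
```

Marketing users query `marketing_mart` without seeing finance data, which is the sense in which a mart is a condensed, more focused version of the warehouse.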
Data warehouse vs Data mart 
DATA WAREHOUSE 
• Holds multiple subject areas 
• Holds very detailed information 
• Works to integrate all data 
sources 
• Does not necessarily use a 
dimensional model but feeds 
dimensional models 
DATA MART 
• Often holds only one subject area, for 
example, Finance or Sales 
• May hold more summarized data 
(although many hold full detail) 
• Concentrates on integrating 
information from a given subject 
area or set of source systems 
• Is built focused on a dimensional 
model using a star schema
Reasons for creating a data mart 
• Easy access to frequently needed data 
• Creates collective view by a group of users 
• Improves end-user response time 
• Ease of creation 
• Lower cost than implementing a full data warehouse 
• Potential users are more clearly defined than in a full data warehouse 
• Contains only business essential data and is less cluttered.
Advantages & Disadvantages of data warehousing 
Advantages 
• Enhances end-user access to a wide variety of data. 
• Increases data consistency. 
• Increases productivity and decreases computing costs. 
• Is able to combine data from different sources in one place. 
• Provides an infrastructure that could support changes to data and replication of the changed data back into the operational systems. 
Disadvantages 
• Extracting, cleaning and loading data can be time consuming. 
• Compatibility problems with systems already in place, e.g. transaction processing systems. 
• Training must be provided to end-users, who may end up not using the data warehouse. 
• Security could develop into a serious issue, especially if the data warehouse is web accessible.
Applications of data warehousing 
Industry             Application 
Finance              Credit card analysis 
Insurance            Claims, fraud analysis 
Telecommunication    Call record analysis 
Transport            Logistics management 
Consumer goods       Promotion analysis
ETL 
• Extract-Transform-Load 
• Responsible for the operations taking place backstage in the data warehouse architecture. 
• Extract: Get the data from the source system as efficiently as possible 
• Transform: Perform calculations on data 
• Load: Load the data into the target storage 
ADVANTAGES OF ETL TOOL 
• Simple, faster and cheaper 
• Delivers good performance even for very large data sets 
• Allows reuse of existing complex programs
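The three ETL stages can be sketched as three small functions. This is a minimal illustration, not how any particular ETL tool works: the CSV text, the column names, the `sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target storage.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; a real source would be a file or another system.
SOURCE_CSV = """customer,amount_cents
alice,1250
bob,300
alice,750
"""

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize names and convert cents to currency units."""
    return [(row["customer"].title(), int(row["amount_cents"]) / 100) for row in rows]

def load(conn, records):
    """Load: write the transformed records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO sales VALUES (?, ?)", records)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))

totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Keeping the stages as separate functions mirrors the Extract / Transform / Load split and makes each stage replaceable on its own.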
Popular ETL tools 
Tool                                         Company 
Informix                                     IBM 
Oracle Warehouse Builder                     Oracle 
Microsoft SQL Server Integration Services    Microsoft
IBM Informix 
• Informix is one of the world’s most widely used database servers 
• High levels of performance and availability, distinctive capabilities in data replication and scalability, and minimal administrative overhead. 
HIGHLIGHTS 
• Real-time analytics: Informix is a single platform that can power OLTP and OLAP workloads and successfully meet service-level agreements (SLAs) for each 
• Fast, always-on transactions: Provides one of the industry’s widest sets of options for keeping data available at all times, including zero downtime for maintenance 
• Sensor data management: Solves the big data challenge of sensor data with unmatched performance and scalability for managing time series data 
• Easy to use: Informix runs virtually unattended with self-configuring, self-managing and self-healing capabilities 
• Best-of-breed embeddability: Provides a proven embedded data management platform for ISVs and OEMs to deliver integrated, world-class solutions, enabling platform independence 
• NoSQL capability: IBM Informix unleashes new capabilities, giving you a way to combine unstructured and structured data in a smart way, bringing NoSQL to your SQL database.
Conclusion 
Data Warehousing is not a new phenomenon. All large 
organizations already have data warehouses, but they are just not 
managing them. Over the next few years, the growth of data 
warehousing is going to be enormous with new products and 
technologies coming out frequently. In order to get the most out of this 
period, it is going to be important that data warehouse planners and 
developers have a clear idea of what they are looking for and then 
choose strategies and methods that will provide them with 
performance today and flexibility for tomorrow.
References 
1) Gupta, Data Mining 
2) C.S.R. Prabhu, Data Warehousing 
3) Jeff Lawyer and Shamsul Chowdhury, “Best Practices in Data Warehousing to Support Business Initiatives and Needs”, IEEE 2004 
4) Ruilian Hou, “Research and Analysis of Data Warehouse Technologies”, IEEE 2011 
5) S. Sai Sathyanarayana Reddy, Dr. L.S.S. Reddy, Dr. V. Khanna, A. Lavanya, “Advanced Techniques for Scientific Data Warehousing”, IEEE 2009 
6) Murat Obali, Abdul Kadir Gorur, “A Real Time Data Warehouse Approach for Data Processing”, IEEE 2013 
7) Ruilian Hou, “Analysis and research on the difference between data warehouse and database”, IEEE 2011
Questions ????
THANK YOU!!!!!

To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 

DATA WAREHOUSING

  • 2. Presented By, Neenu C. Paul(12120051) CS B, S7 SOE, CUSAT Guided By, Dr. Sudheep Elayidom Division of Computer Science SOE, CUSAT
  • 3. CONTENTS • What is a data warehouse? • What is data warehousing? • Database vs Data warehouse • OLTP & OLAP • Data warehouse architecture • Multidimensional data model • Data Mart • ETL • Advantages of data warehouse • Disadvantages of data warehouse • S/W Solutions of data warehouse • Conclusion • References
  • 4. A producer wants to know… Which are our lowest/highest margin customers? Who are my customers and what products are they buying? What is the most effective distribution channel? What product promotions have the biggest impact on revenue? What impact will new products/services have on revenue and margins? Which customers are most likely to go to the competition?
  • 5. What is a Data Warehouse? • A data warehouse is a system for storing, analyzing, and reporting on data. • A central database that includes information from several different sources. • Keeps current as well as historical data. • Used to produce reports to assist in decision-making and management.
  • 6. “Data Warehouse is a subject oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.” –W. H. Inmon (Diagram: the four properties of a data warehouse: Subject Oriented, Integrated, Time Variant, Non-volatile.)
  • 7. What is Data Warehousing? A process of transforming data into information and making it available to users in a timely enough manner to make a difference. (Diagram: Data → Information)
  • 8. Database vs Data Warehouse
      Database
      • Transaction oriented
      • Stores online transaction data
      • Designed using E-R modeling techniques
      • Captures data
      • Holds real-time information
      Data Warehouse
      • Subject oriented
      • Stores historical data
      • Designed using data modeling techniques
      • Analyzes data
      • Holds the entire information base for all time
  • 9. Data Processing Technologies
      • OLTP (On-Line Transaction Processing)
        - The major task is to perform on-line transaction and query processing. Covers most of the day-to-day operations of an organization.
      • OLAP (On-Line Analytical Processing)
        - Serves knowledge workers (users) in the role of data analysis and decision making.
        - Organizes and presents data in various formats to accommodate the diverse needs of different users.
  • 10. OLTP vs OLAP
      Attribute: OLTP / OLAP
      Users: clerk, IT professional / knowledge worker
      Function: day-to-day operations / decision support
      DB design: application-oriented / subject-oriented
      Data: current, up-to-date, detailed, flat relational, isolated / historical, summarized, multidimensional, integrated, consolidated
      Usage: repetitive / ad-hoc
      Access: read/write, dozens of records / read, millions of records
      Unit of work: short, simple transaction / complex query
      # records accessed: tens / millions
      # users: thousands / hundreds
      DB size: 100MB-GB / 100GB-TB
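The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the `orders` table, its columns, and the three rows are hypothetical and do not come from the slides. The short inserts and the update stand in for OLTP-style transactions; the grouped aggregate stands in for an OLAP-style query.

```python
import sqlite3

# Hypothetical in-memory table; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
)

# OLTP-style work: short, simple transactions that touch a few rows.
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)",
        [
            ("alice", "north", 120.0),
            ("bob", "south", 80.0),
            ("carol", "north", 200.0),
        ],
    )
with conn:
    conn.execute("UPDATE orders SET amount = 90.0 WHERE customer = 'bob'")

# OLAP-style work: a read-only query that scans and summarizes many rows.
revenue_by_region = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
)
print(revenue_by_region)
```

In a real deployment the two workloads would normally run on separate systems, with the analytical query issued against the warehouse rather than the operational database.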
  • 11. October 31, 2014. To summarize: • OLTP systems are used to “run” a business. • The data warehouse helps to “optimize” the business.
  • 12. Typical DW Architecture
      • Data Sources: System A, System B, System C, System D
      • ETL: Extract, Transform, Load
      • Data Store: The Data Warehouse
      • Data Access: Business Model
      • Presentation: Dashboards, Prompted Views, Scorecards, Ad-Hoc Reporting, Self Serve
  • 13. Multidimensional data model
      • Developed for implementing data warehouses and data marts.
      • Provides both a mechanism to store data and a way to perform business analysis.
      • An alternative to the entity-relationship (E/R) model.
      Types of multidimensional data model
      • Data cube model
      • Star schema model
      • Snowflake schema model
      • Fact constellations
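As a minimal sketch of the star schema model named above, the following uses `sqlite3` to build one fact table surrounded by two dimension tables and then aggregates a measure across both dimensions. The table names (`fact_sales`, `dim_product`, `dim_date`), columns, and rows are assumptions made for illustration; the slides do not define a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the context of each fact (hypothetical data).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    -- The fact table holds the measures plus one foreign key per dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        units INTEGER,
        revenue REAL
    );
    INSERT INTO dim_product VALUES (1, 'pen', 'stationery'), (2, 'lamp', 'home');
    INSERT INTO dim_date VALUES (1, 2014, 3), (2, 2014, 4);
    INSERT INTO fact_sales VALUES (1, 1, 10, 15.0), (1, 2, 20, 30.0), (2, 2, 1, 40.0);
    """
)

# A typical star-schema query: join the fact table to its dimensions,
# then group the measure by dimension attributes.
rows = conn.execute(
    """
    SELECT p.category, d.quarter, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.quarter
    ORDER BY p.category, d.quarter
    """
).fetchall()
print(rows)
```

A snowflake schema would further normalize the dimension tables (for example, splitting `category` into its own table), at the cost of extra joins.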
  • 14. Data cubes
      • A data warehouse is based on a multidimensional data model, which views data in the form of a data cube.
      • Three important concepts are associated with data cubes: slicing, dicing, and rotating.
      • In the example cube, the results of the 1991 Canadian Census are arranged with ethnic origin, age group and geography as the dimensions of the cube, while the value 174 is a measure. A dimension is a category of data, and each dimension includes different levels of categories. The measures are the actual data values that occupy the cells defined by the selected dimensions.
  • 16. Slicing the Data Cube
      • Figure 2 illustrates slicing on the ethnic origin Chinese. Slicing the cube this way yields the data for Chinese origin across the geography and age group dimensions.
      • The data in the cube has effectively been filtered to display only the measures associated with the Chinese ethnic origin.
      • From an end-user perspective, the term slice most often refers to a two-dimensional page selected from the cube.
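Slicing can be sketched in a few lines of plain Python. The cube below is a toy dictionary keyed by (ethnic origin, age group, geography); the counts are made up for illustration and are not census figures, and `slice_cube` is a hypothetical helper name.

```python
# Toy cube: (ethnic_origin, age_group, geography) -> count.
# The counts are invented for illustration; they are NOT census data.
cube = {
    ("Chinese", "0-14", "Ontario"): 5,
    ("Chinese", "15-64", "Ontario"): 12,
    ("Chinese", "15-64", "Quebec"): 7,
    ("Italian", "0-14", "Ontario"): 3,
    ("Italian", "15-64", "Quebec"): 9,
}


def slice_cube(cube, axis, value):
    """Fix one dimension to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }


# Slice on ethnic origin = "Chinese": the result is a two-dimensional page
# indexed by (age_group, geography).
chinese_page = slice_cube(cube, axis=0, value="Chinese")
print(chinese_page)
```

The result has one dimension fewer than the cube, which matches the description of a slice as a two-dimensional page.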
  • 17. Dicing and Rotating
      • Dicing is an operation related to slicing in which a sub-cube of the original space is defined.
      • Dicing provides the user with the smallest available slice of data, enabling each sub-cube to be examined in greater detail.
      • Rotating, sometimes called pivoting, changes the dimensional orientation of the report or page display from the cube data. Rotating may consist of swapping the rows and columns, or moving one of the row dimensions into the column dimension.
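Dicing and rotating can be sketched the same way in plain Python. The cube, its invented counts, and the helper names `dice` and `rotate` are all assumptions for illustration; `dice` keeps every dimension but restricts each to a subset of values, while `rotate` swaps the row and column dimension of a two-dimensional page.

```python
# Toy cube: (ethnic_origin, age_group, geography) -> count.
# The counts are invented for illustration; they are NOT census data.
cube = {
    ("Chinese", "0-14", "Ontario"): 5,
    ("Chinese", "15-64", "Ontario"): 12,
    ("Chinese", "15-64", "Quebec"): 7,
    ("Italian", "0-14", "Ontario"): 3,
    ("Italian", "15-64", "Quebec"): 9,
}


def dice(cube, allowed):
    """Return the sub-cube whose coordinates fall in the allowed sets.

    `allowed` maps an axis index to the set of values kept on that axis.
    """
    return {
        key: measure
        for key, measure in cube.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }


def rotate(page):
    """Swap the two dimensions of a 2-D page (rows become columns)."""
    return {(col, row): measure for (row, col), measure in page.items()}


# Dice: keep all three dimensions, but only a subset of values on two of them.
sub_cube = dice(cube, {1: {"15-64"}, 2: {"Quebec"}})

# Rotate: take the (age_group, geography) page for one origin and pivot it.
page = {(age, geo): m for (origin, age, geo), m in cube.items() if origin == "Chinese"}
rotated = rotate(page)
print(sub_cube)
print(rotated)
```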
  • 18. Data Mart
      • Contains a subset of the data stored in the data warehouse that is of interest to a specific business community, department, or set of users.
      • E.g.: marketing promotions, finance, or account collections.
      • Data marts are small slices of the data warehouse.
      • Data marts improve end-user response time by giving users access to the specific type of data they need to view.
      • A data mart is basically a condensed and more focused version of a data warehouse.
  • 19. Data warehouse vs Data mart
      Data warehouse
      • Holds multiple subject areas
      • Holds very detailed information
      • Works to integrate all data sources
      • Does not necessarily use a dimensional model but feeds dimensional models
      Data mart
      • Often holds only one subject area, for example Finance or Sales
      • May hold more summarized data (although many hold full detail)
      • Concentrates on integrating information from a given subject area or set of source systems
      • Is built around a dimensional model using a star schema
  • 20. Reasons for creating a data mart • Easy access to frequently needed data • Creates collective view by a group of users • Improves end-user response time • Ease of creation • Lower cost than implementing a full data warehouse • Potential users are more clearly defined than in a full data warehouse • Contains only business essential data and is less cluttered.
  • 21. Advantages & Disadvantages of data warehousing
      Advantages
      • Enhances end-user access to a wide variety of data.
      • Increases data consistency.
      • Increases productivity and decreases computing costs.
      • Is able to combine data from different sources in one place.
      • Provides an infrastructure that could support changes to data and replication of the changed data back into the operational systems.
      Disadvantages
      • Extracting, cleaning and loading data could be time consuming.
      • Problems with compatibility with systems already in place, e.g. a transaction processing system.
      • Providing training to end-users, who end up not using the data warehouse.
      • Security could develop into a serious issue, especially if the data warehouse is web accessible.
  • 22. Applications of data warehousing
      Industry: Application
      Finance: Credit card analysis
      Insurance: Claims, fraud analysis
      Telecommunication: Call record analysis
      Transport: Logistics management
      Consumer goods: Promotion analysis
  • 23. ETL
      • Extract-Transform-Load
      • Responsible for the operations taking place behind the scenes of the data warehouse architecture.
      • Extract: get the data from the source system as efficiently as possible.
      • Transform: perform calculations on the data.
      • Load: load the data into the target storage.
      Advantages of an ETL tool
      • Simple, faster and cheaper
      • Delivers good performance even for very large data sets
      • Allows reuse of existing complex programs
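The three ETL steps can be sketched with the Python standard library. Everything specific here is an assumption for illustration: an inline CSV string stands in for the source system, the cleaning rules are arbitrary, and `fact_orders` is a hypothetical target table. Real ETL tools add scheduling, logging, and error handling on top of this basic shape.

```python
import csv
import io
import sqlite3

# A CSV string stands in for the source system (hypothetical data).
SOURCE_CSV = """order_id,customer,amount,currency
1, Alice ,120.50,usd
2,BOB,80,USD
3,carol,,usd
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, normalize currency codes, skip rows with no amount."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # incomplete record, left out of the load
        cleaned.append(
            (
                int(row["order_id"]),
                row["customer"].strip().title(),
                float(row["amount"]),
                row["currency"].strip().upper(),
            )
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the target table in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, currency TEXT)"
    )
    with conn:
        conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT * FROM fact_orders ORDER BY order_id").fetchall()
print(loaded)
```

Keeping the three steps as separate functions means each can be tested or replaced independently, for example swapping the CSV reader for a database query.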
  • 24. Popular ETL tools
      Tool: Company
      Informix: IBM
      Oracle Warehouse Builder: Oracle
      Microsoft SQL Server Integration: Microsoft
  • 25. IBM Informix
      • Informix is one of the world’s most widely used database servers.
      • Offers high levels of performance and availability, distinctive capabilities in data replication and scalability, and minimal administrative overhead.
      Highlights
      • Real-time analytics: Informix is a single platform that can power OLTP and OLAP workloads and successfully meet service-level agreements (SLAs) for each.
      • Fast, always-on transactions: provides one of the industry’s widest sets of options for keeping data available at all times, including zero downtime for maintenance.
      • Sensor data management: solves the big data challenge of sensor data with unmatched performance and scalability for managing time series data.
      • Easy to use: Informix runs virtually unattended with self-configuring, self-managing and self-healing capabilities.
      • Best-of-breed embeddability: provides a proven embedded data management platform for ISVs and OEMs to deliver integrated, world-class solutions, enabling platform independence.
      • NoSQL capability: IBM Informix unleashes new capabilities, giving you a way to combine unstructured and structured data in a smart way, bringing NoSQL to your SQL database.
  • 26. Conclusion Data warehousing is not a new phenomenon. All large organizations already have data warehouses, but they are just not managing them. Over the next few years, the growth of data warehousing is going to be enormous, with new products and technologies coming out frequently. In order to get the most out of this period, it is going to be important that data warehouse planners and developers have a clear idea of what they are looking for and then choose strategies and methods that will provide them with performance today and flexibility for tomorrow.
  • 27. References
      1) Data Mining, Gupta
      2) Data Warehousing, C.S.R. Prabhu
      3) Jeff Lawyer and Shamsul Chowdhury, “Best Practices in Data Warehousing to Support Business Initiatives and Needs”, IEEE 2004
      4) Ruilian Hou, “Research and Analysis of Data Warehouse Technologies”, IEEE 2011
      5) S. Sai Sathyanarayana Reddy, Dr. L.S.S. Reddy, Dr. V. Khanna, A. Lavanya, “Advanced Techniques for Scientific Data Warehousing”, IEEE 2009
      6) Murat Obali, Abdul Kadir Gorur, “A Real Time Data Warehouse Approach for Data Processing”, IEEE 2013
      7) Ruilian Hou, “Analysis and research on the difference between data warehouse and database”, IEEE 2011