SlideShare a Scribd company logo
1 of 20
Building The Data Warehouse
by Inmon


 Chapter 4: Granularity in the Data Warehouse




                         http://it-slideshares.blogspot.com/
4.0 Introduce - Granularity in the Data Warehouse



  Determining   the proper level of granularity
   of the data that will reside in the data
   warehouse.
  Granularity is important to the warehouse
   architect because it affects all the
   environments that depend on the warehouse
   for data.
4.1 Raw Estimates
  The raw estimate of the number of rows of data that will reside
  in the data warehouse tells the architect a great deal.
4.2 Input to the Planning Process
  The estimate of rows and DASD then serves as input
  to the planning process
4.3 Data in Overflow


Compare the total number of rows in the warehouse environment:
4.3 Data in Overflow (ct)
     There  will be more expertise available in
      managing the data warehouse volumes of data.
     Hardware costs will have dropped to some
      extent.
     More powerful software tools will be available.
     The end user will be more sophisticated.
4.3.1 Overflow Storage
4.3.1 Overflow Storage (ct)
4.4 What the Levels of Granularity Will Be
4.5 Some Feedback Loop Techniques
   Following are techniques to make the feedback
     loop harmonious:
   Build the first parts of the data warehouse in
     very small, very fast steps, and carefully listen to
     the end users’ comments at the end of each
     step of development. Be prepared to make
     adjustments quickly.
   If available, use prototyping and allow the
     feedback loop to function using observations
     gleaned from the prototype.
4.5 Some Feedback Loop Techniques (ct)

   Look  at how other people have built their levels of
    granularity and learn from their experience.
   Go through the feedback process with an experienced user
    who is aware of the process occurring. Under no
    circumstances should you keep your users in the dark as to
    the dynamics of the feedback loop.
   Look at whatever the organization has now that appears to
    be working, and use those functional requirements as a
    guideline.
   Execute joint application design (JAD) sessions and simulate
    the output to achieve the desired feedback.
4.5 Some Feedback Loop Techniques (ct)

  Granularity of data can be raised in many ways, such as the
    following:
   Summarize data from the source as it goes into the target.
   Average or otherwise calculate data as it goes into the
    target.
   Push highest and/or lowest set values into the target.
   Push only data that is obviously needed into the target.
   Use conditional logic to select only a subset of records to
    go into the target.
4.6 Levels of Granularity—Banking Environment
4.6 Levels of Granularity—Banking Environment (ct)
4.6 Levels of Granularity—Banking Environment (ct)
4.6 Levels of Granularity—Banking Environment (ct)
4.6 Levels of Granularity—Banking Environment (ct)
4.6 Levels of Granularity—Banking Environment (ct)
4.7 Feeding the Data Marts



  Specification level of granularity the data
 marts will need.

  The data that resides in the data warehouse
 must be at the lowest level of granularity
 needed by any of the data marts.
4.8 Summary


      Choosing the proper levels of granularity for the architected
       environment is vital to success.
      The worst stance that can be taken is to design all the levels of
       granularity a priori, and then build the data warehouse.
      The process of granularity design begins with a raw estimate of how
       large the warehouse will be on the one-year and the five-year
       horizon.
      There is an important feedback loop for the data warehouse
       environment.
      Another important consideration is the levels of granularity needed
       by the different architectural components that will be fed from the
       data warehouse.



                                   http://it-slideshares.blogspot.com/

More Related Content

What's hot

Distributed database system
Distributed database systemDistributed database system
Distributed database systemM. Ahmad Mahmood
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless DatabasesDan Gunter
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
CDW: SAN vs. NAS
CDW: SAN vs. NASCDW: SAN vs. NAS
CDW: SAN vs. NASSpiceworks
 
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...ScyllaDB
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memoryAshish Kumar
 
Apache pulsar - storage architecture
Apache pulsar - storage architectureApache pulsar - storage architecture
Apache pulsar - storage architectureMatteo Merli
 
Difference between Homogeneous and Heterogeneous
Difference between Homogeneous  and    HeterogeneousDifference between Homogeneous  and    Heterogeneous
Difference between Homogeneous and HeterogeneousFaraz Qaisrani
 
Data Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationData Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationDataminingTools Inc
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notesMohit Saini
 

What's hot (20)

NetApp & Storage fundamentals
NetApp & Storage fundamentalsNetApp & Storage fundamentals
NetApp & Storage fundamentals
 
Distributed database system
Distributed database systemDistributed database system
Distributed database system
 
HBase
HBaseHBase
HBase
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
 
Data Analytics Life Cycle
Data Analytics Life CycleData Analytics Life Cycle
Data Analytics Life Cycle
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Analysis vs reporting
Analysis vs reportingAnalysis vs reporting
Analysis vs reporting
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
Data Mining
Data MiningData Mining
Data Mining
 
CDW: SAN vs. NAS
CDW: SAN vs. NASCDW: SAN vs. NAS
CDW: SAN vs. NAS
 
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
 
Apache pulsar - storage architecture
Apache pulsar - storage architectureApache pulsar - storage architecture
Apache pulsar - storage architecture
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Difference between Homogeneous and Heterogeneous
Difference between Homogeneous  and    HeterogeneousDifference between Homogeneous  and    Heterogeneous
Difference between Homogeneous and Heterogeneous
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Big data unit i
Big data unit iBig data unit i
Big data unit i
 
Data Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationData Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalization
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notes
 

Viewers also liked

Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design phanleson
 
Data Warehousing Datamining Concepts
Data Warehousing Datamining ConceptsData Warehousing Datamining Concepts
Data Warehousing Datamining Conceptsraulmisir
 
Lecture 02 - The Data Warehouse Environment
Lecture 02 - The Data Warehouse Environment Lecture 02 - The Data Warehouse Environment
Lecture 02 - The Data Warehouse Environment phanleson
 
Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...
Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...
Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...Spark Summit
 
Big Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive ComparisonBig Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive ComparisonCaserta
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Salah Amean
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...Spark Summit
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouseJ M
 
White Paper - Data Warehouse Documentation Roadmap
White Paper -  Data Warehouse Documentation RoadmapWhite Paper -  Data Warehouse Documentation Roadmap
White Paper - Data Warehouse Documentation RoadmapDavid Walker
 
Sample - Data Warehouse Requirements
Sample -  Data Warehouse RequirementsSample -  Data Warehouse Requirements
Sample - Data Warehouse RequirementsDavid Walker
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...Spark Summit
 
Big Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBig Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBernard Marr
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecturepcherukumalla
 
Clinical Data Repository vs. A Data Warehouse - Which Do You Need?
Clinical Data Repository vs. A Data Warehouse - Which Do You Need?Clinical Data Repository vs. A Data Warehouse - Which Do You Need?
Clinical Data Repository vs. A Data Warehouse - Which Do You Need?Health Catalyst
 

Viewers also liked (20)

Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design
 
Data Warehousing Datamining Concepts
Data Warehousing Datamining ConceptsData Warehousing Datamining Concepts
Data Warehousing Datamining Concepts
 
Lecture 02 - The Data Warehouse Environment
Lecture 02 - The Data Warehouse Environment Lecture 02 - The Data Warehouse Environment
Lecture 02 - The Data Warehouse Environment
 
Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...
Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...
Accelerating Apache Spark-based Analytics on Intel Architecture-(Michael Gree...
 
Big Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive ComparisonBig Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive Comparison
 
Lecture 01 Data Mining
Lecture 01 Data MiningLecture 01 Data Mining
Lecture 01 Data Mining
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
 
Big Data Profiling
Big Data Profiling Big Data Profiling
Big Data Profiling
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouse
 
White Paper - Data Warehouse Documentation Roadmap
White Paper -  Data Warehouse Documentation RoadmapWhite Paper -  Data Warehouse Documentation Roadmap
White Paper - Data Warehouse Documentation Roadmap
 
Sample - Data Warehouse Requirements
Sample -  Data Warehouse RequirementsSample -  Data Warehouse Requirements
Sample - Data Warehouse Requirements
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
 
Big Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBig Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must Know
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Clinical Data Repository vs. A Data Warehouse - Which Do You Need?
Clinical Data Repository vs. A Data Warehouse - Which Do You Need?Clinical Data Repository vs. A Data Warehouse - Which Do You Need?
Clinical Data Repository vs. A Data Warehouse - Which Do You Need?
 
Data mining
Data miningData mining
Data mining
 

Similar to Lecture 04 - Granularity in the Data Warehouse

Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackAnant Corporation
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaDatabricks
 
2 data warehouse life cycle golfarelli
2 data warehouse life cycle golfarelli2 data warehouse life cycle golfarelli
2 data warehouse life cycle golfarellitruongthuthuy47
 
Impact of cloud services on software development life
Impact of cloud services on software development life Impact of cloud services on software development life
Impact of cloud services on software development life Mohamed M. Yazji
 
DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...
DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...
DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...Sunbird DCIM
 
IOUG93 - Technical Architecture for the Data Warehouse - Paper
IOUG93 - Technical Architecture for the Data Warehouse - PaperIOUG93 - Technical Architecture for the Data Warehouse - Paper
IOUG93 - Technical Architecture for the Data Warehouse - PaperDavid Walker
 
Data Gaurd Final Thesis for University in Progress (2).docx
Data Gaurd Final Thesis for University in Progress (2).docxData Gaurd Final Thesis for University in Progress (2).docx
Data Gaurd Final Thesis for University in Progress (2).docxMohdKashif82
 
Webinar: 5 Steps To The Perfect Storage Refresh
Webinar: 5 Steps To The Perfect Storage RefreshWebinar: 5 Steps To The Perfect Storage Refresh
Webinar: 5 Steps To The Perfect Storage RefreshStorage Switzerland
 
Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...
Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...
Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...Precisely
 
Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)
Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)
Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)Dealmaker Media
 
Waters Grid & HPC Course
Waters Grid & HPC CourseWaters Grid & HPC Course
Waters Grid & HPC Coursejimliddle
 
ODW 2021 - Automated patching and compliance to improve database security.pptx
ODW 2021 - Automated patching and compliance to improve database security.pptxODW 2021 - Automated patching and compliance to improve database security.pptx
ODW 2021 - Automated patching and compliance to improve database security.pptxPaul Breniuc
 
Designing a Framework to Standardize Data Warehouse Development Process for E...
Designing a Framework to Standardize Data Warehouse Development Process for E...Designing a Framework to Standardize Data Warehouse Development Process for E...
Designing a Framework to Standardize Data Warehouse Development Process for E...ijdms
 
Iod session 3423 analytics patterns of expertise, the fast path to amazing ...
Iod session 3423   analytics patterns of expertise, the fast path to amazing ...Iod session 3423   analytics patterns of expertise, the fast path to amazing ...
Iod session 3423 analytics patterns of expertise, the fast path to amazing ...Rachel Bland
 
The Total Economic Impact Of NetApp Storage For Desktop Virtualization
The Total Economic Impact Of NetApp Storage For Desktop VirtualizationThe Total Economic Impact Of NetApp Storage For Desktop Virtualization
The Total Economic Impact Of NetApp Storage For Desktop VirtualizationNetApp
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Daniel Zivkovic
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 

Similar to Lecture 04 - Granularity in the Data Warehouse (20)

Data mining
Data miningData mining
Data mining
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at Humana
 
2 data warehouse life cycle golfarelli
2 data warehouse life cycle golfarelli2 data warehouse life cycle golfarelli
2 data warehouse life cycle golfarelli
 
Impact of cloud services on software development life
Impact of cloud services on software development life Impact of cloud services on software development life
Impact of cloud services on software development life
 
Unit 4.pptx
Unit 4.pptxUnit 4.pptx
Unit 4.pptx
 
DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...
DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...
DCIM Software Five Years Later: What I Wish I Had Known When I Started (Case ...
 
IOUG93 - Technical Architecture for the Data Warehouse - Paper
IOUG93 - Technical Architecture for the Data Warehouse - PaperIOUG93 - Technical Architecture for the Data Warehouse - Paper
IOUG93 - Technical Architecture for the Data Warehouse - Paper
 
Data Gaurd Final Thesis for University in Progress (2).docx
Data Gaurd Final Thesis for University in Progress (2).docxData Gaurd Final Thesis for University in Progress (2).docx
Data Gaurd Final Thesis for University in Progress (2).docx
 
Webinar: 5 Steps To The Perfect Storage Refresh
Webinar: 5 Steps To The Perfect Storage RefreshWebinar: 5 Steps To The Perfect Storage Refresh
Webinar: 5 Steps To The Perfect Storage Refresh
 
Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...
Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...
Engineering Machine Learning Data Pipelines Series: Tracking Data Lineage fro...
 
Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)
Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)
Scalability for Startups (Frank Mashraqi, Startonomics SF 2008)
 
Waters Grid & HPC Course
Waters Grid & HPC CourseWaters Grid & HPC Course
Waters Grid & HPC Course
 
ODW 2021 - Automated patching and compliance to improve database security.pptx
ODW 2021 - Automated patching and compliance to improve database security.pptxODW 2021 - Automated patching and compliance to improve database security.pptx
ODW 2021 - Automated patching and compliance to improve database security.pptx
 
Designing a Framework to Standardize Data Warehouse Development Process for E...
Designing a Framework to Standardize Data Warehouse Development Process for E...Designing a Framework to Standardize Data Warehouse Development Process for E...
Designing a Framework to Standardize Data Warehouse Development Process for E...
 
Iod session 3423 analytics patterns of expertise, the fast path to amazing ...
Iod session 3423   analytics patterns of expertise, the fast path to amazing ...Iod session 3423   analytics patterns of expertise, the fast path to amazing ...
Iod session 3423 analytics patterns of expertise, the fast path to amazing ...
 
The Total Economic Impact Of NetApp Storage For Desktop Virtualization
The Total Economic Impact Of NetApp Storage For Desktop VirtualizationThe Total Economic Impact Of NetApp Storage For Desktop Virtualization
The Total Economic Impact Of NetApp Storage For Desktop Virtualization
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 

More from phanleson

Learning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with SparkLearning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with Sparkphanleson
 
Firewall - Network Defense in Depth Firewalls
Firewall - Network Defense in Depth FirewallsFirewall - Network Defense in Depth Firewalls
Firewall - Network Defense in Depth Firewallsphanleson
 
Mobile Security - Wireless hacking
Mobile Security - Wireless hackingMobile Security - Wireless hacking
Mobile Security - Wireless hackingphanleson
 
Authentication in wireless - Security in Wireless Protocols
Authentication in wireless - Security in Wireless ProtocolsAuthentication in wireless - Security in Wireless Protocols
Authentication in wireless - Security in Wireless Protocolsphanleson
 
E-Commerce Security - Application attacks - Server Attacks
E-Commerce Security - Application attacks - Server AttacksE-Commerce Security - Application attacks - Server Attacks
E-Commerce Security - Application attacks - Server Attacksphanleson
 
Hacking web applications
Hacking web applicationsHacking web applications
Hacking web applicationsphanleson
 
HBase In Action - Chapter 04: HBase table design
HBase In Action - Chapter 04: HBase table designHBase In Action - Chapter 04: HBase table design
HBase In Action - Chapter 04: HBase table designphanleson
 
HBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - OperationsHBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - Operationsphanleson
 
Hbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBaseHbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBasephanleson
 
Learning spark ch11 - Machine Learning with MLlib
Learning spark ch11 - Machine Learning with MLlibLearning spark ch11 - Machine Learning with MLlib
Learning spark ch11 - Machine Learning with MLlibphanleson
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streamingphanleson
 
Learning spark ch09 - Spark SQL
Learning spark ch09 - Spark SQLLearning spark ch09 - Spark SQL
Learning spark ch09 - Spark SQLphanleson
 
Learning spark ch07 - Running on a Cluster
Learning spark ch07 - Running on a ClusterLearning spark ch07 - Running on a Cluster
Learning spark ch07 - Running on a Clusterphanleson
 
Learning spark ch06 - Advanced Spark Programming
Learning spark ch06 - Advanced Spark ProgrammingLearning spark ch06 - Advanced Spark Programming
Learning spark ch06 - Advanced Spark Programmingphanleson
 
Learning spark ch05 - Loading and Saving Your Data
Learning spark ch05 - Loading and Saving Your DataLearning spark ch05 - Loading and Saving Your Data
Learning spark ch05 - Loading and Saving Your Dataphanleson
 
Learning spark ch04 - Working with Key/Value Pairs
Learning spark ch04 - Working with Key/Value PairsLearning spark ch04 - Working with Key/Value Pairs
Learning spark ch04 - Working with Key/Value Pairsphanleson
 
Learning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with SparkLearning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with Sparkphanleson
 
Hướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about Libertagia
Hướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about LibertagiaHướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about Libertagia
Hướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about Libertagiaphanleson
 
Lecture 1 - Getting to know XML
Lecture 1 - Getting to know XMLLecture 1 - Getting to know XML
Lecture 1 - Getting to know XMLphanleson
 
Lecture 4 - Adding XTHML for the Web
Lecture  4 - Adding XTHML for the WebLecture  4 - Adding XTHML for the Web
Lecture 4 - Adding XTHML for the Webphanleson
 

More from phanleson (20)

Learning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with SparkLearning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with Spark
 
Firewall - Network Defense in Depth Firewalls
Firewall - Network Defense in Depth FirewallsFirewall - Network Defense in Depth Firewalls
Firewall - Network Defense in Depth Firewalls
 
Mobile Security - Wireless hacking
Mobile Security - Wireless hackingMobile Security - Wireless hacking
Mobile Security - Wireless hacking
 
Authentication in wireless - Security in Wireless Protocols
Authentication in wireless - Security in Wireless ProtocolsAuthentication in wireless - Security in Wireless Protocols
Authentication in wireless - Security in Wireless Protocols
 
E-Commerce Security - Application attacks - Server Attacks
E-Commerce Security - Application attacks - Server AttacksE-Commerce Security - Application attacks - Server Attacks
E-Commerce Security - Application attacks - Server Attacks
 
Hacking web applications
Hacking web applicationsHacking web applications
Hacking web applications
 
HBase In Action - Chapter 04: HBase table design
HBase In Action - Chapter 04: HBase table designHBase In Action - Chapter 04: HBase table design
HBase In Action - Chapter 04: HBase table design
 
HBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - OperationsHBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - Operations
 
Hbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBaseHbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBase
 
Learning spark ch11 - Machine Learning with MLlib
Learning spark ch11 - Machine Learning with MLlibLearning spark ch11 - Machine Learning with MLlib
Learning spark ch11 - Machine Learning with MLlib
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streaming
 
Learning spark ch09 - Spark SQL
Learning spark ch09 - Spark SQLLearning spark ch09 - Spark SQL
Learning spark ch09 - Spark SQL
 
Learning spark ch07 - Running on a Cluster
Learning spark ch07 - Running on a ClusterLearning spark ch07 - Running on a Cluster
Learning spark ch07 - Running on a Cluster
 
Learning spark ch06 - Advanced Spark Programming
Learning spark ch06 - Advanced Spark ProgrammingLearning spark ch06 - Advanced Spark Programming
Learning spark ch06 - Advanced Spark Programming
 
Learning spark ch05 - Loading and Saving Your Data
Learning spark ch05 - Loading and Saving Your DataLearning spark ch05 - Loading and Saving Your Data
Learning spark ch05 - Loading and Saving Your Data
 
Learning spark ch04 - Working with Key/Value Pairs
Learning spark ch04 - Working with Key/Value PairsLearning spark ch04 - Working with Key/Value Pairs
Learning spark ch04 - Working with Key/Value Pairs
 
Learning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with SparkLearning spark ch01 - Introduction to Data Analysis with Spark
Learning spark ch01 - Introduction to Data Analysis with Spark
 
Hướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about Libertagia
Hướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about LibertagiaHướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about Libertagia
Hướng Dẫn Đăng Ký LibertaGia - A guide and introduciton about Libertagia
 
Lecture 1 - Getting to know XML
Lecture 1 - Getting to know XMLLecture 1 - Getting to know XML
Lecture 1 - Getting to know XML
 
Lecture 4 - Adding XTHML for the Web
Lecture  4 - Adding XTHML for the WebLecture  4 - Adding XTHML for the Web
Lecture 4 - Adding XTHML for the Web
 

Recently uploaded

Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 

Recently uploaded (20)

Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 

Lecture 04 - Granularity in the Data Warehouse

  • 1. Building The Data Warehouse by Inmon Chapter 4: Granularity in the Data Warehouse http://it-slideshares.blogspot.com/
  • 2. 4.0 Introduce - Granularity in the Data Warehouse Determining the proper level of granularity of the data that will reside in the data warehouse. Granularity is important to the warehouse architect because it affects all the environments that depend on the warehouse for data.
  • 3. 4.1 Raw Estimates The raw estimate of the number of rows of data that will reside in the data warehouse tells the architect a great deal.
  • 4. 4.2 Input to the Planning Process The estimate of rows and DASD then serves as input to the planning process
  • 5. 4.3 Data in Overflow Compare the total number of rows in the warehouse environment:
  • 6. 4.3 Data in Overflow (ct) There will be more expertise available in managing the data warehouse volumes of data. Hardware costs will have dropped to some extent. More powerful software tools will be available. The end user will be more sophisticated.
  • 9. 4.4 What the Levels of Granularity Will Be
  • 10. 4.5 Some Feedback Loop Techniques Following are techniques to make the feedback loop harmonious: Build the first parts of the data warehouse in very small, very fast steps, and carefully listen to the end users’ comments at the end of each step of development. Be prepared to make adjustments quickly. If available, use prototyping and allow the feedback loop to function using observations gleaned from the prototype.
  • 11. 4.5 Some Feedback Loop Techniques (ct)  Look at how other people have built their levels of granularity and learn from their experience.  Go through the feedback process with an experienced user who is aware of the process occurring. Under no circumstances should you keep your users in the dark as to the dynamics of the feedback loop.  Look at whatever the organization has now that appears to be working, and use those functional requirements as a guideline.  Execute joint application design (JAD) sessions and simulate the output to achieve the desired feedback.
  • 12. 4.5 Some Feedback Loop Techniques (ct) Granularity of data can be raised in many ways, such as the following:  Summarize data from the source as it goes into the target.  Average or otherwise calculate data as it goes into the target.  Push highest and/or lowest set values into the target.  Push only data that is obviously needed into the target.  Use conditional logic to select only a subset of records to go into the target.
  • 13. 4.6 Levels of Granularity—Banking Environment
  • 14. 4.6 Levels of Granularity—Banking Environment (ct)
  • 15. 4.6 Levels of Granularity—Banking Environment (ct)
  • 16. 4.6 Levels of Granularity—Banking Environment (ct)
  • 17. 4.6 Levels of Granularity—Banking Environment (ct)
  • 18. 4.6 Levels of Granularity—Banking Environment (ct)
  • 19. 4.7 Feeding the Data Marts  Specification level of granularity the data marts will need.  The data that resides in the data warehouse must be at the lowest level of granularity needed by any of the data marts.
  • 20. 4.8 Summary  Choosing the proper levels of granularity for the architected environment is vital to success.  The worst stance that can be taken is to design all the levels of granularity a priori, and then build the data warehouse.  The process of granularity design begins with a raw estimate of how large the warehouse will be on the one-year and the five-year horizon.  There is an important feedback loop for the data warehouse environment.  Another important consideration is the levels of granularity needed by the different architectural components that will be fed from the data warehouse. http://it-slideshares.blogspot.com/