DATA WAREHOUSE
By: Ravi Ranjan
DEFINITION
Data Warehouse
A collection of corporate information, derived directly from operational systems and some external data sources. Its specific purpose is to support business decisions, not business operations.
THE PURPOSE OF DATA WAREHOUSING

• Realize the value of data
      • Data / information is an asset
      • Methods to realize the value (Reporting, Analysis, etc.)
• Make better decisions
      • Turn data into information
      • Create competitive advantage
      • Methods to support the decision-making process (EIS, DSS, etc.)
Data Warehouse Components

• Staging Area
      • A preparatory repository where transaction data
        can be transformed for use in the data warehouse
• Data Mart
      • Traditional dimensionally modeled set of dimension
        and fact tables
      • Per Kimball, a data warehouse is the union of a set
        of data marts
• Operational Data Store (ODS)
      • Modeled to support near real-time reporting needs.
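
As a rough illustration of the "dimension and fact tables" that make up a data mart, the sketch below builds a tiny star schema in an in-memory SQLite database and runs one reporting query against it. The table names (dim_product, dim_date, fact_sales), the columns, and the sample rows are all hypothetical; they are not taken from the slides.

```python
import sqlite3


def build_star_schema():
    """Create a tiny, hypothetical star schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables describe the "who/what/when" of each fact.
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        -- The fact table holds the measures, keyed by the dimensions.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            quantity INTEGER,
            amount REAL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240101, 2024, 1), (20240201, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20240101, 2, 20.0), (2, 20240101, 1, 15.0),
         (3, 20240201, 4, 40.0), (1, 20240201, 1, 10.0)],
    )
    return conn


def sales_by_category(conn):
    """Join the fact table to a dimension and total the sales amount per category."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


conn = build_star_schema()
print(sales_by_category(conn))
```

The query shape (fact table joined to one or more dimensions, then grouped) is the typical way a dimensional model is read; a real data mart would add more dimensions and surrogate-key management.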
DATA WAREHOUSE FUNCTIONALITY

[Diagram. Sources: Relational Databases, ERP Systems, Purchased Data, Legacy Data. Processing: Extraction, Cleansing, Optimized Loader. Core: Data Warehouse Engine, Metadata Repository. Outputs: Analyze, Query.]
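
The extraction, cleansing, and loading stages named in the diagram can be sketched roughly as below. This is a minimal sketch under stated assumptions: the source names, the `customer_id` and `amount` fields, and the use of plain Python lists for the warehouse and the metadata repository are all hypothetical stand-ins, not the design shown on the slide.

```python
from datetime import datetime, timezone


def extract(sources):
    """Pull rows from each source system and tag them with their origin."""
    for source_name, rows in sources.items():
        for row in rows:
            yield {**row, "source": source_name}


def cleanse(rows):
    """Trim text fields and drop rows that lack a customer id."""
    for row in rows:
        cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if cleaned.get("customer_id"):
            yield cleaned


def load(rows, warehouse, metadata):
    """Append the batch to the warehouse and record what was loaded."""
    batch = list(rows)
    warehouse.extend(batch)
    metadata.append({
        "loaded_at": datetime.now(timezone.utc).isoformat(),
        "row_count": len(batch),
        "sources": sorted({r["source"] for r in batch}),
    })
    return len(batch)


# Hypothetical source extracts: one ERP row and two legacy rows (one unusable).
sources = {
    "erp": [{"customer_id": " C1 ", "amount": 10.0}],
    "legacy": [{"customer_id": "", "amount": 5.0}, {"customer_id": "C2", "amount": 7.5}],
}
warehouse, metadata = [], []
loaded = load(cleanse(extract(sources)), warehouse, metadata)
print(loaded, metadata[0]["sources"])
```

Each stage is a generator, so rows stream through without being held in memory until the load step; the metadata entry is a stand-in for what a metadata repository would track about each load.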
EVOLUTION OF DATA WAREHOUSE ARCHITECTURE

• Top-Down Architecture
• Bottom-Up Architecture
• Enterprise Data Mart Architecture
• Data Stage/Data Mart Architecture
VERY LARGE DATABASES

WAREHOUSES ARE VERY LARGE DATABASES

• Terabytes -- 10^12 bytes: Wal-Mart -- 24 Terabytes
• Petabytes -- 10^15 bytes: Geographic Information Systems
• Exabytes -- 10^18 bytes: National Medical Records
• Zettabytes -- 10^21 bytes: Weather images
• Yottabytes -- 10^24 bytes: Intelligence Agency Videos
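
To make the scale concrete, the short sketch below converts between these decimal units (the SI name for 10^24 bytes is yottabyte). The `UNITS` table and the `to_bytes` helper are hypothetical names introduced only for this illustration; the 24-terabyte figure is the Wal-Mart example from the slide.

```python
# Decimal (SI) byte units, as powers of ten.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * UNITS[unit]


walmart_bytes = to_bytes(24, "terabyte")
terabytes_per_petabyte = UNITS["petabyte"] // UNITS["terabyte"]
print(walmart_bytes, terabytes_per_petabyte)
```

Each step up the list is a factor of 1,000, so a petabyte-scale warehouse holds roughly forty times the 24-terabyte example.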
COMPLEXITIES OF CREATING A DATA WAREHOUSE

• Incomplete errors
      • Missing Fields
      • Records or Fields That, by Design, are not Being Recorded
• Incorrect errors
      • Wrong Calculations, Aggregations
      • Duplicate Records
      • Wrong Information Entered into Source System
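
The error types listed above can be detected with simple checks before data is loaded. The following is a minimal sketch, assuming a hypothetical record layout (`id`, `quantity`, `unit_price`, `amount`) in which `amount` should equal `quantity * unit_price`; none of these names come from the slides.

```python
from collections import Counter

# Hypothetical set of fields every record is expected to carry.
REQUIRED_FIELDS = ("id", "quantity", "unit_price", "amount")


def find_incomplete(records):
    """Return (index, missing_fields) for records lacking required values."""
    problems = []
    for i, rec in enumerate(records):
        missing = [f for f in REQUIRED_FIELDS if rec.get(f) in (None, "")]
        if missing:
            problems.append((i, missing))
    return problems


def find_duplicates(records, key="id"):
    """Return the key values that appear on more than one record."""
    counts = Counter(rec.get(key) for rec in records)
    return sorted(k for k, n in counts.items() if k is not None and n > 1)


def find_wrong_amounts(records, tolerance=1e-9):
    """Return indexes of records whose amount disagrees with quantity * unit_price."""
    bad = []
    for i, rec in enumerate(records):
        q, p, a = rec.get("quantity"), rec.get("unit_price"), rec.get("amount")
        if None in (q, p, a):
            continue  # incomplete records are reported by find_incomplete
        if abs(a - q * p) > tolerance:
            bad.append(i)
    return bad


records = [
    {"id": 1, "quantity": 2, "unit_price": 5.0, "amount": 10.0},
    {"id": 2, "quantity": 1, "unit_price": 3.0, "amount": None},   # missing field
    {"id": 1, "quantity": 2, "unit_price": 5.0, "amount": 10.0},   # duplicate of id 1
    {"id": 3, "quantity": 4, "unit_price": 2.5, "amount": 12.0},   # wrong calculation
]
print(find_incomplete(records), find_duplicates(records), find_wrong_amounts(records))
```

Checks like these catch only what can be verified mechanically; wrong information entered into the source system usually needs business rules or a comparison against a second source.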
SUCCESS & FUTURE OF DATA WAREHOUSE

• The Data Warehouse has successfully supported the increased needs of the State over the past eight years.
• The need for growth continues, however, as the desire for more integrated data increases.
• The Data Warehouse has software and tools in place to provide the functionality needed to support new enterprise Data Warehouse projects.
• The future capabilities of the Data Warehouse can be expanded to include other programs and agencies.
DATA WAREHOUSE PITFALLS

• You are going to spend much time extracting, cleaning, and loading data
• You are going to find problems with systems feeding the data warehouse
• You will find the need to store/validate data not being captured/validated by any existing system
• Large-scale data warehousing can become an exercise in data homogenizing
DATA WAREHOUSE PITFALLS…

• The time it takes to load the warehouse will expand to fill the available window... and then some
• You are building a HIGH maintenance system
• You will fail if you concentrate on resource optimization to the neglect of project, data, and customer management issues and an understanding of what adds value to the customer
BEST PRACTICES

• Complete requirements and design
• Prototyping is key to business understanding
• Use proper aggregations and detailed data
• Training is an on-going process
• Build data integrity checks into your system
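
One way to combine two of these practices, keeping both detail and aggregates and building in integrity checks, is to reconcile every summary against the detail it was derived from. The sketch below assumes hypothetical detail rows with `month` and `amount` fields; the function names are illustrative only.

```python
from collections import defaultdict


def aggregate_by_month(detail_rows):
    """Roll detail rows up into a per-month summary."""
    totals = defaultdict(float)
    for row in detail_rows:
        totals[row["month"]] += row["amount"]
    return dict(totals)


def reconciles(detail_rows, summary, tolerance=1e-9):
    """Integrity check: the summary total must match the detail total."""
    detail_total = sum(row["amount"] for row in detail_rows)
    summary_total = sum(summary.values())
    return abs(detail_total - summary_total) <= tolerance


detail = [
    {"month": "2024-01", "amount": 10.0},
    {"month": "2024-01", "amount": 5.5},
    {"month": "2024-02", "amount": 4.5},
]
summary = aggregate_by_month(detail)
print(summary, reconciles(detail, summary))
```

Running the reconciliation after every load is a cheap way to notice when an aggregate has drifted from its detail, for example after a partial reload.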
Top-Down Architecture [diagram]
Bottom-Up Architecture [diagram]
Enterprise Data Mart Architecture [diagram]
Data Stage/Data Mart Architecture [diagram]

Editor's Notes

  1. Legacy data is historical data. The working information of a staff member: working hours or time-off hours within the fiscal period, up to the current date. Working Hours = Overtime, etc. Time-Off Hours = Vacation, Sick Leave, etc.
  2. DataStage (database, tool): a tool set for designing, developing, and running applications that populate one or more tables in a data warehouse.