Oracle: Data Warehouse Design
Characteristics of a Data Warehouse

  • A data warehouse is a database designed for
    querying, reporting, and analysis.
  • A data warehouse contains historical data
    derived from transaction data.
  • Data warehouses separate analysis workload
    from transaction workload.
  • A data warehouse is primarily
    an analytical tool.
Comparing OLTP and Data Warehouses
   Characteristic                 OLTP                  Data Warehouse

   Joins                          Many                  Some
   Data accessed by queries       Comparatively lower   Large amount
   Duplicated data                Normalized DBMS       Denormalized DBMS
   Derived data and aggregates    Rare                  Common
Data Warehouse Architectures

[Figure: data warehouse architecture. Labels, from source to consumer:
 operational systems and flat files; staging area; metadata, materialized
 views, and raw data; Sales, Purchasing, and Inventory; analysis, reporting,
 and data mining.]
Data Warehouse Design
• Key data warehouse design considerations:
  – Identify the specific data content.
  – Recognize the critical relationships within and
    between groups of data.
  – Define the system environment
    supporting your data warehouse.
  – Identify the required data
    transformations.
  – Calculate the frequency at which
    the data must be refreshed.
Logical Design
– A logical design is conceptual and
  abstract.
– Entity-relationship (ER) modeling
  is useful in identifying logical
  information requirements.
   • An entity represents a chunk of data.
   • The properties of entities are known as attributes.
   • The links between entities are known as
     relationships.
– Dimensional modeling is a specialized
  type of ER modeling useful in data warehouse
  design.
Oracle Warehouse Builder
– Oracle Database provides tools to implement
  the ETL process.
   • Oracle Warehouse Builder is a tool to help in this
     process.
– Oracle Warehouse Builder generates the
  following types of code:
   •   SQL data definition language (DDL) scripts
   •   PL/SQL programs
   •   SQL*Loader control files
   •   XML Process Definition Language (XPDL)
   •   ABAP code (used to extract data from SAP systems)
Data Warehousing Schemas
– Objects can be arranged in data warehousing
  schema models in a variety of ways:
   •   Star schema
   •   Snowflake schema
   •   Third normal form (3NF) schema
   •   Hybrid schemas
– The source data model and user
  requirements should steer the data
  warehouse schema.
– Implementation of the logical model may
  require changes to enable you to adapt it to
  your physical system.
Schema Characteristics
– Star schema
   • Characterized by one or more large fact tables and a
     number of much smaller dimension tables
   • Each dimension table joined to the fact table using a
     primary key to foreign key join
– Snowflake schema
   • Dimension data grouped into multiple tables instead
     of one large table
   • Increased number of dimension tables, requiring
     more foreign key joins
– Third normal form (3NF) schema
   • A classical relational-database model that minimizes
     data redundancy through normalization
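
As a rough illustration of the star schema described above, the sketch below builds one small fact table with two dimension tables and runs a star join. It uses Python's built-in sqlite3 module as a stand-in for an Oracle database. The names sales, products, customers, prod_id, cust_id, and cust_city echo the deck's own example; prod_name, amount_sold, and all sample rows are assumptions made for illustration.

```python
import sqlite3

def build_star_schema():
    """Create an in-memory star schema: one fact table, two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products  (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
        CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_city TEXT);
        -- The fact table points at each dimension through a foreign key.
        CREATE TABLE sales (
            prod_id     INTEGER REFERENCES products(prod_id),
            cust_id     INTEGER REFERENCES customers(cust_id),
            amount_sold REAL
        );
    """)
    conn.executemany("INSERT INTO products VALUES (?, ?)",
                     [(1, "Widget"), (2, "Gadget")])
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(10, "Boston"), (11, "Denver")])
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                     [(1, 10, 5.0), (1, 11, 7.0), (2, 10, 3.0), (2, 10, 4.0)])
    return conn

def sales_by_city_and_product(conn):
    """Star join: the fact table joined to both dimensions on their keys."""
    return conn.execute("""
        SELECT c.cust_city, p.prod_name, SUM(s.amount_sold)
        FROM sales s
        JOIN customers c ON c.cust_id = s.cust_id
        JOIN products  p ON p.prod_id = s.prod_id
        GROUP BY c.cust_city, p.prod_name
        ORDER BY c.cust_city, p.prod_name
    """).fetchall()

for row in sales_by_city_and_product(build_star_schema()):
    print(row)
```

A snowflake variant would split customers into further tables (for example a separate city table), adding one more foreign key join to the same query.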
Data Warehousing Objects
– Fact tables
   • Fact tables are the large tables that store business
     measurements.
– Dimension tables
   • A dimension is a structure composed of one or more
     hierarchies that categorizes data.
   • A unique identifier is specified for each distinct
     record in a dimension table.
– Relationships
   • Relationships guarantee
     integrity of business
     information.
Fact Tables
– A fact table must be defined for each star schema.
– Fact tables are the large tables that store business
  measurements.
– A fact table contains either detail-level or
  aggregated facts.
– A fact table usually contains facts with the same
  level of aggregation.
– The primary key of the fact table is
  usually a composite key made up
  of all its foreign keys.
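
The composite key mentioned above can be sketched as follows, again with Python's sqlite3 module standing in for an Oracle database. The table layout, the quantity_sold measure, and the sample values are hypothetical. The point is only that the fact table's primary key is the combination of its foreign keys, so a second row for the same combination, or a row pointing at a missing dimension record, is rejected.

```python
import sqlite3

def make_fact_table():
    """Fact table whose primary key is the combination of its foreign keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript("""
        CREATE TABLE times    (time_id INTEGER PRIMARY KEY);
        CREATE TABLE products (prod_id INTEGER PRIMARY KEY);
        CREATE TABLE sales (
            time_id       INTEGER NOT NULL REFERENCES times(time_id),
            prod_id       INTEGER NOT NULL REFERENCES products(prod_id),
            quantity_sold INTEGER,
            PRIMARY KEY (time_id, prod_id)
        );
        INSERT INTO times    VALUES (20240101);
        INSERT INTO products VALUES (1), (2);
        INSERT INTO sales    VALUES (20240101, 1, 3);
    """)
    return conn

def try_insert(conn, row):
    """Attempt to insert one fact row and report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

conn = make_fact_table()
print(try_insert(conn, (20240101, 2, 4)))   # new key combination
print(try_insert(conn, (20240101, 1, 9)))   # duplicate key combination
print(try_insert(conn, (20240101, 99, 1)))  # no such product in the dimension
```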
Dimensions and Hierarchies
– A dimension is a structure
  composed of one or more
  hierarchies that categorizes data.
– Dimensional attributes help to
  describe the dimensional value.
– Dimension data is collected at the
  lowest level of detail and aggregated
  into higher level totals.
– Hierarchies are structures that use
  ordered levels to organize data.
– In a hierarchy, each level is
  connected to the levels above and
  below it.

CUSTOMERS dimension hierarchy (by level), highest to lowest:
REGION, SUBREGION, COUNTRY, STATE, CITY, CUSTOMER
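
To make the roll-up idea concrete, here is a minimal sketch in plain Python. The level names follow the CUSTOMERS hierarchy above; the customers, places, and sales figures are invented for illustration. Data is held at the lowest level (customer) and aggregated to any higher level by grouping on the path from the top of the hierarchy down to that level.

```python
from collections import defaultdict

# Ordered levels, highest first, following the CUSTOMERS hierarchy.
LEVELS = ["region", "subregion", "country", "state", "city", "customer"]

def make_row(region, subregion, country, state, city, customer, sales):
    """Build one detail-level record (hypothetical sample data)."""
    return {"region": region, "subregion": subregion, "country": country,
            "state": state, "city": city, "customer": customer, "sales": sales}

DETAIL = [
    make_row("Americas", "Northern America", "US", "MA", "Boston", "C1", 100),
    make_row("Americas", "Northern America", "US", "MA", "Boston", "C2", 50),
    make_row("Americas", "Northern America", "US", "CO", "Denver", "C3", 30),
    make_row("Americas", "Northern America", "Canada", "ON", "Toronto", "C4", 20),
]

def roll_up(rows, level):
    """Aggregate sales to `level`, keyed by the path from the top level down to it."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[name] for name in LEVELS[:depth])
        totals[key] += row["sales"]
    return dict(totals)

print(roll_up(DETAIL, "country"))
print(roll_up(DETAIL, "region"))
```

Keying each total by the full path, rather than by the level value alone, keeps two cities with the same name in different states from being merged.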
Dimensions and Hierarchies

[Figure: star schema example. The SALES fact table (cust_id, prod_id) is
 linked by relationships to five dimension tables: PRODUCTS (unique
 identifier #prod_id), CUSTOMERS (unique identifier #cust_id; cust_last_name,
 cust_city, cust_state_province, marked as a hierarchy), TIMES, CHANNELS,
 and PROMOTIONS.]
Physical Design
[Figure: logical design mapped to physical design (tablespaces).
 Logical: entities, relationships, attributes, unique identifiers.
 Physical: tables, indexes, integrity constraints (primary key, foreign key,
 not null), materialized views, dimensions, columns.]
Data Warehouse Physical Structures

• Tables and partitioned tables
  – Partitioned tables enable you to split
    large data volumes into smaller,
    more manageable pieces.
  – Expect performance benefits from:
     • Partition pruning
     • Intelligent parallel processing
  – Compressed tables offer scale-up opportunities
    for read-only operations.
  – Table compression saves disk space.
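
Partition pruning can be pictured with a toy model in plain Python. This is a sketch of the idea only, not of how Oracle implements it: rows are stored per month, and a query restricted to a month range looks only at the partitions whose key falls in that range. The class name, the month-based key, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict

class PartitionedTable:
    """Toy range-partitioned table: rows are stored per month key such as '2024-01'."""

    def __init__(self):
        self.partitions = defaultdict(list)
        self.partitions_scanned = 0

    def insert(self, sale_date, amount):
        # The partition key is the 'YYYY-MM' prefix of an ISO date string.
        self.partitions[sale_date[:7]].append((sale_date, amount))

    def total_between(self, start_month, end_month):
        """Sum amounts for months in [start_month, end_month], skipping other partitions."""
        total = 0
        for month, rows in self.partitions.items():
            if start_month <= month <= end_month:  # pruning: decided from the key alone
                self.partitions_scanned += 1
                total += sum(amount for _, amount in rows)
        return total

table = PartitionedTable()
for date, amount in [("2024-01-15", 10), ("2024-02-03", 20),
                     ("2024-02-20", 5), ("2024-03-01", 40)]:
    table.insert(date, amount)

print(table.total_between("2024-02", "2024-02"), table.partitions_scanned)
```

Only one of the three partitions is read for the February query; the rows in the other two are never touched.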
Data Warehouse Physical Structures

  – Views:
     • Are tailored presentations of data contained in one
       or more tables or views
     • Do not require any space in the database
  – Materialized views:
     • Are query results that have been stored in advance
     • (Like indexes) are used transparently and improve
       performance
  – Integrity constraints:
     • Are used in data warehouses for query rewrite
  – Dimensions:
     • Are containers of logical relationships and do not
       require any space in the database
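
The difference between answering from stored results and scanning detail rows can be sketched in plain Python. The class below is a hypothetical illustration, not Oracle's mechanism: it precomputes totals per product and answers from that stored summary when one exists, which is loosely what a materialized view with query rewrite does on behalf of the user.

```python
class SalesStore:
    """Detail rows plus an optional precomputed summary (a stand-in for a materialized view)."""

    def __init__(self, rows):
        self.rows = list(rows)  # detail rows: (product, amount)
        self.summary = None     # totals per product, filled in by refresh_summary()

    def refresh_summary(self):
        """Compute the aggregate in advance and store it."""
        totals = {}
        for product, amount in self.rows:
            totals[product] = totals.get(product, 0) + amount
        self.summary = totals

    def total_for(self, product):
        """Answer from the stored summary when one exists; otherwise scan the detail rows."""
        if self.summary is not None:
            return self.summary.get(product, 0), "summary"
        return sum(a for p, a in self.rows if p == product), "detail scan"

store = SalesStore([("A", 1), ("B", 2), ("A", 3)])
print(store.total_for("A"))  # computed by scanning the detail rows
store.refresh_summary()
print(store.total_for("A"))  # same answer, read from the stored summary
```

The caller uses the same total_for call in both cases, which is the sense in which the stored result is used transparently. The summary must be refreshed whenever the detail rows change.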
Managing Large Volumes of Data
• Work smarter in your data warehouse:
  –   Partitioning
  –   Bitmap indexes/Star transformation
  –   Data compression
  –   Query rewrite
• Work harder in your data warehouse:
  – Parallelism for all operations
       • DBA tasks, such as loading, index creation, table
         creation, data modification, backup and recovery
       • End-user operations, such as queries
       • Unbounded scalability: Real Application Clusters
I/O Performance in Data Warehouses

  – I/O is typically the primary determinant of data
    warehouse performance.
  – Data warehouse storage configurations should be
    chosen by I/O bandwidth, not storage capacity.
  – Every component of the I/O
    subsystem should provide
    enough bandwidth:
     • Disks
     • I/O channels
     • I/O adapters
  – In data warehouses, maximizing
    sequential I/O throughput is critical.
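
One way to see why every component matters is a back-of-the-envelope calculation: the throughput of the whole I/O path is bounded by its slowest component. The sketch below assumes hypothetical bandwidth figures in MB/s; none of the numbers come from the deck.

```python
def effective_throughput_mb_s(component_bandwidths):
    """The slowest component bounds the throughput of the whole I/O path."""
    return min(component_bandwidths.values())

def full_scan_seconds(data_mb, component_bandwidths):
    """Best-case time to read data_mb sequentially through the given I/O path."""
    return data_mb / effective_throughput_mb_s(component_bandwidths)

# Hypothetical aggregate bandwidths in MB/s for each part of the path.
io_path = {"disks": 2400, "io_channels": 1600, "io_adapters": 800}

print(effective_throughput_mb_s(io_path))     # limited by the adapters
print(full_scan_seconds(1_000_000, io_path))  # seconds to scan 1,000,000 MB
```

In this example, adding more disks would not shorten the scan at all, because the adapters are the bottleneck.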
I/O Scalability
Parallel execution:
     – Reduces response time for data-intensive operations on large
       databases
     – Benefits systems with the following characteristics:
          • Multiprocessors, clusters, or massively parallel systems
          • Sufficient I/O bandwidth
          • Sufficient memory to support memory-intensive processes such
            as sorts, hashing, and I/O buffers

[Figure: parallel execution. A coordinator dispatches work to query servers.
 Four scanners read data on disk in parallel and feed four sorters
 (aggregators), Q1 through Q4.]
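
The coordinator and query-server pattern can be sketched with Python's standard concurrent.futures module. This is a simplified illustration under stated assumptions: each worker both scans and aggregates its own chunk, the function names and sample rows are invented, and threads are used only to show the structure (in CPython they do not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor

def scan_and_aggregate(chunk):
    """Work done by one query server: scan its chunk and build a partial aggregate."""
    partial = {}
    for key, value in chunk:
        partial[key] = partial.get(key, 0) + value
    return partial

def parallel_sum_by_key(rows, degree=4):
    """Coordinator: split the rows, dispatch chunks to workers, merge the partial results."""
    chunks = [rows[i::degree] for i in range(degree)]
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partials = list(pool.map(scan_and_aggregate, chunks))
    merged = {}
    for partial in partials:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return dict(sorted(merged.items()))

rows = [("east", 1), ("west", 2), ("east", 3), ("north", 4), ("west", 5)]
print(parallel_sum_by_key(rows))
```

The merge step at the end is the coordinator's share of the work; it stays small as long as the number of distinct keys is small compared with the number of rows.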
I/O Scalability

• Automatic Storage Management (ASM)
  – Configuring storage for a DB depends on many
    variables:
     •   Which data to put on which disk
     •   Logical unit number (LUN) configurations
     •   DB types and workloads: data warehouse, OLTP, DSS
     •   Trade-offs between available options
  – ASM provides solutions to storage issues
    encountered in data warehouses.
I/O Scalability

• Automatic Storage Management: Overview
  – Portable and high-performance
    cluster file system
  – Manages Oracle database files
  – Data spread across disks
    to balance load
  – Integrated mirroring across
    disks
  – Solves many storage
    management challenges

[Figure: storage stack. The application sits on the database; beneath the
 database, ASM appears alongside the file system and volume manager, all
 above the operating system.]
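
The two ideas of spreading data across disks and mirroring it can be illustrated with a toy placement function. This is not ASM's actual allocation algorithm; the round-robin rule, the function names, and the disk names are assumptions chosen only to show balanced primaries with each mirror copy on a different disk.

```python
def place_extents(num_extents, disks):
    """Assign each extent a primary disk round-robin and a mirror on the next disk."""
    if len(disks) < 2:
        raise ValueError("mirroring needs at least two disks")
    layout = []
    for extent in range(num_extents):
        primary = disks[extent % len(disks)]
        mirror = disks[(extent + 1) % len(disks)]  # always differs from the primary
        layout.append((extent, primary, mirror))
    return layout

def primary_load(layout):
    """Count how many primary extents landed on each disk."""
    counts = {}
    for _, primary, _ in layout:
        counts[primary] = counts.get(primary, 0) + 1
    return counts

layout = place_extents(6, ["d1", "d2", "d3"])
print(primary_load(layout))
```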
Visit more self help tutorials

• Pick a tutorial of your choice and browse
  through it at your own pace.
• The tutorials section is free, self-guiding and
  will not involve any additional support.
• Visit us at www.dataminingtools.net

More Related Content

What's hot

Oracle Data Warehouse
Oracle Data WarehouseOracle Data Warehouse
Oracle Data Warehouse
DataminingTools Inc
 
Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouse
Uday Kothari
 
Data ware house architecture
Data ware house architectureData ware house architecture
Data ware house architecture
Deepak Chaurasia
 
Gic2011 aula3-ingles
Gic2011 aula3-inglesGic2011 aula3-ingles
Gic2011 aula3-ingles
Marielba-Mayeya Zacarias
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schema
Sayed Ahmed
 
Data warehouse system and its concepts
Data warehouse system and its conceptsData warehouse system and its concepts
Data warehouse system and its concepts
Gaurav Garg
 
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSINGSKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
Skillwise Group
 
Data warehousing
Data warehousingData warehousing
Data warehousing
Shifali Goyal
 
Tuning data warehouse
Tuning data warehouseTuning data warehouse
Tuning data warehouse
Srinivasan R
 
OLAP
OLAPOLAP
OLAP
Ashir Ali
 
Chapter 5 data resource management
Chapter 5  data resource managementChapter 5  data resource management
Chapter 5 data resource management
Advance Saraswati Prakashan Pvt Ltd
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasir
yasir873
 
OLAP
OLAPOLAP
Oracle: Fundamental Of DW
Oracle: Fundamental Of DWOracle: Fundamental Of DW
Oracle: Fundamental Of DW
DataminingTools Inc
 
Multidimensional Database Design & Architecture
Multidimensional Database Design & ArchitectureMultidimensional Database Design & Architecture
Multidimensional Database Design & Architecture
hasanshan
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
Krish_ver2
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
Mr. Fmhyudin
 
Datawarehouse and OLAP
Datawarehouse and OLAPDatawarehouse and OLAP
Datawarehouse and OLAP
SAS SNDP YOGAM COLLEGE,KONNI
 
Data Warehousing - in the real world
Data Warehousing - in the real worldData Warehousing - in the real world
Data Warehousing - in the real world
ukc4
 
IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.
IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.
IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.
George Joseph
 

What's hot (20)

Oracle Data Warehouse
Oracle Data WarehouseOracle Data Warehouse
Oracle Data Warehouse
 
Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouse
 
Data ware house architecture
Data ware house architectureData ware house architecture
Data ware house architecture
 
Gic2011 aula3-ingles
Gic2011 aula3-inglesGic2011 aula3-ingles
Gic2011 aula3-ingles
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schema
 
Data warehouse system and its concepts
Data warehouse system and its conceptsData warehouse system and its concepts
Data warehouse system and its concepts
 
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSINGSKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
SKILLWISE-SSIS DESIGN PATTERN FOR DATA WAREHOUSING
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Tuning data warehouse
Tuning data warehouseTuning data warehouse
Tuning data warehouse
 
OLAP
OLAPOLAP
OLAP
 
Chapter 5 data resource management
Chapter 5  data resource managementChapter 5  data resource management
Chapter 5 data resource management
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasir
 
OLAP
OLAPOLAP
OLAP
 
Oracle: Fundamental Of DW
Oracle: Fundamental Of DWOracle: Fundamental Of DW
Oracle: Fundamental Of DW
 
Multidimensional Database Design & Architecture
Multidimensional Database Design & ArchitectureMultidimensional Database Design & Architecture
Multidimensional Database Design & Architecture
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
 
Datawarehouse and OLAP
Datawarehouse and OLAPDatawarehouse and OLAP
Datawarehouse and OLAP
 
Data Warehousing - in the real world
Data Warehousing - in the real worldData Warehousing - in the real world
Data Warehousing - in the real world
 
IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.
IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.
IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT.SAP HANA DATABASE.
 

Viewers also liked

Oracle: PLSQL Introduction
Oracle: PLSQL IntroductionOracle: PLSQL Introduction
Oracle: PLSQL Introduction
oracle content
 
Oracle: Procedures
Oracle: ProceduresOracle: Procedures
Oracle: Procedures
oracle content
 
Oracle: Commands
Oracle: CommandsOracle: Commands
Oracle: Commands
oracle content
 
Oracle : DML
Oracle : DMLOracle : DML
Oracle : DML
oracle content
 
Oracle: Joins
Oracle: JoinsOracle: Joins
Oracle: Joins
oracle content
 
Dml and ddl
Dml and ddlDml and ddl

Viewers also liked (6)

Oracle: PLSQL Introduction
Oracle: PLSQL IntroductionOracle: PLSQL Introduction
Oracle: PLSQL Introduction
 
Oracle: Procedures
Oracle: ProceduresOracle: Procedures
Oracle: Procedures
 
Oracle: Commands
Oracle: CommandsOracle: Commands
Oracle: Commands
 
Oracle : DML
Oracle : DMLOracle : DML
Oracle : DML
 
Oracle: Joins
Oracle: JoinsOracle: Joins
Oracle: Joins
 
Dml and ddl
Dml and ddlDml and ddl
Dml and ddl
 

Similar to Oracle: Dw Design

Oracle: Fundamental Of Dw
Oracle: Fundamental Of DwOracle: Fundamental Of Dw
Oracle: Fundamental Of Dw
oracle content
 
DBMS
DBMSDBMS
Management information system database management
Management information system database managementManagement information system database management
Management information system database management
Online
 
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Michael Rys
 
data warehousing
data warehousingdata warehousing
data warehousing
Tirath Mulani
 
Lecture3.ppt
Lecture3.pptLecture3.ppt
Lecture3.ppt
ShaimaaMohamedGalal
 
Data Science Machine Lerning Bigdat.pptx
Data Science Machine Lerning Bigdat.pptxData Science Machine Lerning Bigdat.pptx
Data Science Machine Lerning Bigdat.pptx
Priyadarshini648418
 
The BI Sandbox
The BI SandboxThe BI Sandbox
The BI Sandbox
Craig Jordan
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27
Martin Bém
 
Business Intelligence Data Analytics June 28 2012 Icpas V4 Final 20120625 8am
Business Intelligence  Data Analytics June 28 2012 Icpas V4  Final 20120625 8amBusiness Intelligence  Data Analytics June 28 2012 Icpas V4  Final 20120625 8am
Business Intelligence Data Analytics June 28 2012 Icpas V4 Final 20120625 8am
Barrett Peterson
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
Amazon Web Services
 
Datawarehousing & DSS
Datawarehousing & DSSDatawarehousing & DSS
Datawarehousing & DSS
Deepali Raut
 
Processing Big Data
Processing Big DataProcessing Big Data
Processing Big Data
cwensel
 
(Dbms) class 1 & 2 (Presentation)
(Dbms) class 1 & 2 (Presentation)(Dbms) class 1 & 2 (Presentation)
(Dbms) class 1 & 2 (Presentation)
Dr. Mazin Mohamed alkathiri
 
Oracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big DataOracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big Data
Abishek V S
 
Computing 7
Computing 7Computing 7
Computing 7
sufyanmaqsood
 
BI Chapter 03.pdf business business business business business business
BI Chapter 03.pdf business business business business business businessBI Chapter 03.pdf business business business business business business
BI Chapter 03.pdf business business business business business business
JawaherAlbaddawi
 
Unit 3 part i Data mining
Unit 3 part i Data miningUnit 3 part i Data mining
Unit 3 part i Data mining
Dhilsath Fathima
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
Amazon Web Services
 
Spark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.comSpark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.com
Syed Hadoop
 

Similar to Oracle: Dw Design (20)

Oracle: Fundamental Of Dw
Oracle: Fundamental Of DwOracle: Fundamental Of Dw
Oracle: Fundamental Of Dw
 
DBMS
DBMSDBMS
DBMS
 
Management information system database management
Management information system database managementManagement information system database management
Management information system database management
 
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
 
data warehousing
data warehousingdata warehousing
data warehousing
 
Lecture3.ppt
Lecture3.pptLecture3.ppt
Lecture3.ppt
 
Data Science Machine Lerning Bigdat.pptx
Data Science Machine Lerning Bigdat.pptxData Science Machine Lerning Bigdat.pptx
Data Science Machine Lerning Bigdat.pptx
 
The BI Sandbox
The BI SandboxThe BI Sandbox
The BI Sandbox
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27
 
Business Intelligence Data Analytics June 28 2012 Icpas V4 Final 20120625 8am
Business Intelligence  Data Analytics June 28 2012 Icpas V4  Final 20120625 8amBusiness Intelligence  Data Analytics June 28 2012 Icpas V4  Final 20120625 8am
Business Intelligence Data Analytics June 28 2012 Icpas V4 Final 20120625 8am
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Datawarehousing & DSS
Datawarehousing & DSSDatawarehousing & DSS
Datawarehousing & DSS
 
Processing Big Data
Processing Big DataProcessing Big Data
Processing Big Data
 
(Dbms) class 1 & 2 (Presentation)
(Dbms) class 1 & 2 (Presentation)(Dbms) class 1 & 2 (Presentation)
(Dbms) class 1 & 2 (Presentation)
 
Oracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big DataOracle Database 12c - Features for Big Data
Oracle Database 12c - Features for Big Data
 
Computing 7
Computing 7Computing 7
Computing 7
 
BI Chapter 03.pdf business business business business business business
BI Chapter 03.pdf business business business business business businessBI Chapter 03.pdf business business business business business business
BI Chapter 03.pdf business business business business business business
 
Unit 3 part i Data mining
Unit 3 part i Data miningUnit 3 part i Data mining
Unit 3 part i Data mining
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Spark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.comSpark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.com
 

More from oracle content

Oracle: Programs
Oracle: ProgramsOracle: Programs
Oracle: Programs
oracle content
 
Oracle:Cursors
Oracle:CursorsOracle:Cursors
Oracle:Cursors
oracle content
 
Oracle: Control Structures
Oracle:  Control StructuresOracle:  Control Structures
Oracle: Control Structures
oracle content
 
Oracle: Basic SQL
Oracle: Basic SQLOracle: Basic SQL
Oracle: Basic SQL
oracle content
 
Oracle Warehouse
Oracle WarehouseOracle Warehouse
Oracle Warehouse
oracle content
 
Oracle: Functions
Oracle: FunctionsOracle: Functions
Oracle: Functions
oracle content
 
Oracle: New Plsql
Oracle: New PlsqlOracle: New Plsql
Oracle: New Plsql
oracle content
 

More from oracle content (7)

Oracle: Programs
Oracle: ProgramsOracle: Programs
Oracle: Programs
 
Oracle:Cursors
Oracle:CursorsOracle:Cursors
Oracle:Cursors
 
Oracle: Control Structures
Oracle:  Control StructuresOracle:  Control Structures
Oracle: Control Structures
 
Oracle: Basic SQL
Oracle: Basic SQLOracle: Basic SQL
Oracle: Basic SQL
 
Oracle Warehouse
Oracle WarehouseOracle Warehouse
Oracle Warehouse
 
Oracle: Functions
Oracle: FunctionsOracle: Functions
Oracle: Functions
 
Oracle: New Plsql
Oracle: New PlsqlOracle: New Plsql
Oracle: New Plsql
 

Recently uploaded

Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
Zilliz
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 

Recently uploaded (20)

Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation

Oracle: Dw Design

  • 2. Characteristics of a Data Warehouse A data warehouse is a database designed for querying, reporting, and analysis. A data warehouse contains historical data derived from transaction data. Data warehouses separate analysis workload from transaction workload. A data warehouse is primarily an analytical tool.
  • 3. Comparing OLTP and Data Warehouses

                                   OLTP                  Data Warehouse
      Joins                        Many                  Some
      Data accessed by queries     Comparatively lower   Large amount
      Duplicated data              Normalized DBMS       Denormalized DBMS
      Derived data and aggregates  Rare                  Common

  • 4. Data Warehouse Architectures [Diagram: operational systems and flat files load into a staging area; the warehouse holds metadata, materialized views, and raw data; Sales, Purchasing, and Inventory data marts serve analysis, reporting, and data mining.]
  • 5. Data Warehouse Design • Key data warehouse design considerations: – Identify the specific data content. – Recognize the critical relationships within and between groups of data. – Define the system environment supporting your data warehouse. – Identify the required data transformations. – Calculate the frequency at which the data must be refreshed.
  • 6. Logical Design – A logical design is conceptual and abstract. – Entity-relationship (ER) modeling is useful in identifying logical information requirements. • An entity represents a chunk of data. • The properties of entities are known as attributes. • The links between entities and attributes are known as relationships. – Dimensional modeling is a specialized type of ER modeling useful in data warehouse design.
  • 7. Oracle Warehouse Builder – Oracle Database provides tools to implement the ETL process. • Oracle Warehouse Builder is a tool to help in this process. – Oracle Warehouse Builder generates the following types of code: • SQL data definition language (DDL) scripts • PL/SQL programs • SQL*Loader control files • XML Processing Description Language (XPDL) • ABAP code (used to extract data from SAP systems)
  • 8. Data Warehousing Schemas – Objects can be arranged in data warehousing schema models in a variety of ways: • Star schema • Snowflake schema • Third normal form (3NF) schema • Hybrid schemas – The source data model and user requirements should steer the data warehouse schema. – Implementation of the logical model may require changes to enable you to adapt it to your physical system.
  • 9. Schema Characteristics – Star schema • Characterized by one or more large fact tables and a number of much smaller dimension tables • Each dimension table joined to the fact table using a primary key to foreign key join – Snowflake schema • Dimension data grouped into multiple tables instead of one large table • Increased number of dimension tables, requiring more foreign key joins – Third normal form (3NF) schema • A classical relational-database model that minimizes data redundancy through normalization
  • 10. Data Warehousing Objects – Fact tables • Fact tables are the large tables that store business measurements. – Dimension tables • A dimension is a structure composed of one or more hierarchies that categorizes data. • Unique identifiers are specified for one distinct record in a dimension table. – Relationships • Relationships guarantee integrity of business information.
  • 11. Fact Tables – A fact table must be defined for each star schema. – Fact tables are the large tables that store business measurements. – A fact table contains either detail-level or aggregated facts. – A fact table usually contains facts with the same level of aggregation. – The primary key of the fact table is usually a composite key made up of all its foreign keys.
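The fact table and star schema points above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration rather than Oracle DDL: the table names echo the SALES example in the slides, while the amount_sold and prod_name columns and all rows are assumptions made up for the example.

```python
import sqlite3

# Illustrative star schema. Table names follow the slides; the rows and the
# amount_sold / prod_name columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_city TEXT);
    CREATE TABLE products  (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
    CREATE TABLE sales (
        cust_id     INTEGER NOT NULL REFERENCES customers (cust_id),
        prod_id     INTEGER NOT NULL REFERENCES products (prod_id),
        amount_sold REAL    NOT NULL,
        -- composite primary key made up of the foreign keys
        PRIMARY KEY (cust_id, prod_id)
    );
    INSERT INTO customers VALUES (1, 'Boston'), (2, 'Austin');
    INSERT INTO products  VALUES (10, 'Widget'), (20, 'Gadget');
    INSERT INTO sales VALUES (1, 10, 5.0), (1, 20, 7.5), (2, 10, 2.5);
""")


def sales_by_city(connection):
    """Join the fact table to one dimension table and sum the measure."""
    rows = connection.execute("""
        SELECT c.cust_city, SUM(s.amount_sold)
        FROM sales s
        JOIN customers c ON c.cust_id = s.cust_id
        GROUP BY c.cust_city
        ORDER BY c.cust_city
    """).fetchall()
    return dict(rows)


city_totals = sales_by_city(conn)
print(city_totals)
```

The query touches the large fact table once and reaches the descriptive attribute (cust_city) through a single primary key to foreign key join, which is the access pattern a star schema is meant to keep simple.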
  • 12. Dimensions and Hierarchies – A dimension is a structure composed of one or more hierarchies that categorizes data. – Dimensional attributes help to describe the dimensional value. – Dimension data is collected at the lowest level of detail and aggregated into higher level totals. – Hierarchies are structures that use ordered levels to organize data. – In a hierarchy, each level is connected to the levels above and below it. [Diagram: CUSTOMERS dimension hierarchy (by level): REGION, SUBREGION, COUNTRY, STATE, CITY, CUSTOMER.]
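As a rough illustration of collecting data at the lowest level and aggregating it into higher level totals, the sketch below rolls detail rows up the CUSTOMERS hierarchy. The level names come from the slide; the rows, the amounts, and the roll_up helper are hypothetical.

```python
from collections import defaultdict

# Levels of the CUSTOMERS hierarchy from the slide, highest to lowest.
LEVELS = ["region", "subregion", "country", "state", "city", "customer"]

# Hypothetical detail rows: one per customer, carrying every ancestor level
# and a made-up sales amount.
DETAIL_ROWS = [
    {"region": "Americas", "subregion": "North America", "country": "US",
     "state": "MA", "city": "Boston", "customer": "C1", "amount": 100},
    {"region": "Americas", "subregion": "North America", "country": "US",
     "state": "MA", "city": "Boston", "customer": "C2", "amount": 50},
    {"region": "Americas", "subregion": "North America", "country": "US",
     "state": "TX", "city": "Austin", "customer": "C3", "amount": 25},
    {"region": "Europe", "subregion": "Western Europe", "country": "France",
     "state": "Ile-de-France", "city": "Paris", "customer": "C4", "amount": 40},
]


def roll_up(rows, level):
    """Aggregate detail rows to the given level of the hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for row in rows:
        # The key is the path from the top level down to the requested level.
        key = tuple(row[name] for name in LEVELS[:depth])
        totals[key] += row["amount"]
    return dict(totals)


by_state = roll_up(DETAIL_ROWS, "state")
by_region = roll_up(DETAIL_ROWS, "region")
print(by_region)
```

Because each level is connected to the one above it, the same detail rows can answer questions at any level without being stored more than once.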
  • 13. Dimensions and Hierarchies [Diagram: star schema. The SALES fact table (cust_id, prod_id) is linked by relationships to the dimension tables PRODUCTS (unique identifier #prod_id), CUSTOMERS (unique identifier #cust_id; cust_last_name; hierarchy over cust_city and cust_state_province), TIMES, CHANNELS, and PROMOTIONS.]
  • 14. Physical Design [Diagram: mapping from logical to physical design. Logical: entities, relationships, attributes, unique identifiers. Physical (tablespaces): tables, indexes, materialized views, integrity constraints (primary key, foreign key, not null), dimensions, columns.]
  • 15. Data Warehouse Physical Structures • Tables and partitioned tables – Partitioned tables enable you to split large data volumes into smaller, more manageable pieces. – Expect performance benefits from: • Partition pruning • Intelligent parallel processing – Compressed tables offer scaleup opportunities for read-only operations. – Table compression saves disk space.
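Partition pruning can be pictured with a toy in-memory table. The sketch below is not how Oracle implements partitioning; it assumes a hypothetical table partitioned by month and only shows that a query restricted to one month reads one partition instead of every row.

```python
from collections import defaultdict


class MonthPartitionedTable:
    """Toy table partitioned by the 'YYYY-MM' prefix of an ISO date string."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, sale_date, amount):
        # Route each row to the partition for its month.
        self.partitions[sale_date[:7]].append((sale_date, amount))

    def total_for_month(self, month):
        """Scan only the matching partition; also report rows read."""
        rows = self.partitions.get(month, [])
        return sum(amount for _, amount in rows), len(rows)


table = MonthPartitionedTable()
for sale_date, amount in [
    ("2024-01-05", 10),
    ("2024-01-20", 15),
    ("2024-02-03", 7),
    ("2024-03-11", 4),
]:
    table.insert(sale_date, amount)

january_total, rows_scanned = table.total_for_month("2024-01")
print(january_total, rows_scanned)
```

Here the January query reads two of the four rows; on a real fact table the skipped partitions are where the performance benefit comes from.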
  • 16. Data Warehouse Physical Structures – Views: • Are tailored presentations of data contained in one or more tables or views • Do not require any space in the database – Materialized views: • Are query results that have been stored in advance • (Like indexes) are used transparently and improve performance – Integrity constraints: • Are used in data warehouses for query rewrite – Dimensions: • Are containers of logical relationships and do not require any space in the database
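The difference between a view and a stored query result can be sketched with sqlite3. SQLite has no materialized views, so this example emulates one with an ordinary summary table and a hypothetical refresh function; Oracle's materialized views and query rewrite do considerably more than this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (prod_id INTEGER, amount_sold REAL);
    INSERT INTO sales VALUES (10, 5.0), (10, 2.5), (20, 7.5);

    -- A view stores only the query text, so it takes no data space.
    CREATE VIEW sales_by_product_v AS
        SELECT prod_id, SUM(amount_sold) AS total_sold
        FROM sales GROUP BY prod_id;

    -- Stand-in for a materialized view: the result is stored in advance.
    CREATE TABLE sales_by_product_mv AS
        SELECT prod_id, SUM(amount_sold) AS total_sold
        FROM sales GROUP BY prod_id;
""")


def totals(connection, relation):
    """Read totals from one of the two fixed relations defined above."""
    query = f"SELECT prod_id, total_sold FROM {relation} ORDER BY prod_id"
    return dict(connection.execute(query).fetchall())


def refresh(connection):
    """Recompute the stored summary from the detail table."""
    connection.executescript("""
        DELETE FROM sales_by_product_mv;
        INSERT INTO sales_by_product_mv
            SELECT prod_id, SUM(amount_sold) FROM sales GROUP BY prod_id;
    """)


conn.execute("INSERT INTO sales VALUES (20, 2.5)")
view_totals = totals(conn, "sales_by_product_v")    # recomputed on every read
stale_totals = totals(conn, "sales_by_product_mv")  # stored result, now stale
refresh(conn)
fresh_totals = totals(conn, "sales_by_product_mv")
```

The stored summary answers the query without re-aggregating the detail rows, at the cost of needing a refresh when the underlying data changes.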
  • 17. Managing Large Volumes of Data • Work smarter in your data warehouse: – Partitioning – Bitmap indexes/Star transformation – Data compression – Query rewrite • Work harder in your data warehouse: – Parallelism for all operations • DBA tasks, such as loading, index creation, table creation, data modification, backup and recovery • End-user operations, such as queries • Unbounded scalability: Real Application Clusters
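To give a feel for why bitmap indexes suit warehouse queries, the sketch below stores one bitmap per distinct column value and combines predicates with a bitwise AND. It is a simplified model that uses Python integers as bitmaps; the column values are invented, and real bitmap indexes are compressed and maintained by the database.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i has it."""
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


def matching_rows(bitmap):
    """Return the row ids whose bits are set in the bitmap."""
    row_ids = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            row_ids.append(row_id)
        bitmap >>= 1
        row_id += 1
    return row_ids


# Hypothetical low-cardinality columns of a five-row fact table.
channel = ["web", "store", "web", "web", "store"]
promo = ["none", "none", "spring", "none", "spring"]

channel_index = build_bitmap_index(channel)
promo_index = build_bitmap_index(promo)

# Rows where channel = 'web' AND promo = 'none', found without scanning rows.
web_no_promo = matching_rows(channel_index["web"] & promo_index["none"])
print(web_no_promo)
```

Combining several such bitmaps before touching the fact table is the basic idea that star transformation builds on.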
  • 18. I/O Performance in Data Warehouses – I/O is typically the primary determinant of data warehouse performance. – Data warehouse storage configurations should be chosen by I/O bandwidth, not storage capacity. – Every component of the I/O subsystem should provide enough bandwidth: • Disks • I/O channels • I/O adapters – In data warehouses, maximizing sequential I/O throughput is critical.
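The bandwidth argument above comes down to simple arithmetic: the slowest component caps throughput, and scan time is data volume divided by that throughput. The figures below are invented for illustration and are not measurements of any real system.

```python
def effective_bandwidth_mb_s(disks_mb_s, channels_mb_s, adapters_mb_s):
    """The I/O path is only as fast as its slowest component."""
    return min(disks_mb_s, channels_mb_s, adapters_mb_s)


def full_scan_seconds(table_size_gb, bandwidth_mb_s):
    """Time to read a table sequentially at the given sustained bandwidth."""
    return table_size_gb * 1024 / bandwidth_mb_s


# Hypothetical aggregate throughput of each component, in MB/s.
bandwidth = effective_bandwidth_mb_s(
    disks_mb_s=2000, channels_mb_s=800, adapters_mb_s=1600
)
scan_time = full_scan_seconds(table_size_gb=500, bandwidth_mb_s=bandwidth)
print(bandwidth, scan_time)
```

In this made-up configuration the I/O channels are the bottleneck, so adding disks would add capacity without shortening the scan.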
  • 19. I/O Scalability Parallel execution: – Reduces response time for data-intensive operations on large databases – Benefits systems with the following characteristics: • Multiprocessors, clusters, or massively parallel systems • Sufficient I/O bandwidth • Sufficient memory to support memory-intensive processes such as sorts, hashing, and I/O buffers [Diagram: a coordinator dispatches work to query servers Q1 to Q4; scanners read the data on disk and pass rows to sorters (aggregators).]
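The coordinator and query server pattern can be sketched with Python's concurrent.futures. This is only a structural illustration under simplifying assumptions: the function names are hypothetical, the rows are made up, and threads in CPython show the dispatch and merge steps rather than a real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def scan_and_aggregate(chunk):
    """One query server: scan its chunk of rows and pre-aggregate by key."""
    partial = Counter()
    for key, amount in chunk:
        partial[key] += amount
    return partial


def parallel_sum_by_key(rows, degree_of_parallelism=4):
    """Coordinator: split the rows, dispatch the chunks, merge the partials."""
    chunks = [rows[i::degree_of_parallelism] for i in range(degree_of_parallelism)]
    merged = Counter()
    with ThreadPoolExecutor(max_workers=degree_of_parallelism) as pool:
        for partial in pool.map(scan_and_aggregate, chunks):
            merged.update(partial)  # Counter.update adds the partial sums
    return dict(merged)


# Hypothetical (channel, amount) rows standing in for data on disk.
rows = [("web", 10), ("store", 5), ("web", 7), ("store", 1), ("web", 3)]
totals_by_channel = parallel_sum_by_key(rows)
print(totals_by_channel)
```

Each worker returns a small partial result, so the coordinator merges a handful of counters instead of handling every row itself.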
  • 20. I/O Scalability • Automatic Storage Management (ASM) – Configuring storage for a DB depends on many variables: • Which data to put on which disk • Logical unit number (LUN) configurations • DB types and workloads; data warehouse, OLTP, DSS • Trade-offs between available options – ASM provides solutions to storage issues encountered in data warehouses.
  • 21. I/O Scalability • Automatic Storage Management: Overview – Portable and high-performance cluster file system – Manages Oracle database files – Data spread across disks to balance load – Integrated mirroring across disks – Solves many storage management challenges [Diagram: storage stack of application, database, file system, volume manager, and operating system, with ASM shown at the file system and volume manager layers.]