SlideShare a Scribd company logo
1 of 49
Data Warehousing
Databases
 Databases are developed on the IDEA that
DATA is one of the critical materials of the
Information Age
 Information, which is created by data,
becomes the bases for decision making
Decision Support Systems
 Created to facilitate the decision making
process
 So much information that it is difficult to
extract it all from a traditional database
 Need for a more comprehensive data
storage facility
– Data Warehouse
Decision Support Systems
 Extract Information from data to use as the basis
for decision making
 Used at all levels of the Organization
 Tailored to specific business areas
 Interactive
 Ad Hoc queries to retrieve and display information
 Combines historical operation data with business
activities
4 Components of DSS
 Data Store – The DSS Database
– Business Data
– Business Model Data
– Internal and External Data
 Data Extraction and Filtering
– Extract and validate data from the operational
database and the external data sources
4 Components of DSS
 End-User Query Tool
– Create Queries that access either the
Operational or the DSS database
 End User Presentation Tools
– Organize and Present the Data
Differences with DSS
 Operational
– Stored in Normalized Relational Database
– Support transactions that represent daily
operations (Not Query Friendly)
 3 Main Differences
– Time Span
– Granularity
– Dimensionality
Time Span
 Operational
– Real Time
– Current Transactions
– Short Time Frame
– Specific Data Facts
 DSS
– Historic
– Long Time Frame (Months/Quarters/Years)
– Patterns
Granularity
 Operational
– Specific Transactions that occur at a given time
 DSS
– Shown at different levels of aggregation
– Different Summary Levels
– Decompose (drill down)
– Summarize (roll up)
Dimensionality
 Most distinguishing characteristic of DSS
data
 Operational
– Represents atomic transactions
 DSS
– Data is related in Many ways
– Develop the larger picture
– Multi-dimensional view of data
DSS Database Requirements
 DSS Database Scheme
– Support Complex and Non-Normalized data
 Summarized and Aggregate data
 Multiple Relationships
 Queries must extract multi-dimensional time slices
 Redundant Data
DSS Database Requirements
 Data Extraction and Filtering
– DSS databases are created mainly by extracting data
from operational databases combined with data
imported from external source
 Need for advanced data extraction & filtering tools
 Allow batch / scheduled data extraction
 Support different types of data sources
 Check for inconsistent data / data validation rules
 Support advanced data integration / data formatting conflicts
DSS Database Requirements
 End User Analytical Interface
– Must support advanced data modeling and data
presentation tools
– Data analysis tools
– Query generation
– Must Allow the User to Navigate through the DSS
 Size Requirements
– VERY Large – Terabytes
– Advanced Hardware (Multiple processors, multiple disk
arrays, etc.)
Data Warehouse
 DSS – friendly data repository for the DSS is
the DATA WAREHOUSE
 Definition: Integrated, Subject-Oriented,
Time-Variant, Nonvolatile database that
provides support for decision making
Integrated
 The data warehouse is a centralized,
consolidated database that integrated data
derived from the entire organization
– Multiple Sources
– Diverse Sources
– Diverse Formats
Subject-Oriented
 Data is arranged and optimized to provide
answer to questions from diverse functional
areas
– Data is organized and summarized by topic
 Sales / Marketing / Finance / Distribution / Etc.
Time-Variant
 The Data Warehouse represents the flow of
data through time
 Can contain projected data from statistical
models
 Data is periodically uploaded then time-
dependent data is recomputed
Nonvolatile
 Once data is entered it is NEVER removed
 Represents the company’s entire history
– Near term history is continually added to it
– Always growing
– Must support terabyte databases and
multiprocessors
 Read-Only database for data analysis and
query processing
Data Marts
 Small Data Stores
 More manageable data sets
 Targeted to meet the needs of small groups
within the organization
 Small, Single-Subject data warehouse
subset that provides decision support to a
small group of people
OLAP
 Online Analytical Processing Tools
 DSS tools that use multidimensional data
analysis techniques
– Support for a DSS data store
– Data extraction and integration filter
– Specialized presentation interface
12 Rules of a Data Warehouse
 Data Warehouse and Operational
Environments are Separated
 Data is integrated
 Contains historical data over a long period
of time
 Data is a snapshot data captured at a given
point in time
 Data is subject-oriented
12 Rules of Data Warehouse
 Mainly read-only with periodic batch updates
 Development Life Cycle has a data driven
approach versus the traditional process-
driven approach
 Data contains several levels of detail
– Current, Old, Lightly Summarized, Highly
Summarized
12 Rules of Data Warehouse
 Environment is characterized by Read-only
transactions to very large data sets
 System that traces data sources, transformations,
and storage
 Metadata is a critical component
– Source, transformation, integration, storage,
relationships, history, etc
 Contains a chargeback mechanism for resource
usage that enforces optimal use of data by end
users
OLAP
 Need for More Intensive Decision Support
 4 Main Characteristics
– Multidimensional data analysis
– Advanced Database Support
– Easy-to-use end-user interfaces
– Support Client/Server architecture
Multidimensional Data Analysis
Techniques
 Advanced Data Presentation Functions
– 3-D graphics, Pivot Tables, Crosstabs, etc.
– Compatible with Spreadsheets & Statistical
packages
– Advanced data aggregations, consolidation and
classification across time dimensions
– Advanced computational functions
– Advanced data modeling functions
Advanced Database Support
 Advanced Data Access Features
– Access to many kinds of DBMS’s, flat files, and
internal and external data sources
– Access to aggregated data warehouse data
– Advanced data navigation (drill-downs and roll-
ups)
– Ability to map end-user requests to the
appropriate data source
– Support for Very Large Databases
Easy-to-Use End-User Interface
 Graphical User Interfaces
 Much more useful if access is kept simple
Client/Server Architecture
 Framework for the new systems to be
designed, developed and implemented
 Divide the OLAP system into several
components that define its architecture
– Same Computer
– Distributed among several computer
OLAP Architecture
 3 Main Modules
– GUI
– Analytical Processing Logic
– Data-processing Logic
OLAP Client/Server
Architecture
Relational OLAP
 Relational Online Analytical Processing
– OLAP functionality using relational database
and familiar query tools to store and analyze
multidimensional data
 Multidimensional data schema support
 Data access language & query performance
for multidimensional data
 Support for Very Large Databases
Multidimensional Data Schema
Support
 Decision Support Data tends to be
– Nonnormalized
– Duplicated
– Preaggregated
 Star Schema
– Special Design technique for multidimensional
data representations
– Optimize data query operations instead of data
update operations
Star Schemas
 Data Modeling Technique to map
multidimensional decision support data into
a relational database
 Current Relational modeling techniques do
not serve the needs of advanced data
requirements
Star Schema
 4 Components
– Facts
– Dimensions
– Attributes
– Attribute Hierarchies
Facts
 Numeric measurements (values) that represent a
specific business aspect or activity
 Stored in a fact table at the center of the star
scheme
 Contains facts that are linked through their
dimensions
 Can be computed or derived at run time
 Updated periodically with data from operational
databases
Dimensions
 Qualifying characteristics that provide
additional perspectives to a given fact
– DSS data is almost always viewed in relation to
other data
 Dimensions are normally stored in
dimension tables
Attributes
 Dimension Tables contain Attributes
 Attributes are used to search, filter, or classify
facts
 Dimensions provide descriptive characteristics
about the facts through their attributed
 Must define common business attributes that will
be used to narrow a search, group information, or
describe dimensions. (ex.: Time / Location /
Product)
 No mathematical limit to the number of dimensions
(3-D makes it easy to model)
Attribute Hierarchies
 Provides a Top-Down data organization
– Aggregation
– Drill-down / Roll-Up data analysis
 Attributes from different dimensions can be
grouped to form a hierarchy
Star Schema for Sales
Fact Table
Dimension
Tables
Star Schema Representation
 Fact and Dimensions are represented by physical
tables in the data warehouse database
 Fact tables are related to each dimension table in
a Many to One relationship (Primary/Foreign Key
Relationships)
 Fact Table is related to many dimension tables
– The primary key of the fact table is a composite primary
key from the dimension tables
 Each fact table is designed to answer a specific
DSS question
Star Schema
 The fact table is always the larges table in
the star schema
 Each dimension record is related to
thousand of fact records
 Star Schema facilitated data retrieval
functions
 DBMS first searches the Dimension Tables
before the larger fact table
Data Warehouse Implementation
 An Active Decision Support Framework
– Not a Static Database
– Always a Work in Process
– Complete Infrastructure for Company-Wide
decision support
– Hardware / Software / People / Procedures /
Data
– Data Warehouse is a critical component of the
Modern DSS – But not the Only critical
component
Data Mining
 Discover Previously unknown data
characteristics, relationships, dependencies,
or trends
 Typical Data Analysis Relies on end users
– Define the Problem
– Select the Data
– Initial the Data Analysis
– Reacts to External Stimulus
Data Mining
 Proactive
 Automatically searches
– Anomalies
– Possible Relationships
– Identify Problems before the end-user
 Data Mining tools analyze the data, uncover
problems or opportunities hidden in data
relationships, form computer models based on
their findings, and then user the models to predict
business behavior – with minimal end-user
intervention
Data Mining
 A methodology designed to perform
knowledge-discovery expeditions over the
database data with minimal end-user
intervention
 3 Stages of Data
– Data
– Information
– Knowledge
Extraction of Knowledge from
Data
4 Phases of Data Mining
 Data Preparation
– Identify the main data sets to be used by the
data mining operation (usually the data
warehouse)
 Data Analysis and Classification
– Study the data to identify common data
characteristics or patterns
 Data groupings, classifications, clusters, sequences
 Data dependencies, links, or relationships
 Data patterns, trends, deviation
4 Phases of Data Mining
 Knowledge Acquisition
– Uses the Results of the Data Analysis and Classification phase
– Data mining tool selects the appropriate modeling or knowledge-
acquisition algorithms
 Neural Networks
 Decision Trees
 Rules Induction
 Genetic algorithms
 Memory-Based Reasoning
 Prognosis
– Predict Future Behavior
– Forecast Business Outcomes
 65% of customers who did not use a particular credit card in the last 6
months are 88% likely to cancel the account.
Data Mining
 Still a New Technique
 May find many Unmeaningful Relationships
 Good at finding Practical Relationships
– Define Customer Buying Patterns
– Improve Product Development and Acceptance
– Etc.
 Potential of becoming the next frontier in
database development

More Related Content

Similar to LECTURE4.ppt

11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4ambujm
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forAyushMeraki1
 
20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.pptPalaniKumarR2
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingsumit621
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Datawarehousing & DSS
Datawarehousing & DSSDatawarehousing & DSS
Datawarehousing & DSSDeepali Raut
 
Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Harish Chand
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Database Systems.ppt
Database Systems.pptDatabase Systems.ppt
Database Systems.pptArbazAli27
 
Business intelligence and data warehouses
Business intelligence and data warehousesBusiness intelligence and data warehouses
Business intelligence and data warehousesDhani Ahmad
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3ambujm
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingwork
 
DATA RESOURCE MANAGEMENT
DATA RESOURCE MANAGEMENT DATA RESOURCE MANAGEMENT
DATA RESOURCE MANAGEMENT huma sh
 
Management information system database management
Management information system database managementManagement information system database management
Management information system database managementOnline
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
BI Masterclass slides (Reference Architecture v3)
BI Masterclass slides (Reference Architecture v3)BI Masterclass slides (Reference Architecture v3)
BI Masterclass slides (Reference Architecture v3)Syaifuddin Ismail
 

Similar to LECTURE4.ppt (20)

11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining for
 
20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt20IT501_DWDM_PPT_Unit_I.ppt
20IT501_DWDM_PPT_Unit_I.ppt
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Datawarehousing & DSS
Datawarehousing & DSSDatawarehousing & DSS
Datawarehousing & DSS
 
Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Database Systems.ppt
Database Systems.pptDatabase Systems.ppt
Database Systems.ppt
 
Chapter 5 data resource management
Chapter 5  data resource managementChapter 5  data resource management
Chapter 5 data resource management
 
Data Warehousing
Data WarehousingData Warehousing
Data Warehousing
 
Business intelligence and data warehouses
Business intelligence and data warehousesBusiness intelligence and data warehouses
Business intelligence and data warehouses
 
DW 101
DW 101DW 101
DW 101
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
DATA RESOURCE MANAGEMENT
DATA RESOURCE MANAGEMENT DATA RESOURCE MANAGEMENT
DATA RESOURCE MANAGEMENT
 
Dwbasics
DwbasicsDwbasics
Dwbasics
 
Management information system database management
Management information system database managementManagement information system database management
Management information system database management
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
BI Masterclass slides (Reference Architecture v3)
BI Masterclass slides (Reference Architecture v3)BI Masterclass slides (Reference Architecture v3)
BI Masterclass slides (Reference Architecture v3)
 

Recently uploaded

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 

Recently uploaded (20)

E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 

LECTURE4.ppt

  • 2. Databases  Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age  Information, which is created by data, becomes the bases for decision making
  • 3. Decision Support Systems  Created to facilitate the decision making process  So much information that it is difficult to extract it all from a traditional database  Need for a more comprehensive data storage facility – Data Warehouse
  • 4. Decision Support Systems  Extract Information from data to use as the basis for decision making  Used at all levels of the Organization  Tailored to specific business areas  Interactive  Ad Hoc queries to retrieve and display information  Combines historical operation data with business activities
  • 5. 4 Components of DSS  Data Store – The DSS Database – Business Data – Business Model Data – Internal and External Data  Data Extraction and Filtering – Extract and validate data from the operational database and the external data sources
  • 6. 4 Components of DSS  End-User Query Tool – Create Queries that access either the Operational or the DSS database  End User Presentation Tools – Organize and Present the Data
  • 7. Differences with DSS  Operational – Stored in Normalized Relational Database – Support transactions that represent daily operations (Not Query Friendly)  3 Main Differences – Time Span – Granularity – Dimensionality
  • 8. Time Span  Operational – Real Time – Current Transactions – Short Time Frame – Specific Data Facts  DSS – Historic – Long Time Frame (Months/Quarters/Years) – Patterns
  • 9. Granularity  Operational – Specific Transactions that occur at a given time  DSS – Shown at different levels of aggregation – Different Summary Levels – Decompose (drill down) – Summarize (roll up)
  • 10. Dimensionality  Most distinguishing characteristic of DSS data  Operational – Represents atomic transactions  DSS – Data is related in Many ways – Develop the larger picture – Multi-dimensional view of data
  • 11. DSS Database Requirements  DSS Database Scheme – Support Complex and Non-Normalized data  Summarized and Aggregate data  Multiple Relationships  Queries must extract multi-dimensional time slices  Redundant Data
  • 12. DSS Database Requirements  Data Extraction and Filtering – DSS databases are created mainly by extracting data from operational databases combined with data imported from external source  Need for advanced data extraction & filtering tools  Allow batch / scheduled data extraction  Support different types of data sources  Check for inconsistent data / data validation rules  Support advanced data integration / data formatting conflicts
  • 13. DSS Database Requirements  End User Analytical Interface – Must support advanced data modeling and data presentation tools – Data analysis tools – Query generation – Must Allow the User to Navigate through the DSS  Size Requirements – VERY Large – Terabytes – Advanced Hardware (Multiple processors, multiple disk arrays, etc.)
  • 14. Data Warehouse  DSS – friendly data repository for the DSS is the DATA WAREHOUSE  Definition: Integrated, Subject-Oriented, Time-Variant, Nonvolatile database that provides support for decision making
  • 15. Integrated  The data warehouse is a centralized, consolidated database that integrated data derived from the entire organization – Multiple Sources – Diverse Sources – Diverse Formats
  • 16. Subject-Oriented  Data is arranged and optimized to provide answer to questions from diverse functional areas – Data is organized and summarized by topic  Sales / Marketing / Finance / Distribution / Etc.
  • 17. Time-Variant  The Data Warehouse represents the flow of data through time  Can contain projected data from statistical models  Data is periodically uploaded then time- dependent data is recomputed
  • 18. Nonvolatile  Once data is entered it is NEVER removed  Represents the company’s entire history – Near term history is continually added to it – Always growing – Must support terabyte databases and multiprocessors  Read-Only database for data analysis and query processing
  • 19. Data Marts  Small Data Stores  More manageable data sets  Targeted to meet the needs of small groups within the organization  Small, Single-Subject data warehouse subset that provides decision support to a small group of people
  • 20. OLAP  Online Analytical Processing Tools  DSS tools that use multidimensional data analysis techniques – Support for a DSS data store – Data extraction and integration filter – Specialized presentation interface
  • 21. 12 Rules of a Data Warehouse  Data Warehouse and Operational Environments are Separated  Data is integrated  Contains historical data over a long period of time  Data is a snapshot data captured at a given point in time  Data is subject-oriented
  • 22. 12 Rules of Data Warehouse  Mainly read-only with periodic batch updates  Development Life Cycle has a data driven approach versus the traditional process- driven approach  Data contains several levels of detail – Current, Old, Lightly Summarized, Highly Summarized
  • 23. 12 Rules of Data Warehouse  Environment is characterized by Read-only transactions to very large data sets  System that traces data sources, transformations, and storage  Metadata is a critical component – Source, transformation, integration, storage, relationships, history, etc  Contains a chargeback mechanism for resource usage that enforces optimal use of data by end users
  • 24. OLAP  Need for More Intensive Decision Support  4 Main Characteristics – Multidimensional data analysis – Advanced Database Support – Easy-to-use end-user interfaces – Support Client/Server architecture
  • 25. Multidimensional Data Analysis Techniques  Advanced Data Presentation Functions – 3-D graphics, Pivot Tables, Crosstabs, etc. – Compatible with Spreadsheets & Statistical packages – Advanced data aggregations, consolidation and classification across time dimensions – Advanced computational functions – Advanced data modeling functions
  • 26. Advanced Database Support  Advanced Data Access Features – Access to many kinds of DBMS’s, flat files, and internal and external data sources – Access to aggregated data warehouse data – Advanced data navigation (drill-downs and roll- ups) – Ability to map end-user requests to the appropriate data source – Support for Very Large Databases
  • 27. Easy-to-Use End-User Interface  Graphical User Interfaces  Much more useful if access is kept simple
  • 28. Client/Server Architecture  Framework for the new systems to be designed, developed and implemented  Divide the OLAP system into several components that define its architecture – Same Computer – Distributed among several computer
  • 29. OLAP Architecture  3 Main Modules – GUI – Analytical Processing Logic – Data-processing Logic
  • 31. Relational OLAP  Relational Online Analytical Processing – OLAP functionality using relational database and familiar query tools to store and analyze multidimensional data  Multidimensional data schema support  Data access language & query performance for multidimensional data  Support for Very Large Databases
  • 32. Multidimensional Data Schema Support  Decision Support Data tends to be – Nonnormalized – Duplicated – Preaggregated  Star Schema – Special Design technique for multidimensional data representations – Optimize data query operations instead of data update operations
  • 33. Star Schemas  Data Modeling Technique to map multidimensional decision support data into a relational database  Current Relational modeling techniques do not serve the needs of advanced data requirements
  • 34. Star Schema  4 Components – Facts – Dimensions – Attributes – Attribute Hierarchies
  • 35. Facts  Numeric measurements (values) that represent a specific business aspect or activity  Stored in a fact table at the center of the star scheme  Contains facts that are linked through their dimensions  Can be computed or derived at run time  Updated periodically with data from operational databases
  • 36. Dimensions  Qualifying characteristics that provide additional perspectives to a given fact – DSS data is almost always viewed in relation to other data  Dimensions are normally stored in dimension tables
  • 37. Attributes  Dimension Tables contain Attributes  Attributes are used to search, filter, or classify facts  Dimensions provide descriptive characteristics about the facts through their attributed  Must define common business attributes that will be used to narrow a search, group information, or describe dimensions. (ex.: Time / Location / Product)  No mathematical limit to the number of dimensions (3-D makes it easy to model)
  • 38. Attribute Hierarchies  Provides a Top-Down data organization – Aggregation – Drill-down / Roll-Up data analysis  Attributes from different dimensions can be grouped to form a hierarchy
  • 39. Star Schema for Sales Fact Table Dimension Tables
  • 40. Star Schema Representation  Fact and Dimensions are represented by physical tables in the data warehouse database  Fact tables are related to each dimension table in a Many to One relationship (Primary/Foreign Key Relationships)  Fact Table is related to many dimension tables – The primary key of the fact table is a composite primary key from the dimension tables  Each fact table is designed to answer a specific DSS question
  • 41. Star Schema  The fact table is always the larges table in the star schema  Each dimension record is related to thousand of fact records  Star Schema facilitated data retrieval functions  DBMS first searches the Dimension Tables before the larger fact table
  • 42. Data Warehouse Implementation  An Active Decision Support Framework – Not a Static Database – Always a Work in Process – Complete Infrastructure for Company-Wide decision support – Hardware / Software / People / Procedures / Data – Data Warehouse is a critical component of the Modern DSS – But not the Only critical component
  • 43. Data Mining  Discover Previously unknown data characteristics, relationships, dependencies, or trends  Typical Data Analysis Relies on end users – Define the Problem – Select the Data – Initial the Data Analysis – Reacts to External Stimulus
  • 44. Data Mining  Proactive  Automatically searches – Anomalies – Possible Relationships – Identify Problems before the end-user  Data Mining tools analyze the data, uncover problems or opportunities hidden in data relationships, form computer models based on their findings, and then user the models to predict business behavior – with minimal end-user intervention
  • 45. Data Mining  A methodology designed to perform knowledge-discovery expeditions over the database data with minimal end-user intervention  3 Stages of Data – Data – Information – Knowledge
  • 47. 4 Phases of Data Mining  Data Preparation – Identify the main data sets to be used by the data mining operation (usually the data warehouse)  Data Analysis and Classification – Study the data to identify common data characteristics or patterns  Data groupings, classifications, clusters, sequences  Data dependencies, links, or relationships  Data patterns, trends, deviation
  • 48. 4 Phases of Data Mining  Knowledge Acquisition – Uses the Results of the Data Analysis and Classification phase – Data mining tool selects the appropriate modeling or knowledge- acquisition algorithms  Neural Networks  Decision Trees  Rules Induction  Genetic algorithms  Memory-Based Reasoning  Prognosis – Predict Future Behavior – Forecast Business Outcomes  65% of customers who did not use a particular credit card in the last 6 months are 88% likely to cancel the account.
  • 49. Data Mining  Still a New Technique  May find many Unmeaningful Relationships  Good at finding Practical Relationships – Define Customer Buying Patterns – Improve Product Development and Acceptance – Etc.  Potential of becoming the next frontier in database development