DATA WAREHOUSING/MINING
PREPARED BY: R. KRISHNARAJ
DATA WAREHOUSING INTRODUCTION
• A data warehouse is a central repository of information that can be
analyzed to make more informed decisions.
THE WAREHOUSING APPROACH
[Figure: the warehousing approach. Components shown: Sources, one Extractor/Monitor per source, Integration System, Metadata, Data Warehouse, Clients.]
• Information integrated in advance
• Stored in the warehouse for direct querying and analysis
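The flow on this slide (extract from each source, integrate in advance, store, then query the warehouse directly) can be sketched roughly as follows. This is only an illustration: the two sources, their schemas, the `sales` table, and every function name are assumptions made up for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Two hypothetical sources with different schemas (illustrative data only).
SOURCE_A = [{"cust": "alice", "amount_usd": 120.0}, {"cust": "bob", "amount_usd": 35.5}]
SOURCE_B = [("ALICE", 8000), ("carol", 1250)]  # (customer, amount in cents)


def extract_a(rows):
    """Extractor for source A: normalize to (customer, amount_usd)."""
    return [(r["cust"].lower(), float(r["amount_usd"])) for r in rows]


def extract_b(rows):
    """Extractor for source B: lowercase names and convert cents to dollars."""
    return [(name.lower(), cents / 100.0) for name, cents in rows]


def integrate(*batches):
    """Integration step: merge the normalized batches into one list."""
    merged = []
    for batch in batches:
        merged.extend(batch)
    return merged


def load(conn, rows):
    """Store the integrated rows in the (hypothetical) warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount_usd REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def total_by_customer(conn):
    """Query the warehouse directly, without touching the sources."""
    cur = conn.execute(
        "SELECT customer, SUM(amount_usd) FROM sales "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


warehouse = sqlite3.connect(":memory:")
load(warehouse, integrate(extract_a(SOURCE_A), extract_b(SOURCE_B)))
totals = total_by_customer(warehouse)
```

Because integration happens before any query is asked, `total_by_customer` reads only the warehouse copy; the price is that the result reflects the sources as of the last load.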
ADVANTAGES OF WAREHOUSING APPROACH
• High query performance
   • But not necessarily most current information
• Doesn’t interfere with local processing at sources
   • Complex queries at warehouse
   • OLTP at information sources
• Information copied at warehouse
   • Can modify, annotate, summarize, restructure, etc.
   • Can store historical information
   • Security, no auditing
• Has caught on in industry
DATA WAREHOUSE EVOLUTION
[Figure: timeline from 1960 to 2000, with ticks at 1960, 1975, 1980, 1985, 1990, 1995 and 2000.
 Eras: “Prehistoric Times”, “Middle Ages”, Data Revolution, Information-Based Management.
 Milestones: relational databases; PCs and spreadsheets; end-user interfaces; 1st DW article; DW conferences; vendor DW frameworks; company DWs; “Building the DW”, Inmon (1992); data replication tools.]
WHAT IS A DATA WAREHOUSE?
A PRACTITIONER’S VIEWPOINT
“A data warehouse is simply a single, complete, and consistent
store of data obtained from a variety of sources and made
available to end users in a way they can understand and use it in a
business context.”
-- Barry Devlin, IBM Consultant
A DATA WAREHOUSE IS...
• Stored collection of diverse data
   • A solution to the data integration problem
   • Single repository of information
• Subject-oriented
   • Organized by subject, not by application
   • Used for analysis, data mining, etc.
• Optimized differently from a transaction-oriented database
• User interface aimed at executives
A DATA WAREHOUSE IS... (CONTINUED)
• Large volume of data (GB, TB)
• Non-volatile
   • Historical
   • Time attributes are important
• Updates infrequent
• May be append-only
• Examples
   • All transactions ever at WalMart
   • Complete client histories at an insurance firm
   • Stockbroker financial information and portfolios
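The "non-volatile", "time attributes" and "append-only" properties above can be illustrated with a small history table. The table name `balance_history`, its columns, and both helper functions are hypothetical, chosen only for this sketch; it assumes dates are stored as ISO strings so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balance_history (account TEXT, balance REAL, valid_from TEXT)"
)


def record_balance(account, balance, valid_from):
    """Append a new version; existing rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO balance_history VALUES (?, ?, ?)",
        (account, balance, valid_from),
    )


def balance_as_of(account, date):
    """Return the latest balance recorded on or before the given ISO date."""
    row = conn.execute(
        "SELECT balance FROM balance_history "
        "WHERE account = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (account, date),
    ).fetchone()
    return row[0] if row else None


record_balance("acct-1", 100.0, "2024-01-01")
record_balance("acct-1", 250.0, "2024-03-01")
```

Since nothing is overwritten, a question such as "what was the balance in mid-February?" stays answerable after later changes, which is the point of keeping the time attribute on every row.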
SUMMARY
[Figure: summary diagram. Labels shown: Operational Systems, Enterprise Modeling, Business Information Guide, Data Warehouse Catalog, Data Warehouse Population, Data Warehouse, Business Information Interface.]
WAREHOUSE IS A SPECIALIZED DB
Standard DB                                     Warehouse
• Mostly updates                                • Mostly reads
• Many small transactions                       • Queries are long and complex
• MB to GB of data                              • GB to TB of data
• Current snapshot                              • History
• Index/hash on primary key                     • Lots of scans
• Raw data                                      • Summarized, reconciled data
• Thousands of users (e.g., clerical users)     • Hundreds of users (e.g., decision-makers, analysts)
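The contrast between the two workloads can be made concrete with one small transaction and one summarizing scan. This is a sketch under assumptions: the `orders` table, its columns, and the function names are invented for illustration, and both workloads run against the same in-memory SQLite database only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "north", 2023, 10.0),
        (2, "south", 2023, 20.0),
        (3, "north", 2024, 30.0),
        (4, "north", 2024, 5.0),
    ],
)


def update_order_amount(order_id, amount):
    """Standard-DB style: a small transaction that touches one row by primary key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE orders SET amount = ? WHERE order_id = ?", (amount, order_id)
        )


def revenue_by_region_and_year():
    """Warehouse style: a read-only scan over all rows, returned as a summary."""
    return conn.execute(
        "SELECT region, year, SUM(amount) FROM orders "
        "GROUP BY region, year ORDER BY region, year"
    ).fetchall()


update_order_amount(4, 15.0)
summary = revenue_by_region_and_year()
```

The update benefits from the primary-key index and touches a single row, while the summary has to read every row, which is why the two kinds of system are tuned differently.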
DATA WAREHOUSE ARCHITECTURES: CONCEPTUAL VIEW
• Single-layer
   • Every data element is stored once only
   • Virtual warehouse
• Two-layer
   • Real-time + derived data
   • Most commonly used approach in industry today
[Figure: single-layer view, in which operational systems and informational systems both work on “real-time data”; two-layer view, with real-time data on the operational systems side and derived data on the informational systems side.]
THREE-LAYER ARCHITECTURE: CONCEPTUAL VIEW
• Transformation of real-time data to derived data really requires two steps
[Figure: three layers between operational systems and informational systems: real-time data, reconciled data, derived data. Annotations shown: “Physical implementation of the data warehouse” and “View level: particular informational needs”.]
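The two-step transformation (real-time data to reconciled data, then reconciled data to derived data) might look like the sketch below. The record layout, the cleaning rules, and the function names `reconcile` and `derive_totals` are assumptions made for illustration; the slides do not prescribe any particular rules.

```python
from collections import defaultdict
from decimal import Decimal

# Real-time (operational) records from two hypothetical systems.
REAL_TIME = [
    {"system": "pos", "cust_id": "C1", "name": "Alice ", "amount": "19.90"},
    {"system": "web", "cust_id": "c1", "name": "alice", "amount": "5.10"},
    {"system": "web", "cust_id": "C2", "name": "Bob", "amount": "7.00"},
]


def reconcile(records):
    """Step 1: clean and standardize so that all systems agree on one format."""
    reconciled = []
    for r in records:
        reconciled.append(
            {
                "cust_id": r["cust_id"].upper(),
                "name": r["name"].strip().title(),
                # Decimal avoids float rounding when converting to cents.
                "amount_cents": int(Decimal(r["amount"]) * 100),
            }
        )
    return reconciled


def derive_totals(reconciled):
    """Step 2: summarize reconciled data for one particular informational need."""
    totals = defaultdict(int)
    for r in reconciled:
        totals[r["cust_id"]] += r["amount_cents"]
    return dict(totals)


reconciled_layer = reconcile(REAL_TIME)
derived_layer = derive_totals(reconciled_layer)
```

Keeping the reconciled layer as its own result means several different derived views can be built from it without repeating the cleaning step.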
WAREHOUSE ARCHITECTURE
[Figure: warehouse architecture. Components shown: Sources, one Extractor/Monitor per source, Integrator, Warehouse, Metadata, Query & Analysis, Clients.]
WHAT IS DATA MINING?
• Data Mining is:
• (1) The efficient discovery of previously unknown, valid,
potentially useful, understandable patterns in large datasets
• (2) The analysis of (often large) observational data sets to find
unsuspected relationships and to summarize the data in
novel ways that are both understandable and useful to the
data owner
OVERVIEW OF TERMS
• Data: a set of facts (items) D, usually stored in a database
• Pattern: an expression E in a language L that describes a subset
of facts
• Attribute: a field in an item I in D
• Interestingness: a function I_{D,L} that maps an expression E in L
into a measure space M
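A toy instance of these terms may help. In the sketch below the facts, the pattern, and the choice of interestingness measure are all assumptions: a pattern is modelled as a Python predicate, and interestingness is taken to be the fraction of facts the pattern describes (often called support), so the measure space M is the interval [0, 1]. The definition above allows many other measures.

```python
# Data D: a set of facts (items); each key of a fact is an attribute.
D = [
    {"item": "bread", "store": "A", "qty": 2},
    {"item": "milk", "store": "A", "qty": 1},
    {"item": "bread", "store": "B", "qty": 5},
    {"item": "eggs", "store": "B", "qty": 12},
]


def pattern(fact):
    """Expression E: describes the subset of facts that are bread purchases."""
    return fact["item"] == "bread"


def matches(expression, data):
    """Return the subset of facts described by the expression."""
    return [fact for fact in data if expression(fact)]


def interestingness(expression, data):
    """Assumed measure: share of facts the expression describes, in [0, 1]."""
    if not data:
        return 0.0
    return len(matches(expression, data)) / len(data)


score = interestingness(pattern, D)
```

With this data, two of the four facts match the pattern, so `score` is 0.5.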
KNOWLEDGE DISCOVERY
EXAMPLES OF LARGE DATASETS
• Government: IRS, NGA, …
• Large corporations
   • WALMART: 20M transactions per day
   • MOBIL: 100 TB geological databases
   • AT&T: 300M calls per day
   • Credit card companies
• Scientific
   • NASA, EOS project: 50 GB per hour
   • Environmental datasets
EXAMPLES OF DATA MINING APPLICATIONS
1. Fraud detection: credit cards, phone cards
2. Marketing: customer targeting
3. Data Warehousing: Walmart
4. Astronomy
5. Molecular biology
THE DATA MINING PROCESS
1. Understand the domain
2. Create a dataset:
   • Select the interesting attributes
   • Data cleaning and preprocessing
3. Choose the data mining task and the specific algorithm
4. Interpret the results, and possibly return to step 2
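Steps 2 to 4 above can be sketched on a tiny, made-up dataset. Everything here is an assumption for illustration: the records, the choice of fraud detection as the task, and the one-attribute threshold rule (a decision stump) used as the algorithm. The accuracy is measured on the same rows the rule was fitted to, so it only shows the mechanics and is not a realistic evaluation.

```python
# Hypothetical raw transactions; one record has a missing amount.
RAW = [
    {"id": 1, "amount": 20.0, "country": "US", "fraud": False},
    {"id": 2, "amount": 950.0, "country": "US", "fraud": True},
    {"id": 3, "amount": None, "country": "DE", "fraud": False},
    {"id": 4, "amount": 35.0, "country": "DE", "fraud": False},
    {"id": 5, "amount": 1200.0, "country": "US", "fraud": True},
    {"id": 6, "amount": 60.0, "country": "FR", "fraud": False},
]


def create_dataset(raw, attributes):
    """Step 2: select attributes, then drop records with missing values."""
    rows = [{a: r[a] for a in attributes} for r in raw]
    return [r for r in rows if all(v is not None for v in r.values())]


def fit_threshold(rows):
    """Step 3: pick the amount threshold that best separates fraud from non-fraud.

    A transaction is predicted as fraud when its amount exceeds the threshold.
    Each observed amount is tried as a candidate; the first best one is kept.
    """
    best_threshold, best_accuracy = None, -1.0
    for candidate in sorted(r["amount"] for r in rows):
        correct = sum((r["amount"] > candidate) == r["fraud"] for r in rows)
        accuracy = correct / len(rows)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, accuracy
    return best_threshold, best_accuracy


dataset = create_dataset(RAW, ["amount", "fraud"])
threshold, accuracy = fit_threshold(dataset)
# Step 4: interpret. If the rule looks implausible, go back to step 2 and
# revisit which attributes were selected or how the data was cleaned.
```

Step 1 (understanding the domain) has no code counterpart: it is what justifies choosing `amount` as an interesting attribute in the first place.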
HOW DATA MINING IS USED
1. Identify the problem
2. Use data mining techniques to transform the data into information
3. Act on the information
4. Measure the results
THANK YOU

